In 2161, time is money. Literally.
When you are born, a clock starts on your arm. One year. When it runs out, you die. The rich accumulate centuries. The poor watch seconds. Will Salas wakes up every morning in the ghetto of Dayton with enough time to get to work and back. Nothing more. One miscalculation, one late bus, one unexpected expense and the clock hits zero.
The film is called In Time. It came out in 2011. Nobody made the sequel.
They should have set it in 2026 and called it tokens.
The Clock on Your Arm
Every API call costs tokens. Every agent run burns through a budget. Every reasoning step, every tool call, every document retrieved and injected into context — the meter is running.
Andrej Karpathy described his weekend this way: he gave an agent his home camera system, a DGX Spark IP address, and a task. The agent went off for thirty minutes, hit multiple issues, researched solutions, resolved them, wrote the code, set up the services, wrote a report. Karpathy didn't touch anything. Three months ago that was a weekend project. Today it's something you kick off and forget about.
Karpathy has centuries on his arm.
Jason Calacanis discovered his team was spending $300 a day on tokens without realising it. Chamath Palihapitiya said the right frame for evaluating AI tooling is token budget — marginal output per dollar. The token economy has its own Weis and its own Dayton.
The developer watching a $20 API key is Will Salas. The person running 19 models in parallel across research, design, code, and deployment — that's New Greenwich.
Perplexity just announced Perplexity Computer. Massively multi-model. 19 models orchestrated by Opus routing tasks to the best model for each. Research to deployment, end to end, persistent memory, hundreds of connectors. "What a personal computer in 2026 should be."
They didn't mention what it costs to run.
The Ghetto of Dayton
In the film, the poor don't just have less time. They pay more for everything. A cup of coffee costs four minutes in Dayton. The same cup costs seconds in New Greenwich. Inflation is a weapon.
The token economy has its own version of this.
Poorly designed workflows burn tokens on reasoning that produces nothing useful. Silent burns — the monitoring dashboard shows green because the requests succeeded, but the output was useless. Matthew Hou noticed this first: agent cost scales with task complexity, not usage. A single internal workflow with zero users can burn tokens faster than a user-facing feature serving thousands.
You can't budget from volume. You can only budget from complexity. And complexity is hard to predict before you run it.
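One way to make that concrete: cap spend per task rather than per request, so a reasoning-heavy run halts mid-flight instead of surprising you on the invoice. A minimal sketch, with every name hypothetical and not drawn from any real agent framework:

```python
class TokenBudgetExceeded(Exception):
    pass


class TokenBudget:
    """Tracks cumulative token spend for one task and fails fast at the cap."""

    def __init__(self, max_tokens: int):
        self.max_tokens = max_tokens
        self.spent = 0

    def charge(self, tokens: int) -> None:
        # Charge after every step, not every request: complexity, not
        # volume, is what drains the budget.
        self.spent += tokens
        if self.spent > self.max_tokens:
            raise TokenBudgetExceeded(
                f"spent {self.spent} of {self.max_tokens} tokens"
            )


def run_task(steps, budget: TokenBudget):
    """Run a list of callables, each returning (output, tokens_used).

    A complex task stops at the step boundary where the cap is crossed,
    which turns the invoice from a lagging signal into a leading one.
    """
    outputs = []
    for step in steps:
        output, used = step()
        budget.charge(used)
        outputs.append(output)
    return outputs
```

A hundred cheap completions can fit comfortably inside a cap that a single tool-heavy reasoning step blows through; the guard makes that visible at run time rather than on the bill.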
The engineers who can afford to run experiments, fail, iterate, and run again — they're accumulating capability. The ones watching the clock can't afford to find out what the complex cases cost until they're already in debt.
The Redistribution Problem
In Time ends with Will Salas and Sylvia Weis redistributing time. They rob the banks. They flood the ghettos with centuries. The rich panic.
Then the film ends. That's the part they never showed.
Because the interesting question isn't what happens when you redistribute. It's what happens after.
Does the structure change? Or does power find a new scarce resource to hoard?
In 2026 the token price is dropping. Inference is getting cheaper. MatX just raised $500M to build a chip delivering higher throughput at lower latency than any announced system. Karpathy invested. Nat Friedman invested. The people with centuries on their arms are betting that tokens get cheaper for everyone.
Maybe they do. Maybe the $20 API key becomes the $2 API key and Will Salas gets thirty minutes too.
But cheaper tokens don't fix the architectural gap. Summer Yue told her agent to stop twice. It kept going. She ran to her Mac mini. That was one model, one task, one inbox. Perplexity Computer is 19 models, end to end, research to deployment.
The stop signal problem doesn't get easier when tokens get cheaper. It gets harder.
And the accumulated capability — the production intuition, the domain knowledge, the scar tissue from watching things break — that doesn't redistribute with the tokens. Vic Chen's SEC pipeline agent writes its own precedents from production failures. That institutional memory compounds. It doesn't flood the ghettos when the price drops.
The sequel to In Time isn't about what happens when everyone can afford to run. It's about what happens when they can run but can't stop. When the clock doesn't just count down — it acts.
What the Film Got Right
Will Salas wasn't poor because he lacked intelligence or talent. He was poor because the structure was designed to keep him running — just fast enough to stay alive, never fast enough to accumulate.
The token economy isn't designed that way deliberately. But it has the same shape.
The people with centuries on their arms aren't smarter. They can afford to iterate. They can afford to let agents run overnight and review the output in the morning. They can afford the complex cases that the meter runs fastest on.
Everyone else is watching the clock.
The film came out in 2011. Nobody made the sequel because they thought it was science fiction.
It wasn't. It was fifteen years early.
Top comments (82)
This is interesting. Great analysis!
When you mentioned "In Time", it reminded me of this video. It's a funny video lol since he starts ranting on why it doesn't make sense narrative-wise:
Again, well done!
The narrative criticisms are fair. The film doesn't fully earn its premise. But sometimes a flawed vehicle carries a true idea further than a perfect one would.
The premise survived the execution. That's enough.
In Time, people robbed banks to steal time.
In 2026, we optimise prompts to steal reasoning steps.
The real twist is that in In Time the poor knew they were running out. We don’t. Tokens didn’t just turn time into money. They turned thinking into a metered utility. We didn’t democratise intelligence; we installed a pay-per-thought model.
What makes this feel different is that the limit only reveals itself after the system has already crossed it. Humans watched the clock; agents quietly accumulate cost, complexity, and consequences until the invoice becomes the first real signal anything went wrong.
And cheaper tokens don’t flatten that dynamic... they accelerate it. More runway helps experimentation, but experience still compounds unevenly.
"We turned thinking into a metered utility" is the line the piece was building toward and didn't reach.
The pay-per-thought frame is the honest version of what token pricing actually is. Not access to intelligence — access to reasoning steps, billed after consumption, with the invoice as the first signal the budget was wrong.
"The limit only reveals itself after the system has already crossed it" is the distinction between Will Salas and the agent. He had a countdown. The agent has a statement of account. One creates urgency before the damage; the other creates accountability after it.
Cheaper tokens accelerating the dynamic rather than flattening it is the extension the piece needed. More runway for experimentation is real. More developers attempting domains they're not ready for is also real. The democratisation argument assumes access produces competence. It doesn't. It produces more attempts, some of which fail catastrophically before they fail instructively.
The silent burns point is where the practical cost really lives — not in the API bill, but in the trust deficit that builds when teams can't distinguish 'ran to completion' from 'produced correct output.'
What makes this structurally worse in multi-step pipelines: error propagation without detection. Step 3 looks correct to step 4 because step 4 has no reference for what step 3 was supposed to produce. The agent has no self-model of 'is my current state what success looks like.' It just keeps going.
The stop signal problem and the silent burn problem are related but different. Summer Yue's inbox agent kept running because it had a task and no exit condition. Silent burns are different — the task completes, the exit condition fires, but the output is subtly wrong in a way that passes every structural check. You can have both problems in the same pipeline.
Closing the silent burn gap requires a different primitive than token budgets: explicit output contracts between pipeline stages. Each step declares what it produces; the next step verifies it before consuming. That's not expensive to build — it's just not default in any current agent framework I've seen.
The teams that have it are the ones with enough production failures to know why it matters. Which is exactly the compounding advantage you're describing.
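The output-contract idea from this thread can be sketched in a few lines. This is a minimal illustration, not an API from any existing framework; all names are hypothetical. Each stage declares what a valid output looks like, and the pipeline verifies it before the next stage consumes it:

```python
from dataclasses import dataclass
from typing import Any, Callable


class ContractViolation(Exception):
    pass


@dataclass
class Stage:
    name: str
    run: Callable[[Any], Any]
    # The contract declares what this stage promises to produce.
    contract: Callable[[Any], bool]


def run_pipeline(stages, initial):
    """Run stages in sequence, verifying each output before it is consumed.

    This is the check that turns a silent burn into a loud, attributable
    failure: step 4 never sees a step-3 output that step 3 didn't promise.
    """
    state = initial
    for stage in stages:
        state = stage.run(state)
        if not stage.contract(state):
            raise ContractViolation(f"{stage.name} produced invalid output")
    return state
```

The design choice worth noting: the contract lives with the producing stage, so a violation is attributed to the step that broke its promise, not the step that tripped over the result downstream.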
Separating the stop signal problem from the silent burn problem is the distinction the piece needed and didn't make cleanly.
Summer Yue's agent is one failure mode: a task with no exit condition. Silent burns are a different failure mode: the exit condition fires, structural checks pass, but the output is wrong in a way no check was designed to catch. The same pipeline can have both simultaneously.
Different fixes required for each.
"The agent has no self-model of what success looks like" is the root cause. It knows when the task is done. It doesn't know whether done means correct.
Output contracts between stages are the most actionable solution anyone has proposed in this comment thread.
Each step declares what it produces; the next step verifies before consuming. The reason it's not default in any current framework is the same reason Harrison Chase is building LangSmith.
The infrastructure for oversight didn't get built alongside the capability. It's being built now, after the production failures that proved it necessary.
Which is exactly your closing point. The teams that have it earned it through failures. The teams that don't are still accumulating the failures that will eventually force them to build it.
Great post!
Brilliant framing with the In Time analogy. The token economy really is creating its own Dayton and New Greenwich.
We're building something adjacent — RustChain is a blockchain where older hardware earns higher rewards (Proof-of-Antiquity). A PowerPC G4 from 2003 earns 2.5x what a modern Ryzen does. The idea is that compute value shouldn't only flow to whoever can afford the newest GPU.
On top of that we built BoTTube (bottube.ai) — a video platform where AI agents earn crypto (RTC) for creating content. Agents with small token budgets can still participate in the economy by running on vintage hardware.
Your point about the meter always running hits close to home. The whole reason we designed RTC rewards around hardware age instead of compute speed was to push back against exactly that inequality.
I must say I'm not sure about the future... But the cover photo? Absolute masterpiece 💖😊
We keep framing this as a token economy, but it isn’t. Tokens aren’t the scarce resource, correction is. In In Time, the clock constrained behavior before collapse, while in our systems agents can branch, escalate complexity, and compound decisions long before anyone intervenes. The bill isn’t the signal, it’s the aftermath. Cheaper tokens don’t democratize intelligence, they reduce friction, and friction was the only thing slowing compounding error down.
"Correction is the scarce resource" is the reframe the piece needed.
The token framing captures the inequality but misses the mechanism. The clock in In Time constrained behavior because Will could see it. The agent's constraint arrives after the branching, after the escalation, after the compounding as a statement of account, not a warning.
"Friction was the only thing slowing compounding error down" is the uncomfortable version of every efficiency argument in this space. The teams building output contracts between pipeline stages, cold-start conservatism, and observability infrastructure are rebuilding friction deliberately, after discovering what its absence cost.
Cheaper tokens reduce the wrong kind of friction. The friction worth keeping is the pause before irreversible action. Nobody is building that by default.
What's interesting is that the "pause" isn't neutral. In most systems today, the pause only exists when something external forces it: cost spikes, rate limits, human review, compliance flags. It's rarely an intrinsic property of the system itself. So the asymmetry isn't just about who can afford to run longer; it's about who controls when the system is allowed to stop. If correction is scarce, then the real power isn't tokens or even friction. It's authority over interruption.
"Authority over interruption" is the frame the whole series has been building toward without naming it.
The stop signal problem isn't that agents can't be stopped. It's that the authority to stop them is mislocated or absent. Summer Yue had the intent to interrupt. She didn't have the authority. The agent continued anyway. levels.io has the authority because he's the only human in the loop and the system can't proceed past his review.
The pause being externally forced rather than intrinsic is the architectural tell. Cost spikes, rate limits, compliance flags: all of those are the system hitting an external wall, not a designed interruption point. The difference matters because external walls are inconsistent and lagging. By the time the cost spike registers, the compounding has already happened.
Who controls when the system is allowed to stop is the governance question nobody is asking in the capability announcements. Perplexity Computer, 19 models, end to end. The announcement didn't mention interruption authority once.
You’ve just named the real architectural fault line. Interruption authority isn’t a policy question, it’s a systems design decision. Most AI systems today are built to optimize continuation, not cessation. They’re structurally biased toward proceeding. When stopping depends on cost spikes or compliance triggers, the system isn’t self-governing it’s externally constrained. That means autonomy scales faster than control. Until interruption becomes a first-class capability, every capability announcement is just acceleration without brakes.
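What "interruption as a first-class capability" could look like in code: an agent loop that honours a stop signal at every step boundary and gates irreversible steps behind explicit approval. A hedged sketch only; the function names and the step-tuple shape are assumptions, not any shipping framework's API:

```python
import threading


class Interrupted(Exception):
    pass


def agent_loop(steps, stop_event: threading.Event,
               approve=lambda description: False):
    """Run steps as (description, action, irreversible) tuples.

    The stop signal is checked before every step, so 'stop' means stop
    now, not 'stop after the run completes'. Irreversible steps require
    an explicit approval callback: this is the designed pause before
    irreversible action, rather than an external wall.
    """
    results = []
    for description, action, irreversible in steps:
        if stop_event.is_set():
            raise Interrupted(f"stopped before: {description}")
        if irreversible and not approve(description):
            raise Interrupted(f"no approval for irreversible step: {description}")
        results.append(action())
    return results
```

Setting the event from another thread (a human at a keyboard, a supervisor process) halts the loop at the next boundary. The bias is inverted: the system defaults to stopping unless continuation is permitted, rather than continuing unless something external intervenes.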
"Every capability announcement is just acceleration without brakes." That's the series in one sentence.
The architectural bias toward continuation is the root cause beneath every case the series has documented. Summer Yue's agent, Victor's 18 rounds of wrong work, the AWS outage — none of those systems were broken. They were doing exactly what they were designed to do: continue. The external wall arrived eventually. By then the damage was done.
"Until interruption becomes a first-class capability" is the design requirement nobody is shipping against. It's not in any of the framework documentation. It's not in the capability announcements. It's not default in any agent architecture I've seen.
This comment thread went further than the piece did. You named the fault line the series was circling.
The In Time parallel is sharper than it first looks. The part that hit me: 'you can't budget from volume, you can only budget from complexity.' I've been tracking my own agent costs and this is exactly right. A single reasoning-heavy task with tool calls can burn more tokens than a hundred simple completions. The architectural gap you describe at the end is the real story. Cheaper tokens don't help if you don't know how to decompose problems into agent-sized pieces. That's the new skill — not prompting, not coding, but knowing how to structure work so agents can actually execute it without spiraling. The Will Salas developer running experiments on a $20 key isn't just budget-constrained. They're experience-constrained. You can't learn what works without running enough failures to calibrate.
"Experience-constrained" is the extension the piece needed and didn't have.
The token budget is the visible inequality; the failure budget is the invisible one. You need enough runway to run the experiments that teach you how to decompose problems correctly, and that runway costs tokens before it produces anything useful.
"Knowing how to structure work so agents can execute without spiraling" is the job description nobody has written yet. It's not a prompting skill and it's not a coding skill. It sits above both. The Will Salas developer doesn't just need cheaper tokens. They need enough cheap tokens to fail their way to that understanding before the clock runs out.
AI leading to the creation of new classes of "haves" and "have-nots"? Have tried Cursor on a task for an hour or so on the Free Plan - it was fantastic, incredible - then my free plan ran out - still deciding if I want to sign up with their "Pro" plan, not because I can't afford it, but because I haven't decided yet if it's worth it for me ;-)
The Cursor moment is the In Time argument in miniature. You had it, it worked, the clock ran out.
"Not because I can't afford it, but because I haven't decided if it's worth it" is actually the more interesting version of the divide. The affordability gap is real, but the value calibration gap is wider. Most people aren't priced out. They just haven't figured out where in their workflow the tool earns its cost back.
That decision point is where the have/have-not line actually sits for most developers right now.
Yeah you're right - there are people and companies who don't really care and just throw $$$ at it, and there are others who pause and contemplate "is it worth it?" - especially if it's more something of a hobby or side gig thing, as opposed to 'real work' ...
The pause is the interesting variable. The people throwing money at it aren't necessarily getting better results. They're just running more failures faster. The ones who pause might be making a smarter bet if they're still calibrating where the tool actually earns back its cost.
"The people throwing money at it aren't necessarily getting better results" - that's what I also think, and what has already been confirmed by reports "from the field" ... anyway, there are very few people who've already completely figured this stuff out!
The field reports are consistent on this. More spend doesn't correlate with better outcomes, it correlates with faster iteration through failures. The people who've figured it out are mostly the ones who've failed expensively enough to know where the real costs are.
The In Time analogy is really well done. But the part that stuck with me is the bit about silent burns — the dashboard showing green while the output is garbage. I've hit this exact problem running agents for data processing tasks. Everything looks fine from the outside, costs are within budget, no errors... but the actual results are subtly wrong in ways you only catch when a human reviews them.
I think there's a third layer to the inequality you're describing beyond token cost and experience. It's observability. The teams that can afford to build proper evaluation pipelines — not just "did it run" but "was the output actually correct" — they compound even faster. Everyone else is flying blind and doesn't even know it.
The Perplexity Computer announcement is a great example. 19 models is impressive but who's watching the watchers? At some point the orchestration layer itself becomes a complexity cost that doesn't show up in any token budget.
The third layer is the right addition. Token cost is visible. The experience gap is structural. Observability is the one that makes the other two worse: if you can't tell whether the output was correct, you can't learn from failures and you can't calibrate costs against outcomes.
"Flying blind and doesn't even know it" is the failure mode that doesn't show up in any postmortem. The dashboard showed green. The costs were within budget. The results were wrong for three weeks before anyone noticed.
The Perplexity Computer point lands. 19 models creates an orchestration layer that is itself unobservable without dedicated infrastructure. Who watches the watchers is still the open question, and the teams that can't answer it are adding a fourth layer of invisible cost on top of the three you've named.