TL;DR: Most AI companies are stuck in the Token Trap: pricing by compute while capabilities compound at 10× per year. The winners will price on workflows and outcomes, not tokens.

Anthropic went from zero to ten billion dollars in annual revenue in three years. Dario Amodei says the industry will hit trillions before 2030. And last week, a Chinese startup matched GPT-5's performance at one-tenth the cost.
These numbers don't contradict each other. They reveal the same thing: the gap between what AI can do and what companies charge for it is the largest mispricing in tech history.
I've been watching startups price AI products for the last two years, helping teams figure out what to charge, how to package, and when to shift models. The pattern I keep seeing: the exponential is here, but the pricing strategy hasn't caught up. One quote captures why.
In a recent conversation on the state of the industry:
"If the tokens are being used to restart someone's Mac, those tokens are worth pennies. If the tokens are being used to redesign a molecule that can cure cancer… those tokens are worth tens of millions of dollars."
Same tokens. Same compute cost. Radically different value. And yet most AI companies price as if every token is equal.
This is the core tension. AI capabilities are compounding at 10× annually. Anthropic's own revenue went $0 → $100M → $1B → $10B in successive years, and they added "another few billion" in January 2026 alone. Yet the dominant pricing model treats AI like a commodity: per-token consumption pricing values the input (compute) instead of the output (value created).
If the industry stays on per-token rails, most of that value leaks to customers while providers race each other to the bottom.
That race to the bottom already has a calendar: it's becoming a February tradition. Last year, DeepSeek stunned the industry, matching OpenAI's o1 at a fraction of the price after training its V3 model for $5.6 million. This year the floodgates opened wider: at least two more frontier releases landed in the same week (Zhipu GLM-5, ByteDance Doubao 2.0), plus "red-envelope" subsidy campaigns that drove 7× user growth for competing chatbots.
Chinese labs aren't just competing on capability. They're waging a price war as a distribution strategy, open-sourcing frontier models and subsidizing usage to acquire users at scale. For anyone pricing AI by the token, this is an existential signal. The cost floor isn't just falling; it's being deliberately demolished.
But here's what the "race to zero" panic misses: the best-capitalized labs aren't worried. They expect the market to consolidate around 3–4 dominant players who sustain margin, not by being cheapest, but by capturing value.
I call this the Token Trap: the longer you price on compute, the more you're competing on the one axis China is optimizing to zero. The way out is to climb the pricing ladder.
Bessemer's AI Pricing and Monetization Playbook studied how AI-native companies actually monetize. They found three emerging models, each one step further from the Token Trap:
Model 1: Consumption pricing. The current default. You pay for what you use: tokens in, tokens out.
When it works: Developer tools, API platforms, low-switching-cost infrastructure.
The problem: It commoditizes instantly. When MiniMax offers the same capability at 10% of the price, your margins vanish overnight. AI gross margins already run 50–60%, far below SaaS's 80–90%, and consumption pricing puts relentless downward pressure on every cycle. I've seen startups lose half their pricing power in a single quarter after a new open-source model drops.
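The margin squeeze described above can be made concrete with a back-of-the-envelope sketch. All numbers here are hypothetical, chosen only to illustrate why a rival's price cut is so damaging when your price tracks compute:

```python
# Illustrative sketch: how a competitor's token price cut squeezes a
# consumption-priced product. All numbers are hypothetical.

def gross_margin(price_per_1m_tokens: float, cost_per_1m_tokens: float) -> float:
    """Gross margin as a fraction of revenue."""
    return (price_per_1m_tokens - cost_per_1m_tokens) / price_per_1m_tokens

# Before: you charge $10 per million tokens; compute costs you $4.
before = gross_margin(10.0, 4.0)   # 60% -- the AI gross-margin band

# A rival matches your capability far cheaper. To stay competitive you
# halve your price, but your compute cost is sticky.
after = gross_margin(5.0, 4.0)     # 20% -- margin collapses

print(f"margin before price war: {before:.0%}")
print(f"margin after matching a cheaper rival: {after:.0%}")
```

The point of the sketch: when price is anchored to compute, any competitor willing to run closer to cost sets your ceiling, and your margin absorbs the entire difference.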
Model 2: Workflow pricing. You pay for a completed unit of work: a support ticket resolved, a document processed, a code review completed.
When it works: When the task is well-defined and the customer already understands the value of that task being done.
The leap: You're no longer selling compute. You're selling labor substitution. The customer compares your price not to another API, but to the human who used to do that work.
In practice: Intercom's AI agent Fin charges $0.99 per resolution; not per message, not per token. A human support agent costs $5–15 per ticket. The value gap is obvious, and it's immune to underlying model costs dropping. If MiniMax cuts token prices in half tomorrow, Intercom's $0.99 resolution is still worth $0.99.
Model 3: Outcome pricing. You pay for a measurable business outcome: revenue generated, cost saved, risk mitigated.
When it works: When you can measure the outcome and attribute it to the AI. This is the hardest model to implement, but the least vulnerable to commoditization.
The leap: You're not selling compute or labor, you're selling business impact. The customer pays based on what the AI achieved, not what it consumed.
| Model | What You're Pricing | Commoditization Risk | Who Sets the Price |
|---|---|---|---|
| Consumption | Compute | High — China proves this weekly | The cheapest competitor |
| Workflow | Labor substitution | Medium — tied to task value | The cost of the human alternative |
| Outcome | Business impact | Low — tied to results | The value of the outcome itself |
The companies that will capture those trillions aren't the ones running the cheapest models. They're the ones who figure out how to price the molecule, not the token.
The framework is clear. So why isn't everyone already pricing on outcomes? Because climbing the ladder requires infrastructure most AI companies haven't built yet.
Three things stand in the way:
1. Attribution infrastructure. You need to prove your AI caused the outcome. In SaaS, this was hard enough; see every marketing attribution debate of the last decade. In AI, where the model is one step in a multi-step workflow, it's harder still.
2. Customer trust. Outcome pricing means the customer pays more when the AI works well. That requires transparency about how the AI reached its result, the exact thing most model providers treat as a black box.
3. Courage to charge for value. The bluntest advice to founders? "Have the courage to charge for the value they create rather than defaulting to cost-plus." Most AI startups, terrified of churn, default to cheap per-token pricing and leave enormous value on the table.
The practical bridge? Hybrid pricing. Start with a base subscription for platform access, then layer usage or outcome-based tiers on top that scale with value delivered. This lets you start capturing value today while building the attribution infrastructure for full outcome pricing tomorrow.
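The hybrid structure described above is simple enough to sketch directly. The fee levels and quota here are invented for illustration, not drawn from any real pricing page:

```python
# Minimal sketch of a hybrid pricing model: a flat platform fee plus an
# outcome-based component. All names and rates are hypothetical.

def monthly_bill(base_fee: float,
                 outcomes_delivered: int,
                 fee_per_outcome: float,
                 included_outcomes: int = 0) -> float:
    """Base subscription plus a per-outcome charge beyond an included quota."""
    billable = max(0, outcomes_delivered - included_outcomes)
    return base_fee + billable * fee_per_outcome

# Example: $500/month platform access with 100 outcomes included,
# then $2 per additional outcome delivered.
print(monthly_bill(500.0, 80, 2.0, included_outcomes=100))   # under quota: just the base fee
print(monthly_bill(500.0, 350, 2.0, included_outcomes=100))  # bill scales with value delivered
```

The base fee gives you predictable revenue today; the per-outcome tier is the slot where outcome pricing plugs in once the attribution infrastructure exists.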
The window matters. A "renewal cliff" is looming in 2026: enterprise customers who adopted AI tools over the last two years are coming up for renewal and will demand proof of ROI. If your pricing doesn't connect to outcomes, that renewal conversation gets painful fast.
So what do you do about it this week?
For Founders / CEOs:
For CTOs / Engineering Leaders:
For Investors / Board Members:
The single takeaway: AI capabilities compound at 10× per year. AI pricing must compound with them, from tokens to tasks to outcomes, or the value accrues to everyone except the company that built the model.
Whether you're building, buying, or funding AI, what's the use case where you'd pay 100× more than token rates for a guaranteed outcome? I'd love to hear examples, reply or drop a comment.
Sources: