Factor Capital Update - May 2026
The new line item on every org chart isn't a person.
Last month I wrote that agents were about to be everywhere. A month in, the more interesting story isn’t that they arrived. It’s how companies are budgeting for them.
The headlines read like another wave of layoffs. What’s actually happening is a re-architecture of the cost structure. Headcount is no longer the only unit of organizational capacity. Inference is becoming a parallel labor market, and the scarce people are the ones who know how to deploy it.
Token spend is the new headcount line
Yesterday morning Coinbase announced it is cutting 14% of its workforce, about 700 people, with Brian Armstrong pointing to AI as part of the reason they can run leaner. Coinbase also flipped the org chart, eliminating “pure managers” in favor of “player-coaches” — individual contributors who also mentor and coordinate.
That’s the same basic shape Jack Dorsey put in place at Block in February, when Block cut to under 6,000 employees and reorganized around individual contributors, directly responsible individuals on 90-day cycles, and player-coaches. Coinbase has its own crypto-specific reasons to resize around a narrower but clearer opportunity set — stablecoins, custody, and institutional rails — but the org-chart pattern is the more durable signal.
One distinction matters here. Some companies will use AI as cover to trim obvious fat: roles that existed because coordination was expensive, work was manual, or no one had forced the issue. I do not think that fully explains Block or Coinbase. Those cuts look more like business rightsizing plus a sorting function: who is leaning into AI hard enough to become a 5x or 10x operator, and who is still acting like a 1x employee in a world where 1x work is being commoditized.
So the common framing is too simple. AI is not just taking jobs. It is changing which jobs are worth creating. Job growth does not disappear; it concentrates around people who can use agents to produce nonlinear output. Companies are reallocating budget from undifferentiated headcount to inference, agents, and the people who know how to make those agents useful.
The new role is the AI orchestrator: the person who builds the harnesses everyone else operates within. Adding one of those people is not adding one unit of output. It is closer to adding five, because everything they ship creates leverage across the rest of the organization. Functionally, it mirrors what Anthropic and OpenAI are building new companies around through the private-equity partnerships announced this week, since most companies simply won't be able to navigate this shift without outside support.
Run that through the income statement. The line that used to read “salaries and benefits” is splitting in two: human labor on one side, inference cost on the other. The companies that get this right end up with smaller human teams producing dramatically more output per dollar. The ones that don’t will pay for both and get the multiplier from neither.
Dylan Patel made this point sharply on Invest Like the Best a couple of weeks ago. Patel runs SemiAnalysis, the research firm tracking AI infrastructure and the chip supply chain feeding it. His own firm went from tens of thousands of dollars in AI spend last year to seven million dollars this year. His argument: trying to save money on cheaper models is a losing strategy. If your team is not pushing the best model's limit, a competitor who is will leapfrog you trivially.
This is customer acquisition cost versus lifetime value — the math every consumer business uses for its marketing budget — applied to tokens. If $50,000 of inference produces work that can be resold for $100,000, the right question is not how to minimize the token bill. It is how fast you can scale the loop. Token spend is a growth lever, not an expense to suppress. The gate is return on investment, not budget.
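The arithmetic above can be made concrete. A minimal sketch, using only the hypothetical numbers from the paragraph ($50,000 of inference reselling for $100,000) — the function names and the cycle count are illustrative, not anyone's actual model:

```python
# Toy back-of-the-envelope for token spend as a growth lever, not a cost line.
# All figures are illustrative, taken from the hypothetical in the text.

def loop_roi(token_spend: float, resale_value: float) -> float:
    """Return per inference loop: dollars of output per dollar of tokens."""
    return resale_value / token_spend

def scaled_profit(token_spend: float, resale_value: float, cycles: int) -> float:
    """Profit from running the same proven loop `cycles` times."""
    return (resale_value - token_spend) * cycles

# $50,000 of inference produces work resold for $100,000 -> 2x per loop.
roi = loop_roi(50_000, 100_000)
# The right question is how fast you can scale the loop, not how to shrink it:
profit = scaled_profit(50_000, 100_000, cycles=10)
print(roi, profit)  # 2.0 500000
```

The gate in this framing is the ratio staying above 1.0 as cycles scale, which is exactly the CAC-versus-LTV discipline consumer businesses already apply to marketing budgets.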
Aaron Levie at Box has been among the more vocal CEOs on this topic. He says each engineer is 2x to 5x more capable than a year ago, so the right move is not simply to cut. Hire more, plus a new team-level role whose job is building the systems everyone else runs on. Box is hiring for it. Coinbase, by collapsing into player-coaches, is making room for it. The operator gap I wrote about back in February — engineering leverage closing, orchestration leverage blowing wide open — is now showing up in real reorgs.
There is a political consequence here too. As AI-heavy companies pull away, the public story will not be “better allocation of inference budgets.” It will be “digital workers replacing human ones.” That distinction may not matter electorally. The technology curve keeps moving, but the political environment around labs, employers, and AI-first companies is going to get much harder.
The frontier is now capacity-constrained
The second story this month is Mythos, Anthropic’s next frontier Claude model.
Anthropic announced the Claude Mythos Preview on April 7 with an unusual call: it is not generally available, and it will not be in the near term. Access is restricted to a small group of partners called Project Glasswing. Anthropic framed the restriction around security testing and the unusually large inference footprint Mythos requires.
Then the White House blocked Anthropic’s plan to expand Mythos to roughly 70 additional organizations, citing security concerns and limited compute.
That is the important part. Frontier capability is no longer guaranteed to ship to anyone with a credit card. For the first time, inference capacity is being rationed between private-sector partners and the federal government as a strategic resource. Model access is bifurcating: general-purpose APIs keep improving, but the absolute ceiling increasingly sits behind partnership agreements, security reviews, and capacity allocation.
Patel framed the incentive well: if Ken Griffin walks into Anthropic with $10 billion from Citadel and offers to pre-buy the first $10 billion of inference on the next frontier model, what is Anthropic going to do?
Open source is the caveat. Within a few months, we will probably have an open source model approaching Mythos-level capability. Anthropic's rationing is not about permanent access. It is about who has access now, while the lead time matters and before broader access accelerates replication of the current frontier. At Two Sigma, we called this alpha decay: how long a model's edge lasts before everyone else catches up. The frontier is rationed because the GPUs are.
That explains the infrastructure frenzy underneath it. Every Stripe, Ramp, Coinbase, and SemiAnalysis is independently going exponential on token spend because each has done the math and concluded inference produces revenue at a multiple of cost. Stack that demand across every operator company reaching the same conclusion at the same time, and the labs face aggregate exponential demand.
Anthropic alone is now at a $30 billion revenue run rate, up from roughly $9 billion at the end of 2025. The labs’ response over the last 30 days has been to lock in compute at unprecedented scale. Anthropic committed $100 billion to AWS over the next decade for 5 gigawatts of Trainium capacity, with Amazon putting $25 billion back into Anthropic. Four days later, Google committed up to $40 billion of investment into Anthropic and another 5 gigawatts of dedicated TPU capacity, on top of a Broadcom partnership for next-generation TPUs coming online in 2027.
One lab locked in 10 gigawatts across two hyperscalers and a custom-silicon partner in less than a week because demand for Claude has already outrun what they can serve. And then, as icing on the cake, Anthropic agreed with SpaceX (the parent company of xAI) to lease xAI's entire Colossus 1 data center to boost availability.

That demand cascades down the stack: NVIDIA and TSMC orders, ASML lithography, HBM memory from Hynix and Micron, substrates, packaging, power, and data centers. ASML just raised its 2026 sales guidance, which is what you would expect when chipmakers see no slowdown in sight. All these companies’ stocks have effectively doubled in the past month.
Stripe is the clearest current example of an operator going exponential. At Stripe Sessions last week, they shipped 288 new products, many organized around agents, including Link's agent wallet: you delegate a card, approve each spend in an app on your phone, and Stripe issues a one-time-use card per task. Credentials never reach the agent.
That is exactly what I was writing about last month — agents needing wallets, not passwords — shipping 30 days later. That cadence is what an AI-heavy operator looks like from the outside. Stripe’s Minions ships over a thousand reviewed pull requests a week with no human-written code.
The historical analog is the mid-2000s cloud migration. Many startups got founded, grew fast, and disappeared. AWS won because it became the core infrastructure layer supporting the entire transition. The pattern is rhyming now. Plenty of operators and wrappers will get their token math wrong and disappear, but NVIDIA, ASML, the memory makers, the hyperscalers, and the labs underneath them are accumulating revenue at a rate the public markets have only started to price in.
What this means for venture
Underneath all of this is the compounding effect of actually using these tools. The more you use AI, the more you use it. You delegate work, see it come back, and immediately discover three more things to delegate. The graph of what you ship per week bends upward, and the bend gets steeper.
This reshapes what venture capital is for. Once token spend is a growth lever rather than an expense, the central question for any AI business is what sits between your tokens and your customer’s wallet that the model provider cannot replicate.
If the answer is a thin user experience layer, the model provider will eventually ship the same product. Cursor is the live case study: the canonical Claude wrapper through 2025, now in a long fight with Anthropic, which is shipping Claude Code as a free alternative inside the same workflow.
The companies that survive are not selling tokens. They are selling a finished product the customer pays for separately.
Picture a CPA who builds a Claude-assisted workflow and files taxes for clients. She might spend $2,000 per client on tokens and bill $10,000 for the filing while serving five times as many clients as before. The token bill is real. It belongs in cost of goods sold against a higher-margin output. But she is not selling Claude. She is selling her professional license, her judgment, her sign-off, and the liability coverage a business actually needs. The model is one input that replaces the junior CPA. The product is everything else true about her.
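Her unit economics are worth writing down. A sketch with assumed numbers — the before/after client counts are my own illustration; only the $2,000 token cost and $10,000 fee come from the example above:

```python
# Illustrative unit economics for the hypothetical CPA workflow above.
# Token spend sits in cost of goods sold; the product is her sign-off.

clients_before, clients_after = 10, 50   # assumed: 5x client capacity
fee_per_client = 10_000                  # billed per filing (from the text)
token_cogs_per_client = 2_000            # Claude spend per filing (from the text)

revenue = clients_after * fee_per_client
token_cogs = clients_after * token_cogs_per_client
gross_profit = revenue - token_cogs
gross_margin = gross_profit / revenue

print(f"revenue=${revenue:,} tokens=${token_cogs:,} margin={gross_margin:.0%}")
# With these assumptions: $500,000 of revenue against $100,000 in tokens,
# an 80% gross margin, on five times the client base.
```

The token bill is large in absolute terms and irrelevant in relative terms, which is the whole point of putting it in cost of goods sold rather than treating it as overhead to minimize.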
That is where venture gets more nuanced than it has felt for the past several decades. A lot of these businesses will not need venture capital at all. The CPA is not pitching anyone. Neither is the solo founder running a vertical AI tool to $500,000 to $5 million of annual recurring revenue with one or two people, or the small agency replacing a ten-person engineering team with two operators plus inference. These are the profitable, bootstrapped businesses I have been writing about for over a year. They never enter the venture funnel because they do not need to.
For the businesses that do raise, the round itself may look less like classic venture capital and more like working capital. You are financing a proven cycle on the balance sheet, not making a speculative bet. The customer acquisition cost and lifetime value math is observed. The capital scales cycles that already convert. The right instrument might not even be equity. It might look closer to revenue-based financing or a token-specific credit line, priced off the company’s demonstrated multiple. Most old-school venture capitalists are holding the wrong instrument for that category. But the multi-billion-dollar, multi-stage VCs are set up for this.
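How such an instrument might be priced is easy to sketch. This is a hypothetical structure, not a real product; the advance rate, multiple, and budget are all assumptions for illustration:

```python
# Hypothetical sketch of a token-specific credit line priced off a
# company's demonstrated inference-to-revenue multiple. Illustrative only.

def credit_line_size(demonstrated_multiple: float,
                     advance_rate: float,
                     monthly_token_budget: float) -> float:
    """Size a line against a proven token -> revenue conversion rate."""
    # Lender advances a fraction of the revenue each token dollar
    # has already been observed to produce.
    expected_revenue = monthly_token_budget * demonstrated_multiple
    return expected_revenue * advance_rate

# A company converting $1 of tokens into $2 of revenue, a lender advancing
# 50% of expected revenue, and a $500k/month token budget:
line = credit_line_size(2.0, 0.50, 500_000)
print(line)  # 500000.0
```

The point of the sketch is that everything here is observed rather than forecast, which is what makes the instrument look like working capital rather than a venture bet.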
Some of the companies posting $100 million of annual recurring revenue in less than six months might sustain it, and this working-capital investment might pay off. But a lot of them probably won't.
The closest analog is the late-2010s wave of direct-to-consumer brands. Allbirds and On Running both had fast growth, internet-native distribution, and working-capital needs that looked attractive when the checks were written. One became a mostly one-product story. The other became a durable multi-product franchise. AI will rhyme with that. The financing math will not tell you which is which. Both can show the same customer acquisition cost, the same lifetime value, the same forward-token-commitment profile, and the same compounding revenue line at the time of the raise.
The separating variable is founder quality: the ability to keep evolving the product as the platform and customers evolve beneath it. Stripe is the live case study — generational founders evolving the product faster than the trend moves around them. Coinbase reorganizing today, Block in February, Box hiring orchestrators — the same pattern is showing up at scale.
Same logic at every level: the solo CPA spending $2,000 per filing, the bootstrapped vertical AI company at $10 million of annual recurring revenue without ever raising, the venture-backed company spending $50 million to scale a proven loop. Whoever owns what the model cannot be is the one who gets to charge for it.
Thanks as always for reading.
— Jake Dwyer
Founder & Managing Partner
Factor Capital

