Writing · May 2026
The economics of deployment, not training
A frontier model costs somewhere between fifty and several hundred million dollars to train. This is the number that gets quoted, and it is the wrong number to anchor on.
Consider a product doing one billion queries a month. Modest, by the standard of anything embedded in a consumer surface. At one cent per query, that is ten million dollars a month, a hundred and twenty million dollars a year. At a tenth of a cent per query, it is twelve million a year. The training cost is paid once and amortised. The inference cost is paid every day, forever, and it scales with success.
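The arithmetic above is easy to check. Here is a minimal sketch in Python. The function names are hypothetical, and the one-hundred-million-dollar training figure is an assumed value picked from inside the range quoted earlier, not a measured number.

```python
from decimal import Decimal

MONTHS_PER_YEAR = 12


def annual_inference_cost(queries_per_month: int, cost_per_query_usd: str) -> Decimal:
    """Yearly inference spend in dollars for a steady monthly query volume."""
    # Decimal keeps the cents exact; float would introduce rounding noise.
    return Decimal(queries_per_month) * Decimal(cost_per_query_usd) * MONTHS_PER_YEAR


def months_until_inference_matches_training(
    training_cost_usd: int, queries_per_month: int, cost_per_query_usd: str
) -> Decimal:
    """Months of serving after which cumulative inference spend equals the training bill."""
    monthly_spend = Decimal(queries_per_month) * Decimal(cost_per_query_usd)
    return Decimal(training_cost_usd) / monthly_spend


if __name__ == "__main__":
    queries = 1_000_000_000  # one billion queries a month, as in the text

    at_one_cent = annual_inference_cost(queries, "0.01")      # one cent per query
    at_tenth_cent = annual_inference_cost(queries, "0.001")   # a tenth of a cent per query
    print(f"annual cost at 1 cent/query:   ${at_one_cent:,.0f}")
    print(f"annual cost at 0.1 cent/query: ${at_tenth_cent:,.0f}")

    # ASSUMPTION: a 100 million dollar training run, chosen only for illustration.
    assumed_training_cost = 100_000_000
    months = months_until_inference_matches_training(assumed_training_cost, queries, "0.01")
    print(f"months until inference spend matches training: {months:.1f}")
```

Under that assumed training cost, a single product at one cent per query spends as much on inference as the training run cost in under a year, which is the sense in which the operating bill dominates.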
This is the part of the AI economy that is undertheorised in public. The training bill is a sunk capital expense that a handful of labs absorb. The inference bill is an operating expense that every product team inherits the moment they ship. It decides which products are viable, which features can be turned on by default, which user segments are economically reachable, and, quietly, which startups survive their own growth.
Move a query from one cent to one tenth of a cent and the product changes shape. Features that were rationed become defaults. Free tiers become defensible. Latency budgets relax enough to allow a second model call inside the same user interaction, which is often where the actual product lives. The unit economics flip from managed scarcity to abundance, and abundance is what general-purpose technologies eventually require to become infrastructure. Electricity did not become civilisationally important at a dollar per kilowatt-hour.
The work to get there is unglamorous. It is kernel engineering, compiler passes, scheduler design, memory hierarchy discipline, quantisation that respects accuracy, batching that respects tail latency. It is the same posture I learned writing algorithms and firmware at EMotorad and shipping algorithms at Ola Electric: every joule, every cycle, every millisecond is a line item, and the system is only as good as the worst of them.
CoreOptX is built around this shift. Not the training frontier. That race is well-funded and well-staffed. The deployment frontier: the compute and compiler infrastructure that decides what a query costs once the model is already trained. The training bill announces the technology. The inference bill decides whether it gets to be everywhere.