
The cost of digital intelligence has officially collapsed. A new industry report tracking the price-performance of large language models reveals that the cost to access PhD-level reasoning has plummeted by over 99% in just two years, sparking what tech leaders are calling a “wicked race to the bottom”.
The report focuses on the rigorous GPQA Diamond benchmark, a test designed to measure expert-level knowledge in biology, physics, and chemistry.
Two years ago, reaching a 50% score on this test with early iterations of GPT-4 cost roughly $37.50 per million tokens. Today, the landscape is unrecognizable.
The New Price of Intelligence
According to the data, new variants like xAI’s Grok 4 are now scoring near 70% on the same benchmark at a cost of just $0.28 per million tokens. This represents a staggeringly steep decline in price paired with a massive leap in capability.
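For readers who want to check the headline figure, the decline can be worked out directly from the two prices quoted above. The short Python sketch below is illustrative only: the helper name `price_drop` is made up for this example, and it compares list prices for two different models rather than like-for-like workloads.

```python
def price_drop(old_price: float, new_price: float) -> tuple[float, float]:
    """Return (percent decline, fold reduction) between two prices."""
    percent_decline = (old_price - new_price) / old_price * 100
    fold_reduction = old_price / new_price
    return percent_decline, fold_reduction


# Prices quoted in the article, in USD per million tokens.
GPT4_EARLY = 37.50
GROK_4 = 0.28

pct, fold = price_drop(GPT4_EARLY, GROK_4)
print(f"{pct:.2f}% cheaper, roughly {fold:.0f}x less per million tokens")
```

On these numbers the drop is a little over 99%, in line with the report's headline claim.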
Google is pushing the floor even lower. Its Gemini 2.5 Flash-Lite model is reportedly hitting “strong marks” at a rock-bottom price of $0.10 per million tokens, while the more powerful Gemini 3 Flash sits at $0.50.
A Wicked Race

Salesforce CEO Marc Benioff has characterized this trend as a “wicked race to the bottom”. While the plummeting prices are squeezing profit margins for model providers, they are igniting a frenzy among developers.
The commoditization of “AI smarts” means that high-level reasoning is no longer a premium luxury. Builders are now deploying autonomous agents for complex tasks in healthcare, coding, and scientific research, applications that were previously too expensive to run at scale.
However, the report notes a paradox: while per-token prices are dropping, overall spending is rising. The cheap access has encouraged “heavier use,” leading businesses to integrate AI into every facet of their operations.
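The paradox is simple arithmetic: total spend is price multiplied by volume, so a bill rises whenever usage grows faster than the price falls. The sketch below uses the two per-million-token prices quoted earlier in the article; the token volumes and the `spend` helper are hypothetical, chosen only to illustrate the effect, and do not come from the report.

```python
def spend(price_per_million: float, tokens: float) -> float:
    """Total cost in USD for a given token volume."""
    return price_per_million * tokens / 1_000_000


OLD_PRICE = 37.50  # USD per million tokens (early GPT-4, per the article)
NEW_PRICE = 0.28   # USD per million tokens (Grok 4, per the article)

# Hypothetical monthly volumes, not taken from the report.
old_tokens = 10_000_000       # a small pilot project
new_tokens = 2_000_000_000    # AI built into everyday operations (200x more)

old_bill = spend(OLD_PRICE, old_tokens)
new_bill = spend(NEW_PRICE, new_tokens)

# Usage must grow by more than this factor before the bill goes up.
break_even_growth = OLD_PRICE / NEW_PRICE

print(f"Old bill: ${old_bill:,.2f}  New bill: ${new_bill:,.2f}")
print(f"Spend rises once usage grows more than {break_even_growth:.0f}x")
```

Under these assumed volumes, a 200-fold jump in usage outpaces the price cut, so the monthly bill ends up higher even though each token costs far less.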



