Pricing thought: OpenAI will price reasoning tokens in o1
What is the cost of thought?
OpenAI has shared some interesting context on how it will price its o1 reasoning models. o1 is based on the Chain of Thought framework, in which reasoning is done step by step and the steps are revealed. Understanding and using Chain of Thought and related approaches has become key to the effective use of generative AI for many use cases, and OpenAI is doing important work here (I wonder whether this would not be more effective with ReAct and other alternatives).
Much of human thought is hidden from the thinker. It takes special training and attention to be able to understand how one is thinking about a problem and the steps one is taking to explore, expand, focus, and propose a solution. This is central to analytic philosophy and the adjacent work of Ludwig Wittgenstein and it is part of the training in most professional fields, from legal reasoning, to design (do we need a Chain of Design framework?), to engineering, to business management. At Ibbaka we teach ourselves and our clients to reason about value, packaging, and pricing and how to connect them to strategy.
The hidden nature of thought has been one of the barriers to the more widespread adoption of Chain of Thought approaches to AI prompting. In o1, OpenAI is making this more widely available and multiplying its power. Last week we asked how OpenAI might price this. Let’s look at the approach they are actually taking.
The above image comes from OpenAI’s page on Reasoning Models. There are a few things to think about here before we get to pricing.
o1 runs multiple steps and does a lot more work than a single prompt to a conventional foundation model. It will generally have a higher computing cost. It is also likely that many reasoning chains will exceed the context window. (Will OpenAI and other vendors offer larger context windows in premium packages?) But the most interesting thing here is ‘reasoning tokens.’
What is a reasoning token?
Reasoning tokens represent the internal "thought process" of o1 models as they work through complex problems. They are generated by the model during its reasoning process but are not visible in the final output.
The o1 models use these reasoning tokens to "think" by:
Breaking down their understanding of the prompt
Considering multiple approaches to generating a response
Working through the problem step-by-step
Reasoning tokens count against the context window (crowding out other content and so constraining performance) and are billed as output tokens.
OpenAI o1 Pricing
In last week’s post How should OpenAI price o1, we suggested that there should be no premium for accessing o1 through ChatGPT, since part of the strategy is to use o1 to generate data that can be used to train future models. On the other hand, we suggested that customized, industrial applications should be priced at a premium, as reasoning models offer a lot more value. This seems to be directionally correct.
The two o1 models are priced as follows:
OpenAI o1-preview
Input Tokens: $15.00 per million tokens
Output Tokens: $60.00 per million tokens
OpenAI o1-mini
Input Tokens: $3.00 per million tokens
Output Tokens: $12.00 per million tokens
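Given the prices above, the cost of a single API call can be sketched as follows. This is an illustrative calculation, not OpenAI's billing code; the token counts in the example are hypothetical, and the key point is that hidden reasoning tokens are added to the output count before pricing.

```python
# Per-million-token prices from OpenAI's published o1 pricing.
PRICES_PER_MILLION = {
    "o1-preview": {"input": 15.00, "output": 60.00},
    "o1-mini": {"input": 3.00, "output": 12.00},
}

def call_cost(model: str, input_tokens: int, output_tokens: int,
              reasoning_tokens: int) -> float:
    """Return the dollar cost of one call.

    Reasoning tokens are invisible in the response but are billed
    at the output rate, so they are added to the output count.
    """
    p = PRICES_PER_MILLION[model]
    billed_output = output_tokens + reasoning_tokens
    return (input_tokens * p["input"] + billed_output * p["output"]) / 1_000_000

# Hypothetical call: 2,000 input tokens, 500 visible output tokens,
# and 5,000 hidden reasoning tokens.
cost = call_cost("o1-preview", 2_000, 500, 5_000)
print(f"${cost:.2f}")  # the hidden reasoning tokens dominate the bill
```

Note that in this example the 5,000 hidden reasoning tokens cost ten times as much as the 500 tokens the user actually sees.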
With reasoning models, operating costs will be higher, and differentiated value had better be higher too
The initial reasoning models are being provided at a high premium. Comparing o1-preview with GPT-4o, there is a 3X premium for input tokens and a 4X premium for output tokens. And remember, all those reasoning tokens are counted as output tokens. The computing costs for people building on these models are going to be a lot higher than for those using conventional foundation models, so they will have to offer differentiated value that is correspondingly higher. To manage spend, applications will want to know when they need a reasoning model and when a more commoditized model will do.
Note that output tokens are much more expensive than input tokens. In many of the likely applications of reasoning models, there will be a lot more input than output, so this price difference can make a big difference to compute costs. By counting the hidden reasoning tokens as output tokens, OpenAI is pricing much higher than if it had counted them as input tokens.
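The effect of this billing choice is easy to quantify. A small illustration, using o1-preview's rates and a hypothetical count of hidden reasoning tokens:

```python
# How much more a call costs when hidden reasoning tokens are billed
# at the output rate rather than the input rate. Token count is
# hypothetical; rates are o1-preview's ($15/M input, $60/M output).

INPUT_RATE = 15.00 / 1_000_000   # dollars per input token
OUTPUT_RATE = 60.00 / 1_000_000  # dollars per output token

reasoning_tokens = 10_000  # hypothetical hidden reasoning in one call

as_output = reasoning_tokens * OUTPUT_RATE  # how they are actually billed
as_input = reasoning_tokens * INPUT_RATE    # the counterfactual

print(f"billed as output: ${as_output:.2f}")
print(f"billed as input:  ${as_input:.2f}")
print(f"premium: {as_output / as_input:.0f}x")
```

At these rates, every hidden reasoning token costs four times what it would if it were treated as input, which is why the opacity around reasoning-token counts matters so much for cost management.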
There has been commentary on LinkedIn and Reddit that reasoning tokens make OpenAI’s pricing much less transparent. The API does not tell you how many reasoning tokens you are consuming, which will make it much more difficult to optimize reasoning and reduce the cost of thought.
Prediction: in future versions, OpenAI will include the number of reasoning tokens being generated/used at each step, and optimization applications to manage and optimize this will emerge. These are likely to be in the market in late 2H 2025.