GPT-5 Mini token & cost calculator
OpenAI GPT-5 Mini is the workhorse of the GPT-5 family — the model designed to handle the 80% of production AI traffic that doesn't need flagship reasoning. At $0.25 input / $2 output per million tokens, it sits in the same per-token tier as Claude 4.5 Haiku and Gemini 2.5 Flash, with the GPT-5 family's quality genealogy and the same 400,000-token context window as full GPT-5.
The math that makes Mini economic: at these prices, the model bill stops being the primary cost conversation. A short chat turn costs fractions of a cent. A 5,000-token system prompt at 100,000 daily requests rounds to $125/day in input cost — defensible for any AI feature with even modest engagement. The lever to optimize at this price level is correctness and reliability of routing, not cost-per-token.
Saved scenariosnone yet
Saved on this browser only — never uploaded. Up to 10 scenarios.
Tip: save a scenario when you have a prompt + model + response length you might revisit. Useful for sizing features before committing to a vendor.
Verify privacysince this page loaded — updates live
Open DevTools → Network. Type into the calculator. No request bodies should contain your prompt text.
Pricing
Mini is flat-priced. Note the 8× input/output ratio mirrors the rest of the GPT-5 family — the same prompt-engineering instincts transfer up and down the tier.
| Tier | Input $/M | Output $/M |
|---|---|---|
| All input | $0.25 | $2 |
| Context window | 400,000 tokens | |
Verified against openai.com on 2026-05-09.
Worked examples
Scenarios at three realistic prompt sizes. Mini's price floor means even the long-document Q&A scenario stays well under a dime per request.
| Scenario | Input | Output | Cost |
|---|---|---|---|
Short chat turn A typical Q&A turn with a small system prompt. | 800 | 400 | <$0.01 |
System prompt + tool spec A larger context window with a tool schema, single response. | 5,000 | 500 | <$0.01 |
Long document Q&A A long-form input (e.g. transcript) with a structured response. | 50,000 | 1,500 | $0.015 |
A useful framing: Mini is where you should default until you have evidence you need GPT-5. Production teams that ship cost-aware routing typically run 70–90% of traffic on Mini and reserve GPT-5 for an explicit "hard request" path — usually triggered by content classifier output or a task-difficulty heuristic. If you're running 100% on GPT-5 because it's "the smart one," you're likely overspending by 3–4× without quality lift.
How is this counted?
Mini uses OpenAI's canonical o200k_base tokenizer, shipped via the MIT-licensed gpt-tokenizer package. The count is exact — calibration factor 1.0, no approximation. Inputs over 50,000 characters tokenize in a Web Worker.
FAQ
When is Mini good enough?
How accurate is the token count?
What's the right max_tokens for Mini-driven workloads?
Does Mini support the same context as GPT-5?
How does GPT-5 Mini compare to Claude 4.5 Haiku?
Compare against every other model
To see this exact prompt scored against every supported model, sorted by total cost, paste it into the home calculator and toggle Compare across all models. GPT-5 Mini numbers are exact.
Related models
The two most useful comparisons: GPT-5 (when Mini's quality bar isn't enough) and Claude 4.5 Haiku (the cross-vendor budget tier with a similar capability profile).