GPT-4o mini token & cost calculator
OpenAI's GPT-4o mini was, for a long stretch, the default cheap workhorse: the model that made high-volume classification, extraction, tagging, and routing economically boring. At $0.15 per million input tokens and $0.60 per million output tokens it is still inexpensive, but it is now a legacy model: GPT-5 Nano undercuts it on price, and the GPT-4.1 generation offers a longer context window than its 128k.
Like full GPT-4o, its relevance in 2026 is mostly incumbency. Enormous volumes of batch and pipeline traffic are still pointed at gpt-4o-mini because it was the obvious choice when those systems were built and it has kept working. The useful question this calculator answers: at your volume, is the gap to GPT-5 Nano big enough to justify re-validating the pipeline?
Verify privacy
Open DevTools → Network. Type into the calculator. No request bodies should contain your prompt text.
Pricing
Pricing is flat: one rate at every context length, with no long-context tier. Because this model lives in high-volume workloads, the per-token rate compounds fast. A 3x input-price gap to GPT-5 Nano looks trivial per call, but it is a real line item across millions of requests. The calculator above is most useful here when you multiply its per-call number by your actual request volume.
| Tier | Input $/M | Output $/M |
|---|---|---|
| All input | $0.15 | $0.60 |

Context window: 128,000 tokens.
Verified against openai.com on 2026-05-18.
Worked examples
These per-call costs are small enough to look negligible — which is exactly the trap with a high-throughput model. The number that matters isn't the cost of one call, it's that number times your monthly request count.
| Scenario | Input tokens | Output tokens | Cost |
|---|---|---|---|
| Short chat turn: a typical Q&A turn with a small system prompt. | 800 | 400 | <$0.01 ($0.00036) |
| System prompt + tool spec: a larger context with a tool schema, single response. | 5,000 | 500 | <$0.01 ($0.00105) |
| Long document Q&A: a long-form input (e.g. transcript) with a structured response. | 50,000 | 1,500 | <$0.01 ($0.0084) |
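The arithmetic behind the table can be sketched in a few lines. This is a minimal illustration, assuming flat rates with no caching or batch discounts. The `CALLS_PER_MONTH` figure is a hypothetical volume chosen for illustration, not a number from this page.

```python
# Rates from the pricing table on this page, in USD per million tokens.
INPUT_USD_PER_M = 0.15
OUTPUT_USD_PER_M = 0.60


def cost_per_call(input_tokens: int, output_tokens: int) -> float:
    """Flat-rate cost of one call in USD (no context tier, no discounts)."""
    return (input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M) / 1_000_000


# Token counts copied from the worked-examples table.
SCENARIOS = {
    "Short chat turn": (800, 400),
    "System prompt + tool spec": (5_000, 500),
    "Long document Q&A": (50_000, 1_500),
}

# Hypothetical volume; substitute your own monthly request count.
CALLS_PER_MONTH = 10_000_000

for name, (inp, out) in SCENARIOS.items():
    per_call = cost_per_call(inp, out)
    print(f"{name}: ${per_call:.5f} per call, ${per_call * CALLS_PER_MONTH:,.0f} per month")
```

At the assumed 10 million calls a month, the three scenarios come to roughly $3,600, $10,500, and $84,000 respectively, which is the scale at which the per-call figure stops being negligible.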
A useful pattern for legacy mini-tier models: decide migrations on aggregate spend, not per-call cost. If GPT-4o mini is handling a few thousand calls a month, the price gap to GPT-5 Nano is rounding error and not worth the re-eval. If it's handling tens of millions, the same gap funds the migration several times over. Price your real volume before deciding either way.
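One way to make that decision concrete is a payback calculation. The sketch below is illustrative only and rests on several assumptions: this page does not list GPT-5 Nano's rates, so the input rate is inferred from the 3x gap mentioned under Pricing and the output rate is a placeholder to check against current pricing. The call shape and the $4,000 re-validation cost are likewise hypothetical.

```python
# Rates in USD per million tokens, as (input, output).
GPT_4O_MINI = (0.15, 0.60)  # from the pricing table on this page
# ASSUMPTION: this page does not list GPT-5 Nano's rates. The input rate is
# inferred from the "3x input-price gap" (0.15 / 3); the output rate is a
# placeholder. Check both against current pricing before relying on them.
GPT_5_NANO = (0.05, 0.40)


def monthly_cost(rates, calls, input_tokens, output_tokens):
    """Monthly spend in USD for a fixed per-call token shape at flat rates."""
    input_rate, output_rate = rates
    per_call = (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000
    return calls * per_call


def monthly_saving(calls, input_tokens, output_tokens):
    """USD saved per month by moving the same traffic to the cheaper model."""
    return (monthly_cost(GPT_4O_MINI, calls, input_tokens, output_tokens)
            - monthly_cost(GPT_5_NANO, calls, input_tokens, output_tokens))


def payback_months(calls, input_tokens, output_tokens, revalidation_cost_usd):
    """Months of savings needed to cover a one-off re-validation cost."""
    saving = monthly_saving(calls, input_tokens, output_tokens)
    return float("inf") if saving <= 0 else revalidation_cost_usd / saving


# Hypothetical inputs: an 800-in / 400-out call shape and a $4,000 re-validation effort.
for calls in (5_000, 50_000_000):
    saving = monthly_saving(calls, 800, 400)
    months = payback_months(calls, 800, 400, 4_000)
    print(f"{calls:>11,} calls/month: saves ${saving:,.2f}/month, payback in {months:,.1f} months")
```

Under these assumed numbers, 5,000 calls a month saves about $0.80 monthly and never realistically repays the effort, while 50 million calls a month saves about $8,000 and covers the re-validation in roughly half a month.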
How is this counted?
GPT-4o mini uses OpenAI's canonical o200k_base tokenizer. We count via gpt-tokenizer (MIT) — same exact vocab, calibration factor 1.0. Inputs over 50,000 characters tokenize in a Web Worker so the page stays responsive.
FAQ
Is GPT-4o mini still worth using in 2026?
How does it compare to GPT-5 Nano?
Is the token count exact?
Should I use the Batch API for this model?
Does my prompt leave the browser?
Compare against every other model
To see this exact prompt scored against every supported model, sorted by total cost, paste it into the home calculator and toggle Compare across all models. For a high-volume legacy model this is the fastest way to see whether GPT-5 Nano's lower rate actually moves your number enough to act on.
Related models
The comparisons that decide things here: GPT-5 Nano (the cheaper current replacement), GPT-4.1 Mini (when you also need long context), and full GPT-4o (the in-family step up when mini-tier quality isn't holding).