GPT-5 Mini token & cost calculator
OpenAI GPT-5 Mini is the workhorse of the GPT-5 family — the model designed to handle the 80% of production AI traffic that doesn't need flagship reasoning. At $0.25 input / $2 output per million tokens, it sits in the same per-token tier as Claude 4.5 Haiku and Gemini 2.5 Flash, with the GPT-5 family's quality genealogy and the same 400,000-token context window as full GPT-5.
The math that makes Mini economic: at these prices, the model bill stops being the primary cost conversation. A short chat turn costs fractions of a cent. A 5,000-token system prompt at 100,000 daily requests rounds to $125/day in input cost — defensible for any AI feature with even modest engagement. The lever to optimize at this price level is correctness and reliability of routing, not cost-per-token.
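The arithmetic above can be written down in a few lines. Prices are the page's published rates; the helper name is illustrative:

```python
# Back-of-envelope cost math at GPT-5 Mini's published rates:
# $0.25 per million input tokens, $2 per million output tokens.
INPUT_PER_M = 0.25
OUTPUT_PER_M = 2.00

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of a single request at Mini's rates."""
    return input_tokens / 1e6 * INPUT_PER_M + output_tokens / 1e6 * OUTPUT_PER_M

# A short chat turn (800 in / 400 out) costs a tenth of a cent:
print(f"${request_cost(800, 400):.4f}")

# The system-prompt example: 5,000 input tokens x 100,000 daily requests.
daily_input_tokens = 5_000 * 100_000
print(f"${daily_input_tokens / 1e6 * INPUT_PER_M:.2f}/day")
```

At this scale the input side dominates, which is why trimming the system prompt matters more than trimming responses.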
Saved scenarios (none yet)
Saved on this browser only — never uploaded. Up to 10 scenarios.
Tip: save a scenario when you have a prompt + model + response length you might revisit. Useful for sizing features before committing to a vendor.
Verify privacy (since this page loaded; updates live)
Open DevTools → Network. Type into the calculator. No request bodies should contain your prompt text.
Pricing
Mini is flat-priced. The 8× output-to-input price ratio mirrors the rest of the GPT-5 family, so the same prompt-engineering instincts transfer up and down the tier.
| Tier | Input $/M | Output $/M |
|---|---|---|
| All usage | $0.25 | $2.00 |

Context window: 400,000 tokens.
Verified against openai.com on 2026-05-09.
Worked examples
Scenarios at three realistic prompt sizes. Mini's price floor means even the long-document Q&A scenario stays well under a dime per request.
| Scenario | Input | Output | Cost |
|---|---|---|---|
| Short chat turn: a typical Q&A turn with a small system prompt | 800 | 400 | <$0.01 |
| System prompt + tool spec: a large system prompt with a tool schema, single response | 5,000 | 500 | <$0.01 |
| Long document Q&A: a long-form input (e.g. a transcript) with a structured response | 50,000 | 1,500 | ~$0.016 |
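The table's figures can be recomputed directly from the rates. Scenario sizes are the page's; the function name is illustrative:

```python
# Recompute the worked-example table at Mini's rates:
# $0.25/M input, $2/M output.
INPUT_PER_M = 0.25
OUTPUT_PER_M = 2.00

def cost(input_tokens: int, output_tokens: int) -> float:
    return input_tokens * INPUT_PER_M / 1e6 + output_tokens * OUTPUT_PER_M / 1e6

scenarios = {
    "short chat turn": (800, 400),
    "system prompt + tool spec": (5_000, 500),
    "long document Q&A": (50_000, 1_500),
}
for name, (inp, out) in scenarios.items():
    print(f"{name}: ${cost(inp, out):.4f}")
```

Even the 50,000-token document comes out at about a cent and a half per request, which is the point: the bill is dominated by volume, not by any single call.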
A useful framing: Mini is where you should default until you have evidence you need GPT-5. Production teams that ship cost-aware routing typically run 70–90% of traffic on Mini and reserve GPT-5 for an explicit "hard request" path — usually triggered by content classifier output or a task-difficulty heuristic. If you're running 100% on GPT-5 because it's "the smart one," you're likely overspending by 3–4× without quality lift.
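The routing pattern described above can be sketched in a few lines. The difficulty heuristic here is a hypothetical placeholder; production systems typically use a classifier or task metadata rather than keyword checks:

```python
# Cost-aware router sketch: default to Mini, escalate to GPT-5 only on
# an explicit "hard request" signal. The heuristic below is a stand-in
# for a real content classifier.
HARD_MARKERS = ("prove", "multi-step", "synthesize", "ambiguous")

def pick_model(prompt: str) -> str:
    looks_hard = len(prompt) > 8_000 or any(
        marker in prompt.lower() for marker in HARD_MARKERS
    )
    return "gpt-5" if looks_hard else "gpt-5-mini"

print(pick_model("Classify this support ticket into one of five categories."))
```

The key design choice is the default direction: traffic falls through to Mini unless something affirmatively flags it as hard, which is what keeps the 70–90% split in Mini's favor.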
How is this counted?
Mini uses OpenAI's canonical o200k_base tokenizer, shipped via the MIT-licensed gpt-tokenizer package. The count is exact — calibration factor 1.0, no approximation. Inputs over 50,000 characters tokenize in a Web Worker.
FAQ

When is Mini good enough?
For extraction, classification, summarization, structured rewriting, schema-validated output, and most agent-loop plumbing. Mini handles the majority of production traffic in well-engineered systems. Where it strains: multi-step ambiguous reasoning, long-document synthesis without retrieval, and tasks where output prose quality is the product. Promote those to GPT-5 (or to GPT-4.1 if context length is the binding constraint).

How accurate is the token count?
Exact. gpt-tokenizer ships OpenAI's canonical o200k_base vocab, so the number you see matches what OpenAI bills. No approximation, no calibration factor.

What's the right max_tokens for Mini-driven workloads?
Lower than you'd set for GPT-5. Mini's output rate is $2/M, 5× cheaper than GPT-5's, but at high volume even small savings compound. For structured-output workloads, set max_tokens to roughly 1.5× your expected schema size and validate the response.

Does Mini support the same context as GPT-5?
Yes: 400,000 tokens. The full GPT-5 family shares the same context window, so you don't lose long-context capability when routing to Mini.

How does GPT-5 Mini compare to Claude 4.5 Haiku?
They occupy the same niche: frontier-quality cheap models for production plumbing. Haiku is more expensive on output ($5/M vs Mini's $2/M), so Mini's price advantage is real. But the decision should come from your eval set: instruction-following, structured-output reliability, and your specific task mix matter more than the price gap.
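The 1.5× rule of thumb for structured output mentioned in the FAQ is simple enough to write down. The expected schema size is something you measure from sample outputs; the helper name is illustrative:

```python
import math

# Size max_tokens for structured-output workloads: measured typical
# schema size plus ~50% headroom, per the rule of thumb above.
def max_tokens_for(expected_schema_tokens: int, headroom: float = 1.5) -> int:
    return math.ceil(expected_schema_tokens * headroom)

# A response schema that typically serializes to ~400 tokens:
print(max_tokens_for(400))
```

Pair the cap with response validation: a truncated JSON payload should fail loudly and retry, not pass downstream.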
Compare against every other model
To see this exact prompt scored against every supported model, sorted by total cost, paste it into the home calculator and toggle Compare across all models. GPT-5 Mini numbers are exact.
Related models
The two most useful comparisons: GPT-5 (when Mini's quality bar isn't enough) and Claude 4.5 Haiku (the cross-vendor budget tier with a similar capability profile).