Gemini 2.5 Pro token & cost calculator
Google Gemini 2.5 Pro is the long-context flagship of the Gemini 2.5 family. The pricing structure is the most interesting thing about it from a budgeting perspective: a tiered model where input and output rates roughly double once your input exceeds 200,000 tokens. For workloads that fit under that threshold, Pro is competitive with Claude Sonnet on price. Above that threshold, the math shifts — you're paying for a capability (large context) that other vendors price differently or don't offer at all at this rate.
The right way to use this calculator is to plug in a realistic prompt size and watch which tier the cost lands in. The result card automatically applies the correct rate, so the dollar figure reflects what Google would actually bill — not a flat per-million approximation that would be misleading at the upper tier.
Saved scenarios (none yet)
Saved on this browser only — never uploaded. Up to 10 scenarios.
Tip: save a scenario when you have a prompt + model + response length you might revisit. Useful for sizing features before committing to a vendor.
Verify privacy (since this page loaded; updates live)
Open DevTools → Network. Type into the calculator. No request bodies should contain your prompt text.
Pricing
The tier boundary at 200,000 input tokens is the structural fact to internalize. Both input and output rates roughly double above that line, and the higher rate applies to the entire request, not just the overflow: a prompt that crosses 200k reprices every one of its tokens, so the cost jump at the boundary is larger than the per-token rates alone suggest.
| Tier | Input $/M | Output $/M |
|---|---|---|
| ≤ 200,000 input tokens | $1.25 | $10 |
| > 200,000 input tokens | $2.50 | $15 |

Context window: 1,000,000 tokens.
Verified against ai.google.dev on 2026-05-09.
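The tier logic can be sketched as a small pure function. This is a minimal illustration using the rates from the table above; the constant and function names are ours, not part of any Google SDK.

```typescript
// Published rates from the table above, in USD per million tokens.
// TIER_BOUNDARY, RATES, and requestCostUSD are illustrative names.
const TIER_BOUNDARY = 200_000;
const RATES = {
  low: { inputPerM: 1.25, outputPerM: 10 },  // input ≤ 200k tokens
  high: { inputPerM: 2.5, outputPerM: 15 },  // input > 200k tokens
};

// The tier is chosen by input size alone, and the chosen tier's
// rates then apply to both the input and the output of the request.
function requestCostUSD(inputTokens: number, outputTokens: number): number {
  const tier = inputTokens <= TIER_BOUNDARY ? RATES.low : RATES.high;
  return (inputTokens * tier.inputPerM + outputTokens * tier.outputPerM) / 1_000_000;
}
```

For example, an 800,000-token input with a 2,000-token response lands in the upper tier: (800,000 × $2.50 + 2,000 × $15) / 1M ≈ $2.03.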
Worked examples
The scenarios below stay inside the lower tier, where Pro is most price-competitive. For a long-context workload that crosses 200k tokens — say a full transcript or a codebase scan — paste your real input above and the calculator will apply the upper-tier rate automatically.
| Scenario | Input | Output | Cost |
|---|---|---|---|
| Short chat turn — a typical Q&A turn with a small system prompt. | 800 | 400 | <$0.01 |
| System prompt + tool spec — a larger prompt with a tool schema, single response. | 5,000 | 500 | $0.011 |
| Long document Q&A — a long-form input (e.g. a transcript) with a structured response. | 50,000 | 1,500 | $0.077 |
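The three rows above can be reproduced directly from the lower-tier rates. A quick sanity check, assuming the $1.25 / $10 per-million figures from the pricing table:

```typescript
const INPUT_PER_M = 1.25; // $/M input, lower tier
const OUTPUT_PER_M = 10;  // $/M output, lower tier

// Scenario sizes taken from the worked-examples table above.
const scenarios = [
  { name: "Short chat turn", input: 800, output: 400 },
  { name: "System prompt + tool spec", input: 5_000, output: 500 },
  { name: "Long document Q&A", input: 50_000, output: 1_500 },
];

for (const s of scenarios) {
  const cost = (s.input * INPUT_PER_M + s.output * OUTPUT_PER_M) / 1_000_000;
  console.log(`${s.name}: $${cost.toFixed(4)}`);
}
```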
A useful instinct: decide the tier before you decide the model. If your average request is 5,000 tokens, Pro is the wrong frame — comparing it head-to-head against Claude Sonnet at chat-sized prompts misses what makes Pro distinctive. If your average request is 350,000 tokens, the comparison set is "Pro vs. retrieval-augmented Sonnet vs. Opus with chunking," and the right answer depends on whether your task tolerates retrieval losses.
How is this counted?
We approximate Gemini's tokenizer with o200k_base from js-tiktoken (MIT). The o200k family is the closest public encoding to the modern frontier-model tokenizer style; drift on Gemini specifically is typically in the ~3% range on English. For code-heavy inputs (especially languages with unusual whitespace conventions) the approximation can be looser — pad your budget by 5% if the workload is code-dominated. Inputs over 50,000 characters tokenize in a Web Worker so the page stays responsive.
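If you budget programmatically, the padding advice above can be folded into a tiny helper. This is purely illustrative: the 3% and 5% figures are this page's rough drift estimates, not measured guarantees, and the function name is ours.

```typescript
// Pad a raw o200k_base token count to absorb tokenizer drift
// versus Gemini's real tokenizer: ~3% for natural language,
// ~5% for code-heavy input (the estimates from the prose above).
function paddedBudget(rawTokens: number, codeHeavy: boolean): number {
  const pad = codeHeavy ? 0.05 : 0.03;
  return Math.round(rawTokens * (1 + pad)); // nearest whole token
}
```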
FAQ
- **How does the tiered pricing work?** Gemini 2.5 Pro charges one rate up to 200,000 input tokens and a higher rate above that; both input and output rates roughly double in the upper tier, and the whole request is billed at whichever tier its input lands in. The calculator above selects the right tier automatically based on your input size, so the cost number reflects what you would actually be billed.
- **Is the 1M context window really usable?** Yes, but cost scales sharply. A single request at the upper end of the context window — say 800,000 input tokens with a 2,000-token response — costs ~$2 just for the input at the high tier. The window is genuinely useful for jobs that need it (whole-codebase Q&A, transcript analysis), but it is not the right tool for routine chat.
- **How is the token count approximated?** Google does not publish a JavaScript tokenizer for Gemini, so we approximate using o200k_base (via js-tiktoken) — the closest publicly available encoding for the modern frontier-model token family. The drift is typically within ~3% on natural language, slightly more on code-heavy inputs. Treat the result as a budgeting estimate.
- **When should I prefer Gemini 2.5 Pro over Claude 4.5 Sonnet?** When your prompts routinely exceed 200,000 tokens, when you need multimodal input handling that Gemini does well, or when your eval set shows Gemini outperforming on your specific task. At typical chat-sized prompts, Sonnet is meaningfully cheaper; the Gemini Pro story is "I have a lot of context and I want it cheap on the input side."
- **Does my prompt leave the browser?** No. Tokenization runs entirely client-side. There is no server endpoint that ever sees prompt content. The only serverless function on this site is /api/og, used for social preview images, and it only accepts title and subtitle query strings.
Compare against every other model
To see this exact prompt scored against every supported model, sorted by total cost, paste it into the home calculator and toggle Compare across all models. Pro's tier surcharge above 200,000 input tokens is applied automatically when relevant.
Related models
The two most useful comparisons: Gemini 2.5 Flash (the budget tier within the same family) and Claude 4.5 Sonnet (the cross-vendor mid-range that Pro is most often compared to at chat-sized prompts).