Is Claude 4.5 Sonnet or Gemini 2.5 Flash cheaper?

For a typical request (10,000 input + 2,000 output tokens), Gemini 2.5 Flash is cheaper: about 87% less ($0.008 vs $0.06 per request), or roughly $5,200 saved per 100,000 requests. Claude 4.5 Sonnet runs $3/$15 per 1M input/output tokens; Gemini 2.5 Flash runs $0.30/$2.50.

Which has the larger context window?

Gemini 2.5 Flash, at 1,000,000 tokens versus 200,000 for Claude 4.5 Sonnet.

How accurate are these token counts?

Both counts are approximations. Claude 4.5 Sonnet is approximated with cl100k_base, with drift typically under 2% on English and code. Gemini 2.5 Flash is approximated with o200k_base, with drift typically around 3% on English and code. The dollar math itself is exact once the token count is known.
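To see what that drift means in dollars, here is a minimal Python sketch. It assumes the drift can be treated as a symmetric bound around the estimated count (2% for Claude 4.5 Sonnet, 3% for Gemini 2.5 Flash), which is a simplification of "typically". The function name `cost_range` is hypothetical and used only for illustration.

```python
def cost_range(estimated_tokens, price_per_million, drift):
    """Return (low, high) cost in USD, assuming the true token count
    lies within +/- drift of the estimated count (an assumption)."""
    base = estimated_tokens * price_per_million / 1_000_000
    return base * (1 - drift), base * (1 + drift)


# 10,000 estimated input tokens at the input prices quoted on this page.
claude_low, claude_high = cost_range(10_000, 3.00, 0.02)
gemini_low, gemini_high = cost_range(10_000, 0.30, 0.03)

print(f"Claude 4.5 Sonnet input cost: ${claude_low:.4f} to ${claude_high:.4f}")
print(f"Gemini 2.5 Flash input cost:  ${gemini_low:.5f} to ${gemini_high:.5f}")
```

Under these assumptions the uncertainty is a fraction of a cent per request, far smaller than the price gap between the two models.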

Claude 4.5 Sonnet vs Gemini 2.5 Flash: pricing & cost comparison

On input tokens, Gemini 2.5 Flash is the cheaper of the two: 90% less per million ($0.30 vs $3). On output, it is 83% cheaper ($2.50 vs $15). Output tokens are priced 5x higher than input tokens on Claude 4.5 Sonnet and about 8x higher on Gemini 2.5 Flash, so output often dominates the bill, and the 83% figure is the one that matters most for output-heavy workloads.
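How much of a bill comes from output depends on the token mix. The sketch below computes that share in Python, using the per-million prices quoted here and an assumed mix of 10,000 input and 2,000 output tokens. The helper name `output_share` is hypothetical.

```python
def output_share(input_tokens, output_tokens, input_price, output_price):
    """Fraction of a request's cost that comes from output tokens.
    Prices are USD per 1M tokens; the per-million factor cancels out."""
    input_cost = input_tokens * input_price
    output_cost = output_tokens * output_price
    return output_cost / (input_cost + output_cost)


# Assumed mix: 10,000 input + 2,000 output tokens.
claude_share = output_share(10_000, 2_000, 3.00, 15.00)
gemini_share = output_share(10_000, 2_000, 0.30, 2.50)

print(f"Claude 4.5 Sonnet: {claude_share:.1%} of cost is output")
print(f"Gemini 2.5 Flash:  {gemini_share:.1%} of cost is output")
```

With this mix, output accounts for half of the Claude 4.5 Sonnet cost and a little over 60% of the Gemini 2.5 Flash cost, even though output is only a sixth of the tokens. Changing the two token arguments shows how the share shifts for other workloads.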

Side by side

Metric	Claude 4.5 Sonnet	Gemini 2.5 Flash
Input / 1M tokens	$3	$0.30
Output / 1M tokens	$15	$2.50
Context window (tokens)	200,000	1,000,000
Token-count drift (typical)	<2%	~3%
Cost per request (10,000 input + 2,000 output tokens)	$0.06	$0.008

What a real request costs

Take a representative turn: 10,000 input + 2,000 output tokens. Claude 4.5 Sonnet comes to $0.06 ($0.03 input + $0.03 output); Gemini 2.5 Flash comes to $0.008 ($0.003 input + $0.005 output). That is a difference of $0.052 per request, so across 100,000 requests it's a $5,200 swing in favour of Gemini 2.5 Flash. To run the numbers on your actual prompt, paste it into the calculator and toggle Compare across all models.
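The same arithmetic can be reproduced in a few lines of Python. This is a minimal sketch, not the calculator's actual implementation: the `PRICES` table and the `request_cost` helper are hypothetical names, and the prices are the per-million figures quoted on this page. `Decimal` is used so the dollar amounts stay exact.

```python
from decimal import Decimal

# USD per 1M tokens, as quoted on this page.
PRICES = {
    "Claude 4.5 Sonnet": {"input": Decimal("3"), "output": Decimal("15")},
    "Gemini 2.5 Flash": {"input": Decimal("0.30"), "output": Decimal("2.50")},
}
MILLION = Decimal(1_000_000)


def request_cost(model, input_tokens, output_tokens):
    """Cost in USD of one request for the given model and token counts."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / MILLION


claude = request_cost("Claude 4.5 Sonnet", 10_000, 2_000)
gemini = request_cost("Gemini 2.5 Flash", 10_000, 2_000)

saving_per_request = claude - gemini
saving_per_100k = saving_per_request * 100_000
saving_percent = saving_per_request / claude * 100

print(f"Claude 4.5 Sonnet: ${claude}")
print(f"Gemini 2.5 Flash:  ${gemini}")
print(f"Saved per 100,000 requests: ${saving_per_100k:,.0f} ({saving_percent:.0f}% less)")
```

Swapping in your own token counts for `10_000` and `2_000` gives a rough estimate for a different workload, subject to the token-count drift noted in the FAQ.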

Which should you pick?

These are different vendors, so a switch means a different API and a slightly different tokenizer. Token counts are exact only for OpenAI models; for both of these models they land within a few percent, so budget a small calibration buffer when estimating spend. See the full breakdown on the dedicated pages for Claude 4.5 Sonnet and Gemini 2.5 Flash.
