Claude 4.5 Sonnet token & cost calculator

Anthropic positions Claude 4.5 Sonnet as the default workhorse model in the Claude 4.5 family — capable enough to drive production assistants, agentic tool use, and long-document reasoning, while sitting at a price point that's tractable for high-volume workloads. Most teams shipping AI features will spend most of their token budget here, with Haiku underneath for cheap classification and Opus reserved for the small share of requests that genuinely need the strongest reasoning available.

This page tokenizes whatever you paste below and multiplies by Sonnet's published per-million pricing so you can size a feature, a job, or a single prompt before you ever hit the API. Nothing about your input leaves the browser.

Verify privacy

These counters run from the moment the page loads and update live.

Prompt uploads: 0. Always 0, by design.
Outgoing requests: analytics and page assets only, no prompt content.
Cookies on this origin: Vercel Analytics and Clarity may set first-party cookies.
localStorage keys: theme preference and saved scenarios live here.
Server endpoints: 1. /api/og only; it accepts title and subtitle, never prompt text.

Inspect

Open DevTools → Network. Type into the calculator. No request bodies should contain your prompt text.

Pricing

Sonnet is flat-priced: there is no tiered surcharge above a context threshold. The date below is when we last verified these rates against the published pricing.

Tier	Input $/M	Output $/M
All input	$3	$15
Context window	200,000 tokens

Verified against www.anthropic.com on 2026-05-09.

Worked examples

Below are three concrete scenarios at Sonnet's current per-million rates. The calculator above uses the same underlying math; these are starting points for budget conversations.

Scenario	Description	Input tokens	Output tokens	Cost
Short chat turn	A typical Q&A turn with a small system prompt.	800	400	<$0.01
System prompt + tool spec	A larger context window with a tool schema, single response.	5,000	500	$0.022
Long document Q&A	A long-form input (e.g. transcript) with a structured response.	50,000	1,500	$0.173
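
The arithmetic behind these scenarios can be sketched in a few lines of Python. This is an illustration, not the calculator's actual source: the function name `estimate_cost` and the constant names are assumptions introduced here, while the $3 and $15 rates are the ones quoted on this page.

```python
# Rates quoted on this page, in USD per million tokens.
INPUT_USD_PER_MTOK = 3.00
OUTPUT_USD_PER_MTOK = 15.00


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one request at flat per-million rates.

    Hypothetical helper for illustration; not taken from the calculator's code.
    """
    input_cost = input_tokens * INPUT_USD_PER_MTOK / 1_000_000
    output_cost = output_tokens * OUTPUT_USD_PER_MTOK / 1_000_000
    return input_cost + output_cost


# The three scenarios from the table: (label, input tokens, output tokens).
scenarios = [
    ("Short chat turn", 800, 400),
    ("System prompt + tool spec", 5_000, 500),
    ("Long document Q&A", 50_000, 1_500),
]

for label, n_in, n_out in scenarios:
    print(f"{label}: ${estimate_cost(n_in, n_out):.4f}")
```

Before rounding for display, the three totals come to $0.0084, $0.0225, and $0.1725.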

A few patterns worth internalizing. First, the input/output ratio matters more than people expect: at $3 input vs. $15 output per million, a chat product whose typical turn is 800 input + 400 output spends about 29% of its money ($0.0024 of $0.0084) on tokens it didn't even generate. Second, system prompts are paid for on every request: a 5,000-token system prompt at 100,000 requests per day is 500 million input tokens, or $1,500/day in input cost alone. Cache or prompt-compress aggressively. Third, output tokens cost five times as much as input tokens, so clamping max_tokens is a surprisingly effective cost control. In the long-document example, though, input still accounts for $0.15 of the $0.173 total, so trimming the document matters at least as much as capping the response.
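
The first two patterns are easy to check numerically. The sketch below assumes the flat rates quoted on this page; the helper names `input_share` and `daily_system_prompt_cost` are hypothetical and exist only for this illustration.

```python
# Rates quoted on this page, in USD per million tokens.
INPUT_USD_PER_MTOK = 3.00
OUTPUT_USD_PER_MTOK = 15.00


def input_share(input_tokens: int, output_tokens: int) -> float:
    """Fraction of a request's cost that goes to input tokens."""
    input_cost = input_tokens * INPUT_USD_PER_MTOK / 1_000_000
    output_cost = output_tokens * OUTPUT_USD_PER_MTOK / 1_000_000
    return input_cost / (input_cost + output_cost)


def daily_system_prompt_cost(system_prompt_tokens: int, requests_per_day: int) -> float:
    """USD per day spent re-sending the same system prompt, with no caching."""
    total_tokens = system_prompt_tokens * requests_per_day
    return total_tokens * INPUT_USD_PER_MTOK / 1_000_000


# Pattern 1: share of spend on input for an 800-in / 400-out chat turn.
print(f"Input share: {input_share(800, 400):.1%}")

# Pattern 2: a 5,000-token system prompt sent 100,000 times a day.
print(f"System prompt: ${daily_system_prompt_cost(5_000, 100_000):,.2f}/day")
```

Neither helper models prompt caching or batch discounts, so both figures are upper-end estimates for workloads that use those features.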

How is this counted?

We approximate Sonnet's tokenizer with cl100k_base (via the MIT-licensed gpt-tokenizer package), which empirically tracks Claude 4.x within ~2% on English prose and source code. Anthropic does not publish a current client-side Claude tokenizer, so a perfect match isn't available off-the-shelf — but cl100k is closer than any other public encoding. Inputs longer than 50,000 characters are tokenized in a Web Worker so the page stays responsive while you scroll a long prompt.

FAQ

Is the token count exact?

No. Anthropic does not publish a current Claude 4.x client tokenizer, so we approximate with cl100k_base (gpt-tokenizer). For typical English prose and code the count is within ~2% of the vendor count. For pathological input (long Unicode runs, repeated rare bytes) drift can be larger; treat the result as a budgeting estimate, not a billing oracle.
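
One way to treat the count as a budgeting estimate is to carry the ~2% drift through to the cost. The sketch below is an assumption-laden illustration: `input_cost_band` is a hypothetical helper, the 2% default comes from the figure quoted above, and pathological inputs can fall outside the band.

```python
# Input rate quoted on this page, in USD per million tokens.
INPUT_USD_PER_MTOK = 3.00


def input_cost_band(estimated_tokens: int, drift: float = 0.02) -> tuple[float, float]:
    """Return (low, high) USD input cost if the true count is within +/- drift.

    `drift` defaults to the ~2% tolerance quoted for English prose and code.
    """
    low_tokens = estimated_tokens * (1 - drift)
    high_tokens = estimated_tokens * (1 + drift)
    return (
        low_tokens * INPUT_USD_PER_MTOK / 1_000_000,
        high_tokens * INPUT_USD_PER_MTOK / 1_000_000,
    )


low, high = input_cost_band(50_000)
print(f"Input cost for ~50,000 tokens: ${low:.3f} to ${high:.3f}")
```

For inputs with long Unicode runs or repeated rare bytes, passing a larger `drift` gives a more cautious band.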

How is output cost computed?

You set an expected response length (default 1,024 tokens). The result card multiplies that by the published per-million output rate. The actual response length will vary, but Sonnet never returns more than the max_tokens you pass to the API, so your real cost ceiling is set by your client config, not by the model.
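
The relationship between the expected output cost and the ceiling set by max_tokens can be sketched as below. The helper name `output_cost` and the example `max_tokens` value of 4,096 are assumptions for illustration; the $15 rate and the 1,024-token default are taken from this page.

```python
# Output rate quoted on this page, in USD per million tokens.
OUTPUT_USD_PER_MTOK = 15.00


def output_cost(output_tokens: int) -> float:
    """USD cost of generating `output_tokens` tokens at the flat output rate."""
    return output_tokens * OUTPUT_USD_PER_MTOK / 1_000_000


expected_tokens = 1_024   # the calculator's default response length
max_tokens = 4_096        # hypothetical client-side cap passed to the API

expected = output_cost(expected_tokens)
ceiling = output_cost(max_tokens)

print(f"Expected output cost: ${expected:.5f}")
print(f"Ceiling at max_tokens={max_tokens}: ${ceiling:.5f}")
```

Lowering `max_tokens` lowers the ceiling directly, which is why it works as a per-request cost control.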

Does my prompt leave the browser?

No. Tokenization runs in JavaScript on the page (in a Web Worker for inputs over 50,000 characters). There is no server route that ever receives prompt text. The only serverless function on the site is /api/og for social preview images, and it only accepts title and subtitle query strings.

What context window does Claude 4.5 Sonnet support?

The published context window is 200,000 tokens. The calculator warns you when input alone would exceed this — Anthropic will reject the request before the model runs.
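
A minimal sketch of that check, assuming the 200,000-token window quoted above. The function names are hypothetical and this is not the calculator's actual code.

```python
# Context window quoted on this page for Claude 4.5 Sonnet.
CONTEXT_WINDOW_TOKENS = 200_000


def context_fraction(input_tokens: int) -> float:
    """Fraction of the context window consumed by the input alone."""
    return input_tokens / CONTEXT_WINDOW_TOKENS


def exceeds_context(input_tokens: int) -> bool:
    """True when the input alone is larger than the context window."""
    return input_tokens > CONTEXT_WINDOW_TOKENS


for tokens in (50_000, 200_000, 200_001):
    status = "over limit" if exceeds_context(tokens) else "fits"
    print(f"{tokens:,} tokens: {context_fraction(tokens):.1%} of window, {status}")
```

Because the token count is itself an approximation, an input that lands within a few percent of the limit may still be rejected; leaving some headroom is the safer choice.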

Why does my count differ slightly from the Anthropic console?

The Anthropic console uses the live, internal Claude 4.5 tokenizer; we approximate with cl100k_base. The two agree to within ~2% for natural language and source code. If you need exact counts for billing reconciliation, read the usage field (input_tokens and output_tokens) in the Anthropic API response body. For estimating spend before a request, this calculator's figure is close enough to the console's to budget with.
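
For reconciliation after the fact, the Messages API response body includes a `usage` object with `input_tokens` and `output_tokens`. The sketch below shows one way to compare an estimate against those counts. The helper names and the sample values are hypothetical, and the cost function deliberately ignores prompt-caching fields, which are billed at different rates.

```python
# Rates quoted on this page, in USD per million tokens.
INPUT_USD_PER_MTOK = 3.00
OUTPUT_USD_PER_MTOK = 15.00


def billed_cost(usage: dict) -> float:
    """USD cost implied by a `usage` object (caching fields ignored)."""
    return (
        usage["input_tokens"] * INPUT_USD_PER_MTOK
        + usage["output_tokens"] * OUTPUT_USD_PER_MTOK
    ) / 1_000_000


def estimate_drift(estimated_input_tokens: int, usage: dict) -> float:
    """Relative error of the estimated input count against the reported count."""
    actual = usage["input_tokens"]
    return (estimated_input_tokens - actual) / actual


# Hypothetical values, shaped like the `usage` field of a real response.
sample_usage = {"input_tokens": 5_080, "output_tokens": 512}

print(f"Billed: ${billed_cost(sample_usage):.5f}")
print(f"Estimate drift: {estimate_drift(5_000, sample_usage):+.2%}")
```

Logging the drift across real traffic is a simple way to see whether the ~2% figure holds for your own prompts.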

Compare against every other model

To see this exact prompt scored against every supported model, sorted by total cost, paste it into the home calculator and toggle Compare across all models. Numbers are exact for OpenAI and within ±2–3% for Claude and Gemini.

If Sonnet's price profile doesn't fit your workload, the closest alternatives are below. Haiku is the budget pick when latency or volume dominates; Opus is the premium pick when the task genuinely benefits from stronger reasoning; the Gemini 2.5 family is the cross-vendor comparison set, with very different pricing geometry on long context.