Claude 4.5 Sonnet token & cost calculator
Anthropic positions Claude 4.5 Sonnet as the default workhorse model in the Claude 4.5 family — capable enough to drive production assistants, agentic tool use, and long-document reasoning, while sitting at a price point that's tractable for high-volume workloads. Most teams shipping AI features will spend most of their token budget here, with Haiku underneath for cheap classification and Opus reserved for the small share of requests that genuinely need the strongest reasoning available.
This page tokenizes whatever you paste below and multiplies by Sonnet's published per-million pricing so you can size a feature, a job, or a single prompt before you ever hit the API. Nothing about your input leaves the browser.
Saved scenarios
Saved on this browser only — never uploaded. Up to 10 scenarios.
Tip: save a scenario when you have a prompt + model + response length you might revisit. Useful for sizing features before committing to a vendor.
Verify privacy
Open DevTools → Network. Type into the calculator. No request bodies should contain your prompt text.
Pricing
Sonnet is flat-priced: there is no tiered surcharge above a context threshold. The date below the table is when we last verified these rates against the published pricing.
| Tier | Input ($ per million tokens) | Output ($ per million tokens) |
|---|---|---|
| All input | $3 | $15 |

Context window: 200,000 tokens
Verified against www.anthropic.com on 2026-05-09.
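Because the pricing is flat, the cost of a single request reduces to one linear formula. As a sketch, with $N_{\text{in}}$ and $N_{\text{out}}$ denoting the input and output token counts (symbols introduced here for illustration, not taken from the published pricing):

$$
\text{cost (USD)} = \frac{3\,N_{\text{in}} + 15\,N_{\text{out}}}{10^{6}}
$$

For instance, $N_{\text{in}} = 800$ and $N_{\text{out}} = 400$ gives $(2400 + 6000) / 10^{6} = 0.0084$, a little under one cent.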
Worked examples
Below are three concrete scenarios at Sonnet's current per-million rates. The calculator above uses the same underlying math; these are starting points for budget conversations.
| Scenario | Input tokens | Output tokens | Cost |
|---|---|---|---|
| Short chat turn: a typical Q&A turn with a small system prompt. | 800 | 400 | <$0.01 |
| System prompt + tool spec: a larger context window with a tool schema, single response. | 5,000 | 500 | $0.022 |
| Long document Q&A: a long-form input (e.g. transcript) with a structured response. | 50,000 | 1,500 | $0.173 |
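
The scenarios above can be reproduced with a few lines of arithmetic. This is a minimal sketch, not the calculator's actual source: the function name `estimate_cost` and the constant names are assumptions made for illustration, and the rates are the $3 / $15 per million tokens from the pricing table.

```python
# Rates from the pricing table on this page (USD per million tokens).
INPUT_USD_PER_M = 3.00
OUTPUT_USD_PER_M = 15.00


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request (hypothetical helper)."""
    input_cost = input_tokens * INPUT_USD_PER_M / 1_000_000
    output_cost = output_tokens * OUTPUT_USD_PER_M / 1_000_000
    return input_cost + output_cost


# (input tokens, output tokens) for the three worked examples.
scenarios = {
    "Short chat turn": (800, 400),
    "System prompt + tool spec": (5_000, 500),
    "Long document Q&A": (50_000, 1_500),
}

for name, (n_in, n_out) in scenarios.items():
    print(f"{name}: ${estimate_cost(n_in, n_out):.4f}")
```

The unrounded totals are roughly $0.0084, $0.0225, and $0.1725, consistent with the rounded figures in the table.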
A few patterns are worth internalizing. First, the input/output ratio matters more than people expect: at $3 input vs. $15 output per million, a chat product whose typical turn is 800 input + 400 output tokens spends about 29% of its money ($0.0024 of $0.0084) on tokens it didn't even generate. Second, system prompts are paid for on every request: a 5,000-token system prompt at 100,000 requests per day is 500 million input tokens, or $1,500/day in input cost alone. Cache or prompt-compress aggressively. Third, in long-document Q&A the input dominates the bill ($0.15 of $0.173 in the example above), so trimming or caching the document is the biggest lever; each output token still costs five times as much as an input token, which makes clamping max_tokens a surprisingly effective way to bound the cost of every response.
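
Each of these three patterns is a one-line calculation. The sketch below works through them at the rates from the pricing table; the helper names (`input_share`, `daily_system_prompt_cost`, `max_output_cost`) are hypothetical, and the system-prompt figure assumes no prompt caching is applied.

```python
INPUT_USD_PER_M = 3.00    # from the pricing table on this page
OUTPUT_USD_PER_M = 15.00  # from the pricing table on this page


def input_share(input_tokens: int, output_tokens: int) -> float:
    """Fraction of one request's cost that is spent on input tokens."""
    input_cost = input_tokens * INPUT_USD_PER_M
    output_cost = output_tokens * OUTPUT_USD_PER_M
    return input_cost / (input_cost + output_cost)


def daily_system_prompt_cost(prompt_tokens: int, requests_per_day: int) -> float:
    """USD per day spent re-sending the same system prompt (no caching assumed)."""
    return prompt_tokens * requests_per_day * INPUT_USD_PER_M / 1_000_000


def max_output_cost(max_tokens: int) -> float:
    """Upper bound on one response's output cost when max_tokens is capped."""
    return max_tokens * OUTPUT_USD_PER_M / 1_000_000


print(f"Input share of an 800/400 turn: {input_share(800, 400):.1%}")
print(f"5,000-token prompt x 100,000 requests: ${daily_system_prompt_cost(5_000, 100_000):,.2f}/day")
print(f"Worst-case output cost at max_tokens=1500: ${max_output_cost(1_500):.4f}")
```

Plugging in a product's own token counts and request volume is usually enough to see which of the three levers matters most for it.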
How is this counted?
We approximate Sonnet's tokenizer with cl100k_base (via the MIT-licensed gpt-tokenizer package), which empirically tracks Claude 4.x within ~2% on English prose and source code. Anthropic does not publish a current client-side Claude tokenizer, so a perfect match isn't available off-the-shelf — but cl100k is closer than any other public encoding. Inputs longer than 50,000 characters are tokenized in a Web Worker so the page stays responsive while you scroll a long prompt.
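
Since the count is an approximation, it can help to carry the tolerance through to the cost estimate. The sketch below assumes the ~2% figure quoted above applies symmetrically to the input count; `cost_band` is a hypothetical helper, not part of gpt-tokenizer or of this page's source.

```python
INPUT_USD_PER_M = 3.00  # input rate from the pricing table on this page


def cost_band(approx_tokens: int, tolerance: float = 0.02) -> tuple[float, float]:
    """Return (low, high) USD input cost, assuming the true token count lies
    within +/- tolerance of the approximate count (symmetry is an assumption)."""
    low_tokens = approx_tokens * (1 - tolerance)
    high_tokens = approx_tokens * (1 + tolerance)
    return (
        low_tokens * INPUT_USD_PER_M / 1_000_000,
        high_tokens * INPUT_USD_PER_M / 1_000_000,
    )


low, high = cost_band(50_000)
print(f"Input cost for ~50,000 tokens: ${low:.4f} to ${high:.4f}")
```

At these rates a 2% counting error on a 50,000-token input moves the estimate by about three tenths of a cent in either direction, which is small next to the uncertainty in how long the response will be.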
FAQ
Is the token count exact?
How is output cost computed?
Does my prompt leave the browser?
What context window does Claude 4.5 Sonnet support?
Why does my count differ slightly from the Anthropic console?
Compare against every other model
To see this exact prompt scored against every supported model, sorted by total cost, paste it into the home calculator and toggle Compare across all models. Numbers are exact for OpenAI and within ±2–3% for Claude and Gemini.
Related models
If Sonnet's price profile doesn't fit your workload, the closest alternatives are below. Haiku is the budget pick when latency or volume dominates; Opus is the premium pick when the task genuinely benefits from stronger reasoning; the Gemini 2.5 family is the cross-vendor comparison set, with very different pricing geometry on long context.