Opus 4.8 vs GPT-5.5 vs Gemini 3.1 Pro
The frontier three, priced side by side.
Independent LLM cost intelligence · since 2026
LLMCalculators
Today's lead · interactive
Drag the sliders to size a request, then compare the cost across 27 models from Claude, GPT, Gemini, Grok, DeepSeek, Mistral, Amazon Nova, Cohere, and Meta Llama — per request and per month, with Batch API and prompt-caching discounts.
Set the sliders to a typical request: input tokens (your prompt plus any context), output tokens (the reply), and how many requests you send per month. The table ranks every model by cost per request and per month. Turn on caching if you reuse a prompt prefix, or Batch for async jobs. Use the results to pick the cheapest model that clears your quality bar, and to see how much a shorter output saves.
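The arithmetic behind the table can be sketched in a few lines. This is a minimal illustration, not the calculator's actual code: the model names and per-1M-token prices below are hypothetical placeholders, and the Batch discount is a parameter you would set from your provider's price sheet.

```python
# Hypothetical prices in USD per 1M tokens: (input, output). Not real quotes.
PRICES = {
    "budget-model": (0.50, 2.00),
    "frontier-model": (5.00, 20.00),
}

def request_cost(input_tokens, output_tokens, input_price, output_price,
                 batch_discount=0.0):
    """Cost in USD of one request; prices are per 1M tokens.

    batch_discount is an assumed fractional discount (0.0 means none).
    """
    cost = (input_tokens * input_price + output_tokens * output_price) / 1_000_000
    return cost * (1.0 - batch_discount)

def rank_models(input_tokens, output_tokens, requests_per_month):
    """Return (model, cost per request, cost per month), cheapest first."""
    rows = []
    for name, (p_in, p_out) in PRICES.items():
        per_request = request_cost(input_tokens, output_tokens, p_in, p_out)
        rows.append((name, per_request, per_request * requests_per_month))
    return sorted(rows, key=lambda row: row[1])

if __name__ == "__main__":
    for name, per_request, per_month in rank_models(2_000, 500, 10_000):
        print(f"{name}: ${per_request:.4f}/request, ${per_month:.2f}/month")
```

Because output tokens are priced separately, halving `output_tokens` in the call shows directly how much a shorter reply saves.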
Live from Hacker News and Mastodon — recent, popular posts about the latest models. Links go straight to the source.
Real ways the newest models are being used. Pick one to price it or find the right model. We add to this regularly.
The frontier three, priced side by side.
The workhorse tier, where most production traffic lives.
Two of the cheapest credible options, head to head.
The gap between a budget and a frontier model is often 10x. Use the smallest one that passes your own test.
Stable context is billed at about a tenth of the normal input price on a cache hit. Keep fixed content first, variable bits last.
Every follow-up re-sends the whole conversation. Ask once, well.
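The caching and follow-up tips above can be made concrete with a rough sketch. It assumes a cache hit is billed at 0.1x the input price (the "about a tenth" figure above), ignores any cache-write surcharge a provider may add, and uses hypothetical function names and token counts.

```python
def input_cost_with_cache(prefix_tokens, variable_tokens, input_price,
                          cache_hit_multiplier=0.1):
    """Input cost in USD when the fixed prefix is served from cache.

    input_price is per 1M tokens. The 0.1 multiplier is an assumption
    taken from the "about a tenth" rule of thumb.
    """
    cached = prefix_tokens * input_price * cache_hit_multiplier
    fresh = variable_tokens * input_price
    return (cached + fresh) / 1_000_000

def conversation_input_tokens(turns):
    """Total input tokens billed across a multi-turn conversation.

    turns is a list of (user_tokens, reply_tokens). Each new turn
    re-sends the full history plus the new user message.
    """
    billed = 0
    history = 0
    for user_tokens, reply_tokens in turns:
        billed += history + user_tokens
        history += user_tokens + reply_tokens
    return billed

if __name__ == "__main__":
    # 9,000-token fixed prefix, 1,000 variable tokens, $3 per 1M input tokens.
    print(input_cost_with_cache(9_000, 1_000, 3.0))
    # Three turns of 100 user tokens and 200 reply tokens each.
    print(conversation_input_tokens([(100, 200)] * 3))
```

In the second call, three turns are billed for 1,200 input tokens even though only 300 tokens of new user text were sent, which is why one well-formed question tends to cost less than several follow-ups.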
Cheapest by blended cost (input + output, per 1M tokens). See how prices have fallen over time on the pricing history page.
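As a small sketch of that ranking, blended cost here is taken literally as the input price plus the output price per 1M tokens. The model names and prices are hypothetical, and the site's real ranking may weight the two differently.

```python
def blended_cost(input_price, output_price):
    """Blended cost per 1M tokens, read as input price + output price."""
    return input_price + output_price

def cheapest(prices, n=3):
    """Return the n cheapest (model, blended cost) pairs, lowest first."""
    ranked = sorted(
        ((name, blended_cost(p_in, p_out)) for name, (p_in, p_out) in prices.items()),
        key=lambda pair: pair[1],
    )
    return ranked[:n]

if __name__ == "__main__":
    # Hypothetical (input, output) prices in USD per 1M tokens.
    sample = {"model-a": (1.0, 4.0), "model-b": (0.25, 1.0), "model-c": (3.0, 15.0)}
    print(cheapest(sample, n=2))
```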
Paste text for a quick token estimate, then push it into the calculator.
Rough heuristic (~4 chars/token). Real tokenization is model-specific — for exact Claude counts use the count_tokens API; OpenAI and Google have their own tokenizers.
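The heuristic itself is a one-liner. This sketch assumes a flat 4 characters per token and rounds up; it is only an estimate and does not call any provider's tokenizer.

```python
import math

CHARS_PER_TOKEN = 4  # rough rule of thumb; real tokenizers vary by model

def estimate_tokens(text):
    """Estimate the token count of text at ~4 characters per token."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)

if __name__ == "__main__":
    print(estimate_tokens("Paste text for a quick token estimate."))
```

Expect the estimate to drift for code, non-English text, or strings heavy in punctuation, where tokens tend to cover fewer characters.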