Gemini 2.0 Flash
Cheap, fast multimodal workhorse with a 1M-token context window.
Specs
- Context window: 1,000,000 tokens
- Max output: 8,192 tokens
- Modalities: text, image, audio, video
- Tool use: ✓
- Vision: ✓
- Streaming: ✓
- License: proprietary
- Released: 2024-12-11
Pricing
- Input / 1M tokens: $0.10
- Output / 1M tokens: $0.40
- Cached input / 1M tokens: $0.03
Cost estimate
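The per-million-token rates listed above translate into a quick per-request estimate. A minimal sketch (the token counts in the example are illustrative, not from this page):

```python
# Rough request-cost estimator from the per-1M-token rates above.
RATES = {
    "input": 0.10,         # $ per 1M input tokens
    "output": 0.40,        # $ per 1M output tokens
    "cached_input": 0.03,  # $ per 1M cached input tokens
}

def estimate_cost(input_tokens, output_tokens, cached_input_tokens=0):
    """Return the estimated dollar cost of one request."""
    return (
        input_tokens * RATES["input"]
        + output_tokens * RATES["output"]
        + cached_input_tokens * RATES["cached_input"]
    ) / 1_000_000

# e.g. a 200k-token document with a 2k-token answer:
print(round(estimate_cost(200_000, 2_000), 4))  # 0.0208
```

At these rates even a near-full 1M-token context costs roughly ten cents of input, which is what makes the long window practical for bulk work.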
Gemini 2.0 Flash is Google's low-latency, low-cost workhorse model. It accepts text, image, audio and video input, supports native tool use and streaming, and offers a 1M-token context window that few competitors match at this price. At $0.10/$0.40 per million tokens it sits well below GPT-4o mini and Claude 3.5 Haiku on cost, making it a default pick for high-volume pipelines, document Q&A and multimodal preprocessing.
Editor's verdict
Pick this when throughput, cost and long context matter more than top-tier reasoning. It beats GPT-4o mini on context length and native video/audio input, and beats Claude 3.5 Haiku on price. The trade-off: on hard reasoning, code and nuanced writing it trails GPT-4o, Claude 3.5 Sonnet and Google's own Gemini 2.5 Pro by a clear margin. Treat it as the cheap muscle in a multi-model setup, not the model you ship hard problems to.
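The multi-model setup the verdict describes can be sketched as a simple router: bulk work goes to the cheap model, hard problems escalate. The model names come from this page; the task-type heuristic is an assumption for illustration:

```python
# Illustrative router for a multi-model pipeline: default to the cheap
# workhorse, escalate task types that need top-tier reasoning.
# The set of "hard" task types is an assumption, not from this page.
CHEAP_MODEL = "gemini-2.0-flash"
STRONG_MODEL = "gemini-2.5-pro"

HARD_TASKS = {"reasoning", "code", "writing"}

def route(task_type: str) -> str:
    """Pick a model name for a task type."""
    return STRONG_MODEL if task_type in HARD_TASKS else CHEAP_MODEL

print(route("summarization"))  # gemini-2.0-flash
print(route("code"))           # gemini-2.5-pro
```

In practice the routing signal might be a classifier or an explicit flag rather than a fixed task-type set, but the shape is the same: cheap by default, strong on demand.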
Reviews
No reviews yet. Be the first.
Last updated: 2026-04-29