MODELS
DeepSeek V3
Open-weight 671B MoE that matches GPT-4o on most tasks at a fraction of the price.
Specs
- Context window: 128,000 tokens
- Max output: 8,192 tokens
- Modalities: text
- Tool use: ✓
- Vision: —
- Streaming: ✓ (example below)
- License: MIT
- Released: 2024-12-26
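The streaming and tool-use flags above translate directly into an ordinary chat-completion call. The sketch below assumes DeepSeek's hosted API is OpenAI-compatible and uses the openai Python SDK; the base URL, model id, and DEEPSEEK_API_KEY environment variable are assumptions to confirm against the provider's docs.

```python
import os
from openai import OpenAI

# Minimal streaming chat call against the hosted API.
# Base URL and model id are assumptions; check DeepSeek's docs.
client = OpenAI(
    api_key=os.environ["DEEPSEEK_API_KEY"],
    base_url="https://api.deepseek.com",
)

stream = client.chat.completions.create(
    model="deepseek-chat",  # assumed id for DeepSeek V3
    messages=[{"role": "user",
               "content": "Summarize the MoE architecture in two sentences."}],
    max_tokens=512,          # well under the 8,192-token output cap
    stream=True,             # the spec list marks streaming as supported
)

# Print tokens as they arrive.
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
print()
```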
Pricing
- Input / 1M tokens: $0.27
- Output / 1M tokens: $1.10
- Cached input / 1M tokens: $0.07
Overview
DeepSeek V3 is a 671B-parameter Mixture-of-Experts model (37B parameters active per token) released under the MIT license. It handles general chat, coding, math, and tool use with a 128K-token context window, and delivers GPT-4o-class quality on most English and Chinese benchmarks. The hosted API is unusually cheap at $0.27 per million input tokens and $1.10 per million output tokens, and the weights are downloadable for self-hosting.
Cost estimate
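As a rough sketch of what the listed rates mean in practice, the snippet below prices a hypothetical workload. The request volume, token counts, and cache-hit rate are assumptions, not measurements; swap in your own traffic figures.

```python
# Rough daily/monthly cost at the listed rates, for a hypothetical workload
# of 2,000 requests/day averaging 3,000 input and 800 output tokens each.
INPUT_PER_M = 0.27    # USD per 1M input tokens (cache miss)
CACHED_PER_M = 0.07   # USD per 1M cached input tokens
OUTPUT_PER_M = 1.10   # USD per 1M output tokens

requests_per_day = 2_000
input_tokens = 3_000
output_tokens = 800
cache_hit_rate = 0.5  # assumed fraction of input tokens served from cache

daily_input = requests_per_day * input_tokens
daily_output = requests_per_day * output_tokens

daily_cost = (
    daily_input * (1 - cache_hit_rate) * INPUT_PER_M / 1e6
    + daily_input * cache_hit_rate * CACHED_PER_M / 1e6
    + daily_output * OUTPUT_PER_M / 1e6
)
print(f"~${daily_cost:.2f}/day, ~${daily_cost * 30:.2f}/month")
```

Under those assumptions the workload comes to roughly $2.78 a day, or about $83 a month.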
Editor's verdict
Pick V3 when you need GPT-4o-level quality at Haiku-level prices, or when you must self-host weights for compliance reasons. It's not a reasoning model — for hard math or multi-step planning, R1 or o3-mini will outperform it. No vision either, so multimodal apps need to look elsewhere.
Reviews
No reviews yet.
Last updated: 2026-04-29