Company

DeepSeek

A Chinese AI lab from Hangzhou that shocked the industry in early 2025 with DeepSeek-V3 and R1 — frontier-class open-weight models trained at a fraction of typical cost.

DeepSeek is a Chinese AI lab founded in 2023 by Liang Wenfeng as a research-first arm of his quant-trading firm High-Flyer. Based in Hangzhou, it has stayed deliberately small. DeepSeek-V3 (December 2024) and DeepSeek-R1 (January 2025) shook markets worldwide because they matched or approached GPT-4o-level performance with open weights, on a much smaller training budget than US labs were assumed to need.

This matters because DeepSeek showed that frontier capability isn't proprietary to OpenAI, Anthropic, or Google. The R1 reasoning model in particular was open-weight, letting researchers, competitors, and indie developers alike replicate and build on its techniques. The release dropped Nvidia's stock by roughly 17% in a single day, on the implication that you might not need as many GPUs as the bull case assumed.

Key contributions:

- DeepSeek-V2: efficient Mixture-of-Experts (MoE) architecture
- DeepSeek-V3: 671B-parameter MoE, 37B active per token
- DeepSeek-R1: reasoning model rivaling OpenAI's o1, fully open
- Multi-head Latent Attention (MLA): a KV-cache compression technique
- Aggressive open-sourcing of code, papers, and weights

DeepSeek is now the most widely used Chinese open-source LLM and a benchmark that Western labs are forced to compare against.

Related: DeepSeek family, Mixture of Experts, MLA, open-source, Qwen.
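The "671B total, 37B active" split comes from MoE routing: a gating network picks only a few experts per token, so most parameters sit idle on any given forward pass. A minimal, illustrative sketch of top-k expert routing (dimensions, expert count, and the gating scheme here are simplified assumptions, not DeepSeek's actual implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, n_experts, top_k = 16, 8, 2  # toy sizes, not DeepSeek's

# Each "expert" is a tiny feed-forward layer (a single weight matrix here).
experts = [rng.standard_normal((d_model, d_model)) * 0.1 for _ in range(n_experts)]
router = rng.standard_normal((d_model, n_experts)) * 0.1  # gating network

def moe_forward(x):
    """Route a token vector x to its top-k experts and mix their outputs."""
    logits = x @ router
    top = np.argsort(logits)[-top_k:]      # indices of the k highest-scoring experts
    weights = np.exp(logits[top])
    weights /= weights.sum()               # softmax over the chosen k only
    # Only top_k of n_experts are ever computed: the "active parameter" idea.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.standard_normal(d_model)
out = moe_forward(token)
print(out.shape)  # (16,)
```

With 2 of 8 experts firing per token, roughly a quarter of the expert parameters are active per forward pass; scale the same idea up and you get V3's 37B-active-of-671B ratio.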

Last updated: 2026-04-29
