DeepMind released Gemma 4 in April 2026 across four sizes: E2B, E4B, 26B MoE, and 31B Dense — all under Apache 2.0 (fully commercially permissive).
Official rankings: 31B sits at #3 on the LMSYS Arena text leaderboard; 26B at #6. DeepMind's headline claim: "outcompetes models 20x its size."
What's different for developers:
- Native multi-step reasoning and agentic workflow support (function-calling, structured JSON output)
- Native vision + audio processing, no external pipeline needed
- Context window: 128K–256K tokens, depending on model size
- 140+ languages
- Available on Hugging Face, Kaggle, Ollama, Google AI Studio, vLLM, llama.cpp, NVIDIA NIM, LM Studio — local, cloud, and edge deployments all covered
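To make the function-calling and structured-JSON point above concrete, here is a minimal sketch of the receiving side: parsing a model's JSON tool-call output and dispatching it to a registered function. The tool schema shown follows the common OpenAI-style convention; the exact wire format Gemma 4 emits is an assumption here, and `get_weather` is a hypothetical tool, not an official API.

```python
import json

# Hypothetical tool schema in the widely used OpenAI-style format.
# Whether Gemma 4 expects exactly this shape is an assumption.
WEATHER_TOOL = {
    "name": "get_weather",
    "description": "Look up current weather for a city",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

def dispatch_tool_call(raw: str, registry: dict) -> str:
    """Parse a model's structured-JSON tool call and run the matching function."""
    call = json.loads(raw)           # model emits valid JSON, per the feature list
    fn = registry[call["name"]]      # look up the registered tool by name
    return fn(**call["arguments"])   # invoke with the model-supplied arguments

def get_weather(city: str) -> str:
    return f"sunny in {city}"        # stub; a real tool would call a weather API

# Simulated model output: a structured-JSON tool call.
model_output = '{"name": "get_weather", "arguments": {"city": "Zurich"}}'
print(dispatch_tool_call(model_output, {"get_weather": get_weather}))
# → sunny in Zurich
```

The same dispatch loop works regardless of whether the model runs locally via Ollama or llama.cpp or behind a cloud endpoint; only the transport for `model_output` changes.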
Why it matters: DeepMind is no longer just "playing along with open source." It's challenging Llama, Qwen, and Mistral head-on with size-efficiency claims. If the Arena placement holds, small teams fine-tuning open models for their domain start from a meaningfully higher floor, and the worry about vendor lock-in eases somewhat.