DeepMind released Gemma 4 in April 2026 across four sizes: E2B, E4B, 26B MoE, and 31B Dense — all under Apache 2.0 (fully commercially permissive).
Official rankings: 31B sits at #3 on the LMSYS Arena text leaderboard; 26B at #6. DeepMind's headline claim: "outcompetes models 20x its size."
What's different for developers:
- Native multi-step reasoning and agentic workflow support (function-calling, structured JSON output)
- Native vision + audio processing, no external pipeline needed
- Context window: 128K–256K tokens, depending on model size
- 140+ languages
- Available on Hugging Face, Kaggle, Ollama, Google AI Studio, vLLM, llama.cpp, NVIDIA NIM, LM Studio — local, cloud, and edge deployments all covered
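To make the function-calling and structured-JSON point above concrete, here is a minimal sketch of the receiving side: parsing a model's JSON tool-call output and dispatching it to a registered function. The tool schema shown follows the common OpenAI-style convention; the exact wire format Gemma 4 emits is an assumption here, and `get_weather` is a hypothetical tool, not an official API.

```python
import json

# Hypothetical tool schema in the widely used OpenAI-style format.
# Whether Gemma 4 expects exactly this shape is an assumption.
WEATHER_TOOL = {
    "name": "get_weather",
    "description": "Look up current weather for a city",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

def dispatch_tool_call(raw: str, registry: dict) -> str:
    """Parse a model's structured-JSON tool call and run the matching function."""
    call = json.loads(raw)           # model emits valid JSON, per the feature list
    fn = registry[call["name"]]      # look up the registered tool by name
    return fn(**call["arguments"])   # invoke with the model-supplied arguments

def get_weather(city: str) -> str:
    return f"sunny in {city}"        # stub; a real tool would call a weather API

# Simulated model output: a structured-JSON tool call.
model_output = '{"name": "get_weather", "arguments": {"city": "Zurich"}}'
print(dispatch_tool_call(model_output, {"get_weather": get_weather}))
# → sunny in Zurich
```

The same dispatch loop works regardless of whether the model runs locally via Ollama or llama.cpp or behind a cloud endpoint; only the transport for `model_output` changes.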
Why it matters: DeepMind is no longer just "playing along with open source." It's challenging Llama, Qwen, and Mistral head-on with size-efficiency claims. If the Arena placement holds, small teams fine-tuning open models for their domain start from a meaningfully higher floor, and the worry about vendor lock-in eases somewhat.