Mixtral 8x7B
Mistral's classic open-source MoE — 47B total, 13B active per token.
Specs
- Context window: 32,768 tokens
- Max output: 4,096 tokens
- Modalities: text
- Tool use: —
- Vision: —
- Streaming: ✓
- License: apache-2.0
- Released: 2023-12-11
Overview
Mixtral 8x7B (December 2023) is Mistral's first open-source mixture-of-experts model, released under Apache 2.0. Each layer routes every token to 2 of 8 expert feed-forward networks of roughly 7B parameters each, so of the 47B total parameters only about 13B are active per token, giving roughly 13B-class inference cost with quality competitive with much larger dense models. 32K context. It is the model that proved sparse MoE could be open-sourced cleanly. Successors: Mixtral 8x22B and the Mistral Large family.
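To make the 13B-active figure concrete, here is a minimal PyTorch sketch of the top-2 gating pattern described above. The class names, dimensions, and hyperparameters are illustrative assumptions, not Mistral's reference implementation; the point is that only the two experts selected by the router run for each token.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SwiGLUExpert(nn.Module):
    """One expert: a gated feed-forward (SwiGLU) block."""
    def __init__(self, d_model, d_ff):
        super().__init__()
        self.w1 = nn.Linear(d_model, d_ff, bias=False)   # gate projection
        self.w3 = nn.Linear(d_model, d_ff, bias=False)   # up projection
        self.w2 = nn.Linear(d_ff, d_model, bias=False)   # down projection

    def forward(self, x):
        return self.w2(F.silu(self.w1(x)) * self.w3(x))

class SparseMoEBlock(nn.Module):
    """Top-2 routed mixture of 8 experts (toy dimensions). Only the two
    selected experts execute per token, which is why a fraction of the
    total parameters is active for any given forward pass."""
    def __init__(self, d_model=512, d_ff=2048, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(SwiGLUExpert(d_model, d_ff) for _ in range(n_experts))

    def forward(self, x):                               # x: (n_tokens, d_model)
        logits = self.router(x)                         # (n_tokens, n_experts)
        weights, idx = logits.topk(self.top_k, dim=-1)  # pick 2 experts per token
        weights = F.softmax(weights, dim=-1)            # renormalise over the chosen 2
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e                   # tokens whose k-th choice is expert e
                if mask.any():
                    out[mask] += weights[mask, k].unsqueeze(-1) * expert(x[mask])
        return out

# Quick shape check with toy dimensions:
block = SparseMoEBlock()
print(block(torch.randn(4, 512)).shape)  # torch.Size([4, 512])
```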
Editor's verdict
Architecturally important: every later open MoE (DeepSeek V3, Qwen MoE) inherits patterns Mixtral popularised. For new production builds, Llama 3.3 70B or Qwen 2.5 72B beats it on quality at similar serving cost, and DeepSeek V3 clearly outperforms it on Chinese-language tasks. Keep Mixtral on your radar as the canonical Apache-2.0 MoE if licence purity matters; otherwise newer is usually better.
Reviews
No reviews yet.
Last updated: 2026-04-29