
Mixtral 8x7B

Mistral's classic open-source MoE — 47B total, 13B active per token.

mistral · open source

Specs

Context window: 32,768 tokens
Max output: 4,096 tokens
Modalities: text
Tool use:
Vision: no
Streaming:
License: Apache-2.0
Released: 2023-12-11
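
Because the weights are Apache-2.0, they can be pulled straight from the Hugging Face Hub. Below is a minimal sketch using the `transformers` library; the repo id `mistralai/Mixtral-8x7B-Instruct-v0.1`, the float16 dtype, and the device map are assumptions about a typical setup, not part of this page's specs.

```python
# Minimal sketch: run Mixtral 8x7B Instruct via Hugging Face transformers.
# Assumes the open weights at mistralai/Mixtral-8x7B-Instruct-v0.1 and enough
# GPU memory for all 47B parameters in fp16 (~94 GB), or quantise to 4-/8-bit.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mixtral-8x7B-Instruct-v0.1"  # assumed Hub repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # all experts live in memory, even if only 2 run per token
    device_map="auto",          # shard across available GPUs (requires accelerate)
)

messages = [{"role": "user", "content": "Explain what a sparse MoE layer does."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(input_ids, max_new_tokens=256)  # max output listed above is 4,096
print(tokenizer.decode(output[0], skip_special_tokens=True))
```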

Mixtral 8x7B (December 2023) is Mistral's first open-source mixture-of-experts model, released under Apache 2.0. Each layer has eight 7B-class expert feed-forward blocks with shared attention, for 47B total parameters; a router picks two experts per token, so only about 13B parameters are active per token, giving roughly 7B-class inference speed at 30B+-class quality. 32K context. It is the model that proved a sparse MoE could be open-sourced cleanly. Successors: Mixtral 8x22B and the Mistral Large family.
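
To make "only 13B active per token" concrete, here is a toy sketch of Mixtral-style top-2 routing: a gate scores the eight expert FFNs, only the two highest-scoring experts run for each token, and their outputs are mixed by the normalised gate weights. Dimensions, module names, and the plain MLP experts are illustrative assumptions, not the actual Mixtral implementation.

```python
# Toy sketch of a Mixtral-style sparse MoE feed-forward layer (top-2 routing).
# Sizes are illustrative; the point is that only the selected experts compute,
# which is why ~13B of the 47B parameters are active per token.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoELayer(nn.Module):
    def __init__(self, d_model=512, d_ff=2048, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(d_model, n_experts, bias=False)  # router
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                        # x: (tokens, d_model)
        scores = self.gate(x)                    # (tokens, n_experts)
        top_w, top_idx = scores.topk(self.top_k, dim=-1)
        top_w = F.softmax(top_w, dim=-1)         # normalise over the chosen experts only
        out = torch.zeros_like(x)
        for slot in range(self.top_k):           # only the selected experts run
            for e in range(len(self.experts)):
                mask = top_idx[:, slot] == e
                if mask.any():
                    out[mask] += top_w[mask, slot, None] * self.experts[e](x[mask])
        return out

tokens = torch.randn(4, 512)                     # 4 tokens, d_model=512
print(SparseMoELayer()(tokens).shape)            # torch.Size([4, 512])
```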

Editor's verdict

Architecturally important: every later open MoE (DeepSeek V3, Qwen MoE) inherits patterns Mixtral popularised. For new production builds, Llama 3.3 70B or Qwen 2.5 72B beats it on quality at similar serving cost, and DeepSeek V3 destroys it on Chinese-language tasks. Keep Mixtral on your radar as the canonical Apache-2.0 MoE if licence purity matters; otherwise newer is usually better.


Last updated: 2026-04-29
