Yi-Lightning

Cheap, fast Chinese-first chat model from 01.AI, tuned for high-throughput production use.

01-ai, yi

Specs

Context window
16,000 tokens
Max output
4,096 tokens
Modalities
text
Tool use
yes
Vision
no
Streaming
yes
License
proprietary
Released
2024-10-17

Pricing

Input / 1M tokens
$0.14
Output / 1M tokens
$0.14

Cost estimate

Estimated monthly cost: $0.17

Yi-Lightning is 01.AI's flagship low-latency model, priced at $0.14 per million tokens for both input and output. It supports tool calling and streaming but is text-only with a relatively small 16K context window. The model performs well on Chinese benchmarks (it briefly ranked in LMSYS Arena's top tier in late 2024) and is aimed at production chat, classification, and routing workloads where cost-per-call matters more than long-context reasoning.
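Because input and output are billed at the same flat rate, a cost estimate reduces to total tokens times the per-million price. The sketch below shows that arithmetic using the listed $0.14 prices. The function name `estimate_cost` and the workload figures are hypothetical, chosen only for illustration; they are not taken from 01.AI's billing documentation.

```python
from decimal import Decimal

# Listed prices in USD per 1M tokens (from the pricing table on this page).
INPUT_PRICE_PER_M = Decimal("0.14")
OUTPUT_PRICE_PER_M = Decimal("0.14")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated USD cost for the given token volumes.

    Hypothetical helper: assumes simple linear per-token billing with
    no minimums, tiers, or discounts.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) / TOKENS_PER_M * INPUT_PRICE_PER_M
    output_cost = Decimal(output_tokens) / TOKENS_PER_M * OUTPUT_PRICE_PER_M
    return input_cost + output_cost


# Hypothetical monthly workload: 1,000,000 input and 200,000 output tokens.
monthly = estimate_cost(1_000_000, 200_000)
print(f"${monthly:.2f}")
```

`Decimal` is used instead of floats so that small per-call amounts sum without binary rounding drift when aggregated over many requests.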

Editor's verdict

Pick Yi-Lightning if you need a cheap, fast Chinese-capable model for high-volume tasks and Qwen or DeepSeek don't fit your stack. The 16K context is the real limitation — it rules out long-document RAG and longer agent traces, where Qwen-Plus or DeepSeek-V3 give you far more headroom at similar prices. No vision and no open weights either, so it's a narrow but legitimate choice for short-form Chinese inference at scale.
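To see how quickly the 16K window becomes a constraint, the sketch below computes how many reply tokens remain for a given prompt size. It assumes the prompt and the reply share the 16,000-token context window, which is common for chat models but is not confirmed on this page. The function name `output_budget` is hypothetical.

```python
CONTEXT_WINDOW = 16_000  # tokens, from the spec sheet on this page
MAX_OUTPUT = 4_096       # tokens, from the spec sheet on this page


def output_budget(prompt_tokens: int) -> int:
    """Return how many tokens remain for the reply.

    Assumption: input and output share one context window, so the
    reply is capped by both MAX_OUTPUT and the space left after the
    prompt.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    remaining = CONTEXT_WINDOW - prompt_tokens
    return max(0, min(MAX_OUTPUT, remaining))


print(output_budget(2_000))   # short chat prompt: full output cap available
print(output_budget(14_000))  # long RAG prompt: reply is squeezed
print(output_budget(20_000))  # prompt alone exceeds the window
```

Under this assumption, any prompt longer than 11,904 tokens starts eating into the maximum reply length, which is why long-document RAG and extended agent traces are a poor fit.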


Last updated: 2026-04-29
