DICTIONARY
Chinese AI dictionary
Plain-language Chinese explanations of AI technical terms such as transformer, RAG, agent, fine-tuning, context window, and prompt — covering architecture, techniques, metrics, companies, people, model families, and tasks.
BLEU
Metric
An automatic metric for machine translation quality, comparing n-gram overlap between the model output and one or more reference translations.
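A minimal sentence-level sketch of how BLEU combines clipped n-gram precisions with a brevity penalty (this is an illustrative re-implementation, not the exact smoothed variant used by standard toolkits):

```python
import math
from collections import Counter

def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU sketch: geometric mean of modified (clipped)
    n-gram precisions, multiplied by a brevity penalty.
    candidate/reference are lists of tokens."""
    precisions = []
    for n in range(1, max_n + 1):
        cand_ngrams = Counter(tuple(candidate[i:i + n])
                              for i in range(len(candidate) - n + 1))
        ref_ngrams = Counter(tuple(reference[i:i + n])
                             for i in range(len(reference) - n + 1))
        # Clip each candidate n-gram count by its count in the reference,
        # so repeating a reference word does not inflate precision.
        overlap = sum(min(c, ref_ngrams[g]) for g, c in cand_ngrams.items())
        precisions.append(overlap / max(sum(cand_ngrams.values()), 1))
    if min(precisions) == 0:
        return 0.0  # real toolkits smooth this case instead
    # Brevity penalty punishes candidates shorter than the reference.
    bp = min(1.0, math.exp(1 - len(reference) / len(candidate)))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)

cand = "the cat sat on the mat".split()
ref = "the cat is on the mat".split()
score = bleu(cand, ref, max_n=2)  # between 0 and 1; higher is better
```

In practice BLEU is computed over a whole corpus with multiple references and smoothing; libraries such as sacreBLEU handle those details.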
C-Eval
Metric
A Chinese-language counterpart to MMLU — about 14,000 multiple-choice questions across 52 subjects in Chinese, covering everything from middle school to professional certification level.
CMMLU
Metric
Another Chinese MMLU-style benchmark covering 67 subjects with about 12,000 multiple-choice questions, with stronger coverage of China-specific knowledge than C-Eval.
HumanEval
Metric
OpenAI's coding benchmark of 164 hand-written Python problems; models are scored by whether their generated code passes the accompanying unit tests, reported as pass@k.
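The pass@k number is computed with the unbiased estimator from the HumanEval paper: draw n samples per problem, count the c that pass, and estimate the chance that at least one of k random samples passes. A short sketch:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: 1 - C(n-c, k) / C(n, k),
    where n samples were generated and c of them passed the tests."""
    if n - c < k:
        return 1.0  # every size-k subset must contain a passing sample
    return 1.0 - comb(n - c, k) / comb(n, k)

# With 10 samples of which 3 pass, pass@1 reduces to c/n = 0.3.
p1 = pass_at_k(10, 3, 1)
```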
MMLU (Massive Multitask Language Understanding)
Metric
A widely-cited benchmark of 57 multiple-choice subjects (high-school to professional level) used to measure an LLM's broad knowledge — accuracy in % is the headline number.
Perplexity
Metric
A metric measuring how surprised a language model is by the actual next tokens — lower is better. Formally, it is the exponentiated average negative log-likelihood per token.
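The definition translates directly into code. A minimal sketch, given the model's log-probability for each actual next token:

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp(average negative log-likelihood).
    token_logprobs: the model's natural-log probability assigned
    to each actual next token in the sequence."""
    nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(nll)

# A model that assigns probability 0.25 to every actual token
# has perplexity 4 — "as surprised as" a uniform 4-way choice.
ppl = perplexity([math.log(0.25)] * 10)
```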
ROUGE
Metric
A family of metrics for summarization quality based on n-gram overlap between generated summary and human reference — ROUGE-1, ROUGE-2, and ROUGE-L are the common variants.
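A minimal sketch of the ROUGE-N idea — recall-oriented n-gram overlap against the reference (real implementations also report precision and F1, and ROUGE-L uses the longest common subsequence instead of fixed n-grams):

```python
from collections import Counter

def rouge_n_recall(candidate, reference, n=1):
    """ROUGE-N recall: fraction of the reference's n-grams that also
    appear in the candidate (counts clipped).
    candidate/reference are lists of tokens."""
    cand = Counter(tuple(candidate[i:i + n])
                   for i in range(len(candidate) - n + 1))
    ref = Counter(tuple(reference[i:i + n])
                  for i in range(len(reference) - n + 1))
    overlap = sum(min(c, cand[g]) for g, c in ref.items())
    return overlap / max(sum(ref.values()), 1)

# 3 of the reference's 4 unigrams appear in the candidate -> 0.75
r1 = rouge_n_recall("the cat sat".split(), "the cat sat down".split(), n=1)
```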
SuperCLUE
Metric
A comprehensive Chinese LLM benchmark suite covering reasoning, knowledge, language, code, and safety — published as a regularly-updated leaderboard.