Chinese AI dictionary
Plain-language explanations of transformer, RAG, agent, fine-tuning, context window, prompt, and other AI technical terms, covering architectures, techniques, metrics, companies, people, model families, and tasks.
01.AI (零一萬物)
Company
An AI startup founded by Kai-Fu Lee in 2023, builder of the Yi (易) open-source bilingual model family — based in Beijing with a strong technical team and high profile.
AGI (Artificial General Intelligence)
Misc
A hypothetical AI system that matches or exceeds human capability across the full range of cognitive tasks — not just narrow domains. There's no agreed definition or test.
AI alignment
Technique
The research field and engineering work focused on making AI systems pursue the goals and values their human users actually want — not their literal instructions or proxy metrics.
Alibaba (Qwen)
Company
Alibaba's AI division and the Qwen (通義千問) open-source model family — currently the most influential Chinese open-source LLM platform, with rapid release cadence.
Anthropic
Company
An AI safety-focused lab founded by ex-OpenAI researchers (Dario and Daniela Amodei), creator of Claude — known for Constitutional AI and a research-heavy safety culture.
ASI (Artificial Superintelligence)
Misc
Hypothetical AI that exceeds human intelligence across all domains by a wide margin — usually framed as the level beyond AGI.
Attention
Architecture
A mechanism that lets a model decide which other tokens in the input matter most when processing each token.
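As a sketch, the scaled dot-product attention used in transformers fits in a few lines of NumPy; the random matrices below stand in for learned query/key/value projections:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attention(Q, K, V):
    scores = Q @ K.T / np.sqrt(K.shape[-1])   # query-key similarity, scaled
    weights = softmax(scores)                 # each row sums to 1
    return weights @ V, weights               # weighted mix of value vectors

# Three tokens with 4-dim heads; random matrices stand in for learned projections.
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((3, 4)) for _ in range(3))
out, w = attention(Q, K, V)
```

Each row of `w` says how much that token "looks at" every other token when computing its output.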
BLEU
Metric
An automatic metric for machine translation quality, comparing n-gram overlap between the model output and one or more reference translations.
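The n-gram overlap idea can be sketched as follows; this is a simplified single-reference version with clipped precisions and a brevity penalty, not the full multi-reference metric:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=2):
    precisions = []
    for n in range(1, max_n + 1):
        cand, ref = ngrams(candidate, n), ngrams(reference, n)
        clipped = sum(min(count, ref[g]) for g, count in cand.items())
        precisions.append(max(clipped, 1e-9) / max(sum(cand.values()), 1))
    # Brevity penalty: punish candidates shorter than the reference.
    bp = min(1.0, math.exp(1 - len(reference) / len(candidate)))
    return bp * math.exp(sum(map(math.log, precisions)) / max_n)

score = bleu("the cat sat on the mat".split(), "the cat is on the mat".split())
```

Identical sentences score 1.0; the example above scores lower because "sat" and the bigrams around it don't appear in the reference.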
Byte Pair Encoding (BPE)
Technique
A sub-word tokenization algorithm that builds a vocabulary by repeatedly merging the most frequent pair of adjacent tokens in the training data.
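The merge loop can be sketched on a classic toy corpus, with words stored as space-separated symbols; real tokenizers also handle end-of-word markers and whole-symbol matching, which this sketch omits:

```python
from collections import Counter

def best_pair(vocab):
    pairs = Counter()
    for word, freq in vocab.items():
        syms = word.split()
        for a, b in zip(syms, syms[1:]):
            pairs[a, b] += freq
    return max(pairs, key=pairs.get)   # most frequent adjacent pair

def merge(vocab, pair):
    a, b = pair
    # Simplification: plain string replace; real BPE matches whole symbols.
    return {word.replace(f"{a} {b}", a + b): f for word, f in vocab.items()}

# Words as space-separated symbols, with corpus frequencies.
vocab = {"l o w": 5, "l o w e r": 2, "n e w e s t": 6, "w i d e s t": 3}
for _ in range(3):
    vocab = merge(vocab, best_pair(vocab))
# The first merges learn "es" then "est", since "e s" is the most frequent pair.
```

Each merge adds one new token to the vocabulary; production tokenizers run tens of thousands of such merges.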
C-Eval
Metric
A Chinese-language counterpart to MMLU — about 14,000 multiple-choice questions across 52 subjects in Chinese, covering everything from middle school to professional certification level.
Chain-of-thought (CoT)
Technique
A prompting technique that gets the model to write out its reasoning step by step before giving the final answer, dramatically improving performance on math and logic tasks.
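A minimal illustration, using a widely quoted worked example; any chat-completion API would take `cot_prompt` as the user message:

```python
# One worked example with explicit reasoning, then the new question.
cot_prompt = (
    "Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
    "Each can has 3 tennis balls. How many tennis balls does he have now?\n"
    "A: Let's think step by step. Roger started with 5 balls. "
    "2 cans of 3 balls each is 6 balls. 5 + 6 = 11. The answer is 11.\n\n"
    "Q: The cafeteria had 23 apples. If they used 20 to make lunch and "
    "bought 6 more, how many apples do they have?\n"
    "A: Let's think step by step."
)
```

Because the demonstration walks through intermediate arithmetic, the model tends to continue in the same style and reason before answering, instead of guessing a number directly.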
Claude (family)
Model family
Anthropic's flagship LLM family — Claude 1, 2, 3 (Haiku/Sonnet/Opus tiers), Claude 3.5/3.7, Claude 4 — known for long context, strong coding, and rigorous safety training.
CMMLU
Metric
Another Chinese MMLU-style benchmark covering 67 subjects with about 12,000 multiple-choice questions, with stronger coverage of China-specific knowledge than C-Eval.
Code generation
Task
The LLM task of writing or completing source code from natural-language description or existing code context — the core capability behind GitHub Copilot, Cursor, and Claude Code.
Constitutional AI (CAI)
Technique
Anthropic's training method that uses a written set of principles ("a constitution") plus AI feedback to teach a model to be helpful and harmless without massive human-written safety labels.
Context window
Misc
The maximum number of tokens an LLM can read and reason over in a single call — covering the system prompt, conversation history, and any attached documents.
Convolutional Neural Network (CNN)
Architecture
A neural network architecture that uses convolution layers to detect spatial patterns, dominant in image recognition tasks.
Dario Amodei
Person
Co-founder and CEO of Anthropic, former VP of Research at OpenAI — physicist-turned-AI-researcher who has become the most prominent voice for taking AI safety seriously while still building frontier models.
Decoder
Architecture
The part of a neural network that generates output tokens one at a time, used in most modern LLMs like GPT and Claude.
DeepSeek
Company
A Chinese AI lab from Hangzhou that shocked the industry in early 2025 with DeepSeek-V3 and R1 — frontier-class open-weight models trained at a fraction of typical cost.
DeepSeek (family)
Model family
DeepSeek's open-weight LLM family — DeepSeek V2/V3 (efficient MoE), DeepSeek R1 (open-weight reasoning model rivaling o1), DeepSeek-Coder, DeepSeek-VL.
Demis Hassabis
Person
British neuroscientist and CEO of Google DeepMind — co-founded DeepMind in 2010, led AlphaGo and AlphaFold, and won the 2024 Nobel Prize in Chemistry for protein structure prediction.
Diffusion Model
Architecture
A generative model that creates images (or other data) by learning to reverse a step-by-step process of adding random noise.
DPO (Direct Preference Optimization)
Technique
A training method that aligns language models to human preferences directly from preference data, without needing a separate reward model or reinforcement learning.
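For a single preference pair the DPO objective reduces to a one-line loss; the sequence log-probabilities below are made-up numbers for illustration:

```python
import math

def dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    # Implicit reward: beta * (policy log-prob minus frozen-reference log-prob).
    margin = (pi_chosen - ref_chosen) - (pi_rejected - ref_rejected)
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))  # -log sigmoid

# Made-up log-probabilities: the policy favors the chosen answer more
# strongly than the reference model does, so the loss is small.
loss = dpo_loss(pi_chosen=-10.0, pi_rejected=-14.0,
                ref_chosen=-12.0, ref_rejected=-12.0)
```

When the margin is zero the loss is log 2; gradient descent pushes the policy to widen the chosen-over-rejected gap relative to the reference.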
Embedding
Technique
A list of numbers (a vector) that represents the meaning of a piece of text, image, or audio so that similar things sit near each other in vector space.
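A toy illustration with hypothetical 4-dimensional vectors (real embeddings have hundreds or thousands of dimensions); cosine similarity is the usual closeness measure:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Hypothetical embeddings, invented for illustration.
cat    = [0.9, 0.1, 0.30, 0.00]
kitten = [0.8, 0.2, 0.35, 0.05]
car    = [0.1, 0.9, 0.00, 0.40]
# "cat" sits nearer "kitten" than "car" in this space.
```

This nearest-neighbor property is what powers semantic search and retrieval-augmented generation.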
Emergent abilities
Misc
Capabilities that suddenly appear in large models but are absent in smaller ones — like multi-step reasoning, code generation, or following novel instructions.
Encoder
Architecture
A neural network component that converts input data into a dense vector representation capturing its meaning.
Encoder-Decoder
Architecture
A neural network architecture where one module compresses input into a representation and another generates output from it — common in translation and summarization.
Few-shot prompting
Technique
A prompting technique where you give the model a few worked examples in the prompt before asking it to do the same task on a new input.
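A minimal sketch of assembling a few-shot prompt for sentiment classification; the examples and labels are invented for illustration:

```python
# Two worked examples, then the new input the model should label.
examples = [
    ("I loved every minute of it", "positive"),
    ("Terrible service and cold food", "negative"),
]
query = "The staff were friendly and helpful"

prompt = "\n\n".join(f"Review: {text}\nSentiment: {label}" for text, label in examples)
prompt += f"\n\nReview: {query}\nSentiment:"
```

The prompt ends mid-pattern, so the model's natural continuation is the label for the new review.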
Fine-tuning
Technique
Continuing to train a pre-trained model on a smaller, task-specific dataset so it specializes in a particular domain or behavior.
Frontier model
Misc
The most capable AI models at any given time: typically LLMs trained at the frontier of compute by labs such as OpenAI, Anthropic, and Google DeepMind, combining broad capability with high cost.
Gemini (family)
Model family
Google DeepMind's flagship LLM family — Gemini 1.0, 1.5 (1M context), 2.0/2.5 (multimodal, reasoning) — competes directly with GPT and Claude.
Generative Adversarial Network (GAN)
Architecture
A neural network architecture where two models — a generator and a discriminator — compete, training each other to produce realistic synthetic data.
Geoffrey Hinton
Person
British-Canadian computer scientist often called the "Godfather of AI" — co-invented backpropagation, won the 2018 Turing Award and 2024 Nobel Prize in Physics for foundational deep learning work.
GLM (family) / ChatGLM
Model family
Zhipu AI's GLM (General Language Model) family — including the open-source ChatGLM line and the commercial GLM-4 — strong on bilingual Chinese-English work.
Google DeepMind
Company
Google's AI research lab — formed in 2023 by merging DeepMind (London) and Google Brain — creator of AlphaGo, AlphaFold, and the Gemini model family.
GPT (family)
Model family
OpenAI's flagship language model family — from GPT-1 (2018) through GPT-4 and the o-series reasoning models. The line that powers ChatGPT.
Guardrails
Technique
Code or models that sit around an LLM to filter inputs/outputs, block unsafe content, enforce schemas, or stop the model from doing things it shouldn't.
Hallucination
Misc
When an LLM produces confident, fluent text that's factually wrong or invented — citing fake papers, fabricating quotes, or making up API endpoints.
Hugging Face
Company
The default platform for sharing open-source AI models, datasets, and demos — the "GitHub of machine learning". Hosts millions of models including Llama, Qwen, DeepSeek, and Mistral.
HumanEval
Metric
OpenAI's coding benchmark of 164 hand-written Python problems where models are scored by whether their generated code passes hidden unit tests (pass@k).
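The pass@k score is usually computed with the unbiased estimator from the HumanEval paper, which is simple to implement:

```python
from math import comb

def pass_at_k(n, c, k):
    """Unbiased pass@k: n generations per problem, c of them pass the tests."""
    if n - c < k:
        return 1.0  # too few failures to fill a k-sample with no passes
    return 1.0 - comb(n - c, k) / comb(n, k)

# 10 samples for one problem, 3 pass the hidden unit tests:
p = pass_at_k(n=10, c=3, k=1)  # → 0.3
```

Per-problem scores are averaged over the benchmark; pass@1 is the headline number most leaderboards report.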
Ilya Sutskever
Person
Co-founder and former Chief Scientist of OpenAI, key contributor to AlexNet, GPT, and the foundational scaling-laws insight — left OpenAI in 2024 to found Safe Superintelligence Inc.
Image generation
Task
Producing images from text prompts (text-to-image) or other inputs — handled by diffusion models like Stable Diffusion, DALL-E, Midjourney, Flux, and Imagen.
In-context learning (ICL)
Technique
The ability of an LLM to learn a new task from examples shown inside the prompt at inference time, without any weight updates.
Instruction Tuning
Technique
A fine-tuning technique that trains a language model on (instruction, response) pairs so it learns to follow natural-language commands instead of just predicting the next token.
Kimi (family)
Model family
Moonshot AI's Kimi LLM family — Kimi K1, K1.5, K2 — known for long-context Chinese document handling and powering the popular Kimi consumer assistant.
Knowledge distillation
Technique
Training a smaller "student" model to match the outputs of a larger "teacher" model, producing a cheaper model that retains much of the teacher's quality.
KV cache
Technique
A cache of the Key and Value tensors from past tokens that lets transformers avoid recomputing them at each new generation step — the main reason long contexts use so much memory.
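A back-of-envelope sketch of why: cache size grows linearly with context length and layer count. The model configuration below is an assumption for illustration, not a specific model:

```python
def kv_cache_bytes(tokens, layers, kv_heads, head_dim, bytes_per_elt=2):
    # Two cached tensors per layer (K and V), each of shape
    # [tokens, kv_heads, head_dim], stored here in fp16 (2 bytes per element).
    return 2 * layers * tokens * kv_heads * head_dim * bytes_per_elt

# Illustrative 7B-class config with grouped-query attention (assumed numbers):
gb = kv_cache_bytes(tokens=32_000, layers=32, kv_heads=8, head_dim=128) / 2**30
print(f"{gb:.2f} GiB")  # roughly 3.9 GiB for a single 32k-token context
```

This is per sequence, which is why serving many long-context requests at once is dominated by KV-cache memory rather than model weights.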