Chinese AI dictionary
Plain-language explanations of transformer, RAG, agent, fine-tuning, context window, prompt, and other AI technical terms, covering architectures, techniques, metrics, companies, people, model families, and tasks.
01.AI (零一萬物)
Company
An AI startup founded by Kai-Fu Lee in 2023, builder of the Yi (易) open-source bilingual model family — based in Beijing with a strong technical team and high profile.
AGI (Artificial General Intelligence)
Misc
A hypothetical AI system that matches or exceeds human capability across the full range of cognitive tasks — not just narrow domains. There's no agreed definition or test.
AI alignment
Technique
The research field and engineering work focused on making AI systems pursue the goals and values their human users actually want — not their literal instructions or proxy metrics.
Alibaba (Qwen)
Company
Alibaba's AI division and the Qwen (通義千問) open-source model family — currently the most influential Chinese open-source LLM platform, with rapid release cadence.
Anthropic
Company
An AI safety-focused lab founded by ex-OpenAI researchers (Dario and Daniela Amodei), creator of Claude — known for Constitutional AI and a research-heavy safety culture.
ASI (Artificial Superintelligence)
Misc
Hypothetical AI that exceeds human intelligence across all domains by a wide margin — usually framed as the level beyond AGI.
Attention
Architecture
A mechanism that lets a model decide which other tokens in the input matter most when processing each token.
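As a sketch, the scaled dot-product attention used in transformers fits in a few lines of NumPy; the random matrices below stand in for learned query/key/value projections:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attention(Q, K, V):
    scores = Q @ K.T / np.sqrt(K.shape[-1])   # query-key similarity, scaled
    weights = softmax(scores)                 # each row sums to 1
    return weights @ V, weights               # weighted mix of value vectors

# Three tokens with 4-dim heads; random matrices stand in for learned projections.
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((3, 4)) for _ in range(3))
out, w = attention(Q, K, V)
```

Each row of `w` says how much that token "looks at" every other token when computing its output.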
BLEU
Metric
An automatic metric for machine translation quality, comparing n-gram overlap between the model output and one or more reference translations.
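The n-gram overlap idea can be sketched as follows; this is a simplified single-reference version with clipped precisions and a brevity penalty, not the full multi-reference metric:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=2):
    precisions = []
    for n in range(1, max_n + 1):
        cand, ref = ngrams(candidate, n), ngrams(reference, n)
        clipped = sum(min(count, ref[g]) for g, count in cand.items())
        precisions.append(max(clipped, 1e-9) / max(sum(cand.values()), 1))
    # Brevity penalty: punish candidates shorter than the reference.
    bp = min(1.0, math.exp(1 - len(reference) / len(candidate)))
    return bp * math.exp(sum(map(math.log, precisions)) / max_n)

score = bleu("the cat sat on the mat".split(), "the cat is on the mat".split())
```

Identical sentences score 1.0; the example above scores lower because "sat" and the bigrams around it don't appear in the reference.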
Byte Pair Encoding (BPE)
Technique
A sub-word tokenization algorithm that builds a vocabulary by repeatedly merging the most frequent pair of adjacent tokens in the training data.
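The merge loop can be sketched on a classic toy corpus, with words stored as space-separated symbols; real tokenizers also handle end-of-word markers and whole-symbol matching, which this sketch omits:

```python
from collections import Counter

def best_pair(vocab):
    pairs = Counter()
    for word, freq in vocab.items():
        syms = word.split()
        for a, b in zip(syms, syms[1:]):
            pairs[a, b] += freq
    return max(pairs, key=pairs.get)   # most frequent adjacent pair

def merge(vocab, pair):
    a, b = pair
    # Simplification: plain string replace; real BPE matches whole symbols.
    return {word.replace(f"{a} {b}", a + b): f for word, f in vocab.items()}

# Words as space-separated symbols, with corpus frequencies.
vocab = {"l o w": 5, "l o w e r": 2, "n e w e s t": 6, "w i d e s t": 3}
for _ in range(3):
    vocab = merge(vocab, best_pair(vocab))
# The first merges learn "es" then "est", since "e s" is the most frequent pair.
```

Each merge adds one new token to the vocabulary; production tokenizers run tens of thousands of such merges.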
C-Eval
Metric
A Chinese-language counterpart to MMLU — about 14,000 multiple-choice questions across 52 subjects in Chinese, covering everything from middle school to professional certification level.
Chain-of-thought (CoT)
Technique
A prompting technique that gets the model to write out its reasoning step by step before giving the final answer, dramatically improving performance on math and logic tasks.
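A minimal illustration, using a widely quoted worked example; any chat-completion API would take `cot_prompt` as the user message:

```python
# One worked example with explicit reasoning, then the new question.
cot_prompt = (
    "Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
    "Each can has 3 tennis balls. How many tennis balls does he have now?\n"
    "A: Let's think step by step. Roger started with 5 balls. "
    "2 cans of 3 balls each is 6 balls. 5 + 6 = 11. The answer is 11.\n\n"
    "Q: The cafeteria had 23 apples. If they used 20 to make lunch and "
    "bought 6 more, how many apples do they have?\n"
    "A: Let's think step by step."
)
```

Because the demonstration walks through intermediate arithmetic, the model tends to continue in the same style and reason before answering, instead of guessing a number directly.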
Claude (family)
Model family
Anthropic's flagship LLM family — Claude 1, 2, 3 (Haiku/Sonnet/Opus tiers), Claude 3.5/3.7, Claude 4 — known for long context, strong coding, and rigorous safety training.
CMMLU
Metric
Another Chinese MMLU-style benchmark covering 67 subjects with about 12,000 multiple-choice questions, with stronger coverage of China-specific knowledge than C-Eval.
Code generation
Task
The LLM task of writing or completing source code from natural-language description or existing code context — the core capability behind GitHub Copilot, Cursor, and Claude Code.
Constitutional AI (CAI)
Technique
Anthropic's training method that uses a written set of principles ("a constitution") plus AI feedback to teach a model to be helpful and harmless without massive human-written safety labels.
Context window
Misc
The maximum number of tokens an LLM can read and reason over in a single call — covering the system prompt, conversation history, and any attached documents.
Convolutional Neural Network (CNN)
Architecture
A neural network architecture that uses convolution layers to detect spatial patterns, dominant in image recognition tasks.
Dario Amodei
Person
Co-founder and CEO of Anthropic, former VP of Research at OpenAI — physicist-turned-AI-researcher who has become the most prominent voice for taking AI safety seriously while still building frontier models.
Decoder
Architecture
The part of a neural network that generates output tokens one at a time, used in most modern LLMs like GPT and Claude.
DeepSeek
Company
A Chinese AI lab from Hangzhou that shocked the industry in early 2025 with DeepSeek-V3 and R1 — frontier-class open-weight models trained at a fraction of typical cost.
DeepSeek (family)
Model family
DeepSeek's open-weight LLM family — DeepSeek V2/V3 (efficient MoE), DeepSeek R1 (open-weight reasoning model rivaling o1), DeepSeek-Coder, DeepSeek-VL.
Demis Hassabis
Person
British neuroscientist and CEO of Google DeepMind — co-founded DeepMind in 2010, led AlphaGo and AlphaFold, and won the 2024 Nobel Prize in Chemistry for protein structure prediction.
Diffusion Model
Architecture
A generative model that creates images (or other data) by learning to reverse a step-by-step process of adding random noise.
DPO (Direct Preference Optimization)
Technique
A training method that aligns language models to human preferences directly from preference data, without needing a separate reward model or reinforcement learning.
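For a single preference pair the DPO objective reduces to a one-line loss; the sequence log-probabilities below are made-up numbers for illustration:

```python
import math

def dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    # Implicit reward: beta * (policy log-prob minus frozen-reference log-prob).
    margin = (pi_chosen - ref_chosen) - (pi_rejected - ref_rejected)
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))  # -log sigmoid

# Made-up log-probabilities: the policy favors the chosen answer more
# strongly than the reference model does, so the loss is small.
loss = dpo_loss(pi_chosen=-10.0, pi_rejected=-14.0,
                ref_chosen=-12.0, ref_rejected=-12.0)
```

When the margin is zero the loss is log 2; gradient descent pushes the policy to widen the chosen-over-rejected gap relative to the reference.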
Embedding
Technique
A list of numbers (a vector) that represents the meaning of a piece of text, image, or audio so that similar things sit near each other in vector space.
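A toy illustration with hypothetical 4-dimensional vectors (real embeddings have hundreds or thousands of dimensions); cosine similarity is the usual closeness measure:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Hypothetical embeddings, invented for illustration.
cat    = [0.9, 0.1, 0.30, 0.00]
kitten = [0.8, 0.2, 0.35, 0.05]
car    = [0.1, 0.9, 0.00, 0.40]
# "cat" sits nearer "kitten" than "car" in this space.
```

This nearest-neighbor property is what powers semantic search and retrieval-augmented generation.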
Emergent abilities
Misc
Capabilities that suddenly appear in large models but are absent in smaller ones — like multi-step reasoning, code generation, or following novel instructions.
Encoder
Architecture
A neural network component that converts input data into a dense vector representation capturing its meaning.
Encoder-Decoder
Architecture
A neural network architecture where one module compresses input into a representation and another generates output from it — common in translation and summarization.
Few-shot prompting
Technique
A prompting technique where you give the model a few worked examples in the prompt before asking it to do the same task on a new input.
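A minimal sketch of assembling a few-shot prompt for sentiment classification; the examples and labels are invented for illustration:

```python
# Two worked examples, then the new input the model should label.
examples = [
    ("I loved every minute of it", "positive"),
    ("Terrible service and cold food", "negative"),
]
query = "The staff were friendly and helpful"

prompt = "\n\n".join(f"Review: {text}\nSentiment: {label}" for text, label in examples)
prompt += f"\n\nReview: {query}\nSentiment:"
```

The prompt ends mid-pattern, so the model's natural continuation is the label for the new review.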
Fine-tuning
Technique
Continuing to train a pre-trained model on a smaller, task-specific dataset so it specializes in a particular domain or behavior.
Frontier model
Misc
The most capable AI models at any given time: typically LLMs trained at the frontier of compute by labs such as OpenAI, Anthropic, and Google DeepMind, combining broad capability with high cost.
Gemini (family)
Model family
Google DeepMind's flagship LLM family — Gemini 1.0, 1.5 (1M context), 2.0/2.5 (multimodal, reasoning) — competes directly with GPT and Claude.
Generative Adversarial Network (GAN)
Architecture
A neural network architecture where two models — a generator and a discriminator — compete, training each other to produce realistic synthetic data.
Geoffrey Hinton
Person
British-Canadian computer scientist often called the "Godfather of AI" — co-invented backpropagation, won the 2018 Turing Award and 2024 Nobel Prize in Physics for foundational deep learning work.
GLM (family) / ChatGLM
Model family
Zhipu AI's GLM (General Language Model) family — including the open-source ChatGLM line and the commercial GLM-4 — strong on bilingual Chinese-English work.
Google DeepMind
Company
Google's AI research lab — formed in 2023 by merging DeepMind (London) and Google Brain — creator of AlphaGo, AlphaFold, and the Gemini model family.
GPT (family)
Model family
OpenAI's flagship language model family — from GPT-1 (2018) through GPT-4 and the o-series reasoning models. The line that powers ChatGPT.
Guardrails
Technique
Code or models that sit around an LLM to filter inputs/outputs, block unsafe content, enforce schemas, or stop the model from doing things it shouldn't.
Hallucination
Misc
When an LLM produces confident, fluent text that's factually wrong or invented — citing fake papers, fabricating quotes, or making up API endpoints.
Hugging Face
Company
The default platform for sharing open-source AI models, datasets, and demos — the "GitHub of machine learning". Hosts millions of models including Llama, Qwen, DeepSeek, and Mistral.
HumanEval
Metric
OpenAI's coding benchmark of 164 hand-written Python problems where models are scored by whether their generated code passes hidden unit tests (pass@k).
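The pass@k score is usually computed with the unbiased estimator from the HumanEval paper, which is simple to implement:

```python
from math import comb

def pass_at_k(n, c, k):
    """Unbiased pass@k: n generations per problem, c of them pass the tests."""
    if n - c < k:
        return 1.0  # too few failures to fill a k-sample with no passes
    return 1.0 - comb(n - c, k) / comb(n, k)

# 10 samples for one problem, 3 pass the hidden unit tests:
p = pass_at_k(n=10, c=3, k=1)  # → 0.3
```

Per-problem scores are averaged over the benchmark; pass@1 is the headline number most leaderboards report.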
Ilya Sutskever
Person
Co-founder and former Chief Scientist of OpenAI, key contributor to AlexNet, GPT, and the foundational scaling-laws insight — left OpenAI in 2024 to found Safe Superintelligence Inc.
Image generation
Task
Producing images from text prompts (text-to-image) or other inputs — handled by diffusion models like Stable Diffusion, DALL-E, Midjourney, Flux, and Imagen.
In-context learning (ICL)
Technique
The ability of an LLM to learn a new task from examples shown inside the prompt at inference time, without any weight updates.
Instruction Tuning
Technique
A fine-tuning technique that trains a language model on (instruction, response) pairs so it learns to follow natural-language commands instead of just predicting the next token.
Kimi (family)
Model family
Moonshot AI's Kimi LLM family — Kimi K1, K1.5, K2 — known for long-context Chinese document handling and powering the popular Kimi consumer assistant.
Knowledge distillation
Technique
Training a smaller "student" model to match the outputs of a larger "teacher" model, producing a cheaper model that retains much of the teacher's quality.
KV cache
Technique
A cache of the Key and Value tensors from past tokens that lets transformers avoid recomputing them at each new generation step — the main reason long contexts use so much memory.
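A back-of-envelope sketch of why: cache size grows linearly with context length and layer count. The model configuration below is an assumption for illustration, not a specific model:

```python
def kv_cache_bytes(tokens, layers, kv_heads, head_dim, bytes_per_elt=2):
    # Two cached tensors per layer (K and V), each of shape
    # [tokens, kv_heads, head_dim], stored here in fp16 (2 bytes per element).
    return 2 * layers * tokens * kv_heads * head_dim * bytes_per_elt

# Illustrative 7B-class config with grouped-query attention (assumed numbers):
gb = kv_cache_bytes(tokens=32_000, layers=32, kv_heads=8, head_dim=128) / 2**30
print(f"{gb:.2f} GiB")  # roughly 3.9 GiB for a single 32k-token context
```

This is per sequence, which is why serving many long-context requests at once is dominated by KV-cache memory rather than model weights.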