Encoder-Decoder

A neural network architecture where one module compresses input into a representation and another generates output from it — common in translation and summarization.

An encoder-decoder is a two-part neural network architecture. The **encoder** reads the input (a sentence, image, audio clip) and compresses it into an internal representation — usually a sequence of vectors capturing meaning. The **decoder** then takes that representation and generates the output step by step, often in a different form or language.

This split shows up wherever sequence-to-sequence problems appear. Machine translation was the original killer app: the encoder reads an English sentence, the decoder writes the French one. The same pattern powers summarization (long article in, short summary out), speech recognition (audio in, text out), and image captioning (pixels in, sentence out). The original 2017 Transformer paper "Attention Is All You Need" used an encoder-decoder design, and models like T5 and BART still follow it.

A useful analogy: think of a translator who first reads and fully understands a paragraph (encoding), then sets the original aside and rewrites it in another language (decoding). The "understanding" is the intermediate representation passed between the two halves.

Not every modern model uses both halves. **Encoder-only** models like BERT are great for classification and embeddings — they understand but don't generate. **Decoder-only** models like GPT and Claude generate text autoregressively and have become the dominant design for general-purpose LLMs, since they can handle most tasks by framing them as text continuation. Encoder-decoder still wins when input and output are clearly distinct, like translation.

Related concepts: transformer, attention mechanism, sequence-to-sequence, BERT, T5, autoregressive generation.
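
To make the split concrete, here is a minimal sketch of the two halves, assuming PyTorch's `nn.Transformer` (not part of the original entry). The vocabulary sizes, dimensions, and toy token ids below are invented for illustration, and positional encodings are omitted for brevity.

```python
import torch
import torch.nn as nn

# Toy sizes, purely for illustration (positional encodings omitted for brevity).
SRC_VOCAB, TGT_VOCAB, D_MODEL = 1000, 1000, 64

src_embed = nn.Embedding(SRC_VOCAB, D_MODEL)   # source token ids -> vectors
tgt_embed = nn.Embedding(TGT_VOCAB, D_MODEL)   # target token ids -> vectors
transformer = nn.Transformer(
    d_model=D_MODEL, nhead=4,
    num_encoder_layers=2, num_decoder_layers=2,
    batch_first=True,
)
to_logits = nn.Linear(D_MODEL, TGT_VOCAB)      # decoder states -> next-token scores

src = torch.randint(0, SRC_VOCAB, (1, 7))      # e.g. an English sentence, 7 tokens
tgt = torch.randint(0, TGT_VOCAB, (1, 5))      # the French output generated so far, 5 tokens

# Encoder: compress the whole input into a sequence of context vectors ("memory").
memory = transformer.encoder(src_embed(src))

# Decoder: generate conditioned on that memory; the causal mask keeps each position
# from seeing later target tokens, so generation can proceed step by step.
causal_mask = transformer.generate_square_subsequent_mask(tgt.size(1))
decoded = transformer.decoder(tgt_embed(tgt), memory, tgt_mask=causal_mask)
logits = to_logits(decoded)                    # shape (1, 5, TGT_VOCAB)
```

The `memory` tensor is the intermediate representation described above: the encoder produces it once, and the decoder consults it at every generation step.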

Last updated: 2026-04-29
