Summarization is the task of producing a shorter, faithful version of a longer input. The two main flavors are extractive (pull verbatim sentences from the source) and abstractive (write new sentences that capture the meaning). Modern LLMs do abstractive summarization by default and can adjust length, tone, focus, or audience on demand.
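The extractive flavor can be made concrete with a toy sketch (purely illustrative, not how production systems work): score each sentence by how frequent its words are in the whole document, then return the top-k sentences verbatim, in their original order.

```python
import re
from collections import Counter

def extractive_summary(text: str, k: int = 2) -> list[str]:
    """Toy extractive summarizer: score each sentence by the average
    document-wide frequency of its words, return the top-k sentences
    verbatim, in original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(re.findall(r"\w+", text.lower()))

    def score(sent: str) -> float:
        toks = re.findall(r"\w+", sent.lower())
        return sum(freq[t] for t in toks) / max(len(toks), 1)

    top = set(sorted(sentences, key=score, reverse=True)[:k])
    return [s for s in sentences if s in top]
```

An abstractive system, by contrast, generates new sentences, which is why it can compress and rephrase but also why it can introduce errors the source never contained.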
It matters because information overload is one of the most widely felt problems modern AI addresses. Long meeting transcripts, dense legal documents, twenty-page research papers, hour-long YouTube videos, full email threads: anything you'd otherwise have to read in full can be summarized to save time. Summarization underlies many practical AI products: meeting note-takers, document Q&A, news digests, podcast summaries, and research assistants.
A concrete example: paste a 30-page market research PDF into Claude and ask "summarize the key findings in 5 bullets, focused on impact for Taiwan". The model produces something usable in seconds instead of a half-hour read. The same prompt with "focused on US regulatory environment" produces a different summary from the same input.
Quality depends on the context window (can the source actually fit?) and on the model's faithfulness: a bad summary can hallucinate facts not in the source. For high-stakes use, double-check critical claims against the original. Related: ROUGE, RAG, long-context, hallucination.
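One cheap automatic red flag for unfaithful content is unigram precision against the source, the precision direction of ROUGE-1: what fraction of the summary's words actually appear in the source. A low score suggests ungrounded content; a high score does not prove faithfulness (a summary can reuse the source's words while reversing their meaning). A minimal sketch:

```python
from collections import Counter

def unigram_precision(summary: str, source: str) -> float:
    """Fraction of summary unigrams that also appear in the source,
    counted with multiplicity. A crude grounding check: low values
    flag possible hallucination; high values are not a guarantee."""
    summ = Counter(summary.lower().split())
    src = Counter(source.lower().split())
    overlap = sum(min(count, src[word]) for word, count in summ.items())
    total = sum(summ.values())
    return overlap / total if total else 0.0
```

Proper ROUGE compares a candidate summary against human-written reference summaries, not the source; this precision-against-source variant is just a quick sanity check, not an evaluation metric.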