Named entity recognition (NER)

Identifying and classifying named entities — people, organizations, locations, dates, products — in unstructured text.

Named entity recognition (NER) is the task of finding spans of text that name real-world entities and assigning each one a type. Given the sentence "Apple released the iPhone 15 in Cupertino on September 12, 2023", a NER system should mark Apple → ORG, iPhone 15 → PRODUCT, Cupertino → LOC, and September 12, 2023 → DATE.

NER matters because it is a building block for many information-extraction pipelines: turning news articles into knowledge graphs, populating databases from unstructured documents, redacting sensitive information (PII detection), legal-document analysis, and search engine indexing.

Before LLMs, dedicated NER models and toolkits (spaCy, Stanford NER, Flair, and Chinese-language tools such as LTP and HanLP) were a standard, separate component of most NLP stacks. LLMs can now do much of this work through prompting alone. A concrete example: feed an LLM a 10-page contract with a prompt like "extract every party, date, monetary amount, and obligation, return as JSON". Modern LLMs can usually handle this in a single call with high accuracy, whereas before 2022 the same task would have needed a fine-tuned NER model plus rule-based post-processing.

For extremely high-volume production NER (millions of documents per day, low latency), specialized models still beat LLM API calls on cost. But for one-off extraction, exploratory analysis, or moderate-volume work, an LLM with a clear prompt and JSON-structured output is usually the simplest path.

Related: information extraction, RAG, prompt engineering, structured output.
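The prompt-plus-JSON workflow described in this entry can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: `build_prompt`, `parse_entities`, the `Entity` record, and the label set are hypothetical names chosen for the example, and the model reply is a hand-written stand-in, since no LLM is called. The parsing step keeps only entities whose label is allowed and whose text actually occurs in the source, which is one simple guard against a model inventing entities.

```python
import json
from dataclasses import dataclass

# Illustrative label set taken from the example sentence; real schemas vary.
ALLOWED_LABELS = {"ORG", "PRODUCT", "LOC", "DATE"}


@dataclass(frozen=True)
class Entity:
    text: str
    label: str
    start: int  # character offset into the source text
    end: int    # exclusive end offset


def build_prompt(text: str) -> str:
    """Build an extraction prompt that asks for JSON-only output."""
    labels = ", ".join(sorted(ALLOWED_LABELS))
    return (
        "Extract every named entity from the text below. "
        f"Allowed labels: {labels}. "
        'Return only a JSON list of objects with keys "text" and "label".\n\n'
        f"Text: {text}"
    )


def parse_entities(raw: str, source: str) -> list[Entity]:
    """Parse a JSON reply and keep only entities grounded in the source."""
    entities = []
    for item in json.loads(raw):
        surface, label = item["text"], item["label"]
        start = source.find(surface)  # first occurrence only
        # Skip unknown labels and surface forms that are not in the source.
        if label not in ALLOWED_LABELS or start == -1:
            continue
        entities.append(Entity(surface, label, start, start + len(surface)))
    return entities


sentence = "Apple released the iPhone 15 in Cupertino on September 12, 2023"
prompt = build_prompt(sentence)  # would be sent to a model in a real pipeline

# Hand-written stand-in for a model reply; no model is called here.
mock_reply = json.dumps([
    {"text": "Apple", "label": "ORG"},
    {"text": "iPhone 15", "label": "PRODUCT"},
    {"text": "Cupertino", "label": "LOC"},
    {"text": "September 12, 2023", "label": "DATE"},
    {"text": "Tim Cook", "label": "PERSON"},  # not in the sentence: filtered out
])

entities = parse_entities(mock_reply, sentence)
for e in entities:
    print(f"{e.text} -> {e.label} [{e.start}:{e.end}]")
```

Because `str.find` returns only the first match, this sketch would mislocate an entity that appears several times in the text; a production pipeline would more likely ask the model for offsets or align every occurrence.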