Named entity recognition (NER) is the task of finding and classifying named entities in text. Given the sentence "Apple released the iPhone 15 in Cupertino on September 12, 2023", an NER system should mark Apple → ORG, iPhone 15 → PRODUCT, Cupertino → LOC, September 12, 2023 → DATE.
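To make the output format concrete, here is a minimal sketch of how the annotations for that sentence could be stored as labeled character spans. The `Entity` class and the `locate` helper are hypothetical names chosen for illustration, not part of any NER library, and the labels are hand-written rather than predicted by a model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    text: str   # surface form as it appears in the sentence
    label: str  # entity type, e.g. ORG or DATE
    start: int  # character offset, inclusive
    end: int    # character offset, exclusive


def locate(sentence, annotations):
    """Turn (surface text, label) pairs into character-offset spans.

    Pairs must be listed in the order they occur in the sentence.
    """
    entities = []
    cursor = 0
    for surface, label in annotations:
        start = sentence.find(surface, cursor)
        if start == -1:
            raise ValueError(f"{surface!r} not found after offset {cursor}")
        end = start + len(surface)
        entities.append(Entity(surface, label, start, end))
        cursor = end
    return entities


sentence = "Apple released the iPhone 15 in Cupertino on September 12, 2023"
# Hand-written gold annotations taken from the example above.
annotations = [
    ("Apple", "ORG"),
    ("iPhone 15", "PRODUCT"),
    ("Cupertino", "LOC"),
    ("September 12, 2023", "DATE"),
]

entities = locate(sentence, annotations)
for entity in entities:
    print(f"{entity.start:>2}-{entity.end:<2} {entity.label:<8} {entity.text}")
```

Storing offsets rather than only the matched strings keeps each entity tied to its position, which is what downstream steps such as redaction or highlighting usually need.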
NER matters because it is a foundational building block for many information-extraction pipelines: turning news articles into knowledge graphs, populating databases from unstructured documents, redacting sensitive information (PII detection), analyzing legal documents, and indexing content for search engines. Before LLMs, dedicated NER models (spaCy, Stanford NER, Flair, and Chinese-language tools like LTP and HanLP) were a separate component of most NLP stacks.
A concrete example: feed an LLM a 10-page contract with a prompt like "extract every party, date, monetary amount, and obligation, return as JSON". Modern LLMs typically handle this in one call with high accuracy. Before 2022, the same task would have needed a fine-tuned NER model plus rule-based post-processing.
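The contract example can be sketched in two steps: build the prompt, then validate the JSON that comes back. This is a rough sketch under stated assumptions. The key names in `FIELDS`, the helper names `build_prompt` and `parse_extraction`, and the canned reply are all invented for illustration, and the actual LLM call is left out because it depends on the provider.

```python
import json

# Hypothetical key names for the four things the prompt asks for.
FIELDS = ("parties", "dates", "monetary_amounts", "obligations")


def build_prompt(contract_text):
    """Compose an extraction prompt that pins down the expected JSON shape."""
    return (
        "Extract every party, date, monetary amount, and obligation "
        "from the contract below. Return only a JSON object with the keys "
        + ", ".join(FIELDS)
        + ", each mapping to a list of strings.\n\n"
        + contract_text
    )


def parse_extraction(raw_reply):
    """Parse the model's reply and check that it has the expected shape."""
    data = json.loads(raw_reply)  # raises json.JSONDecodeError on invalid JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    result = {}
    for field in FIELDS:
        values = data.get(field, [])  # a missing key is treated as "none found"
        if not isinstance(values, list) or not all(isinstance(v, str) for v in values):
            raise ValueError(f"{field!r} must be a list of strings")
        result[field] = values
    return result


prompt = build_prompt("(contract text goes here)")

# Invented reply standing in for real model output; no API is called here.
canned_reply = (
    '{"parties": ["Acme Corp", "Globex LLC"], '
    '"dates": ["2024-01-15"], '
    '"monetary_amounts": ["$50,000"], '
    '"obligations": ["Globex LLC pays Acme Corp within 30 days"]}'
)

extraction = parse_extraction(canned_reply)
print(extraction["parties"])
```

Validating the reply before using it matters because a model can return malformed JSON or omit a key, and a strict parser turns those cases into explicit errors instead of silent gaps.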
For extremely high-volume production NER (millions of documents per day, low latency), specialized models still beat LLM API calls on cost. But for one-off extraction, exploratory analysis, or moderate-volume work, LLMs with a clear prompt and JSON-structured output are usually the simplest path. Related: information extraction, RAG, prompt engineering, structured output.