With LLMs being used everywhere (organically or forcefully -- just for the sake of "using an LLM"), how do you decide between a traditional retrieval strategy and an LLM?
TL;DR
Retrieval (e.g., ElasticSearch) → Great for fast, deterministic, keyword/semantic matches.
LLMs → Great for interpreting messy natural language, reasoning, and generating synthesized answers.
Often, the best solution is a hybrid (retrieval + LLM).
DETAILS
Let's quickly look at how the retrieval-based approach differs from LLM-based answer generation.
How retrieval works
In simple terms, the user's query is parsed and matched against the indexed documents, and the top-k results are returned, ranked by relevance.
Here's how the flow looks:
User Query --> ElasticSearch Index --> Top-k Documents --> Result Returned to User
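The flow above can be sketched as a toy in-memory retriever. This is not ElasticSearch (which uses inverted indexes and BM25 scoring); it only mimics the shape of the flow: score documents by keyword overlap with the query and return the top-k.

```python
def top_k(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Return up to k documents ranked by naive keyword overlap."""
    terms = set(query.lower().split())
    # Score = number of query terms that appear in the document.
    scored = [(sum(t in doc.lower() for t in terms), doc) for doc in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop documents with no matching terms at all.
    return [doc for score, doc in scored[:k] if score > 0]

docs = [
    "ElasticSearch indexes documents for fast keyword search.",
    "LLMs generate answers from natural language prompts.",
    "Hybrid RAG systems combine retrieval with generation.",
]
print(top_k("keyword search documents", docs))
# → ['ElasticSearch indexes documents for fast keyword search.']
```

Note that the result is a list of documents, not an answer -- which is exactly the weakness discussed below.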
Strengths of this approach
Extremely fast and easily scalable.
Deterministic (you know what it matches).
Cost-effective.
Weaknesses of this approach
Struggles with unstructured, ambiguous queries.
Returns documents, not direct answers.
How LLM-based answering works
The user's query is passed to the LLM (either fine-tuned, or given the available docs/APIs as context). The LLM then interprets the query, extracts the intent, and synthesizes a response.
Here's how the flow looks:
User Query --> LLM (with knowledge/context) --> Parsed Intent + Answer
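The flow above boils down to prompt assembly plus a model call. In this sketch, `call_llm` is a hypothetical stand-in for a real model API (OpenAI, Anthropic, a local model, etc.) and is stubbed so the snippet runs without a service:

```python
def build_prompt(query: str, context: str) -> str:
    """Combine the user query with grounding context into one prompt."""
    return (
        "Answer the question using only the context below.\n"
        f"Context: {context}\n"
        f"Question: {query}\n"
        "Answer:"
    )

def call_llm(prompt: str) -> str:
    # Stub: a real implementation would call a model endpoint here.
    return "(model-generated answer based on the prompt)"

prompt = build_prompt(
    "What does the refund policy cover?",
    "Refunds are issued within 30 days of purchase.",
)
print(call_llm(prompt))
```

Passing real context in the prompt (rather than relying on the model's memory) is what reduces the hallucination risk noted below.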
Strengths of this approach
Handles vague/unstructured natural language.
Can rephrase or summarize answers.
Useful for conversational or multi-turn queries.
Weaknesses of this approach
Higher cost (compute-heavy).
Can hallucinate if not grounded in real data.
Slower than retrieval.
Final Thought
Use traditional retrieval when speed, scalability, and exact matches matter.
Use LLMs when natural language interpretation, reasoning, or synthesized answers are needed.
Think of retrieval as your library catalog -- efficient but literal. Think of an LLM as a librarian -- they understand your intent, can summarize, and even reason. Choose based on trade-offs: speed and cost vs flexibility and intelligence.
In real-world systems, the winning formula is usually Retrieval + LLM (RAG): retrieval for grounding, LLM for reasoning.
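That winning formula can be sketched end to end: retrieve grounding documents first, then hand them to the LLM as context. Here `retrieve` is a toy keyword matcher and `call_llm` a stub; a real system would use ElasticSearch (or a vector store) and an actual model API.

```python
def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Toy retrieval step: rank docs by naive keyword overlap."""
    terms = set(query.lower().split())
    scored = sorted(docs, key=lambda d: -sum(t in d.lower() for t in terms))
    return scored[:k]

def call_llm(prompt: str) -> str:
    # Stand-in for a real model call.
    return "(answer synthesized from prompt)"

def rag_answer(query: str, docs: list[str]) -> str:
    context = "\n".join(retrieve(query, docs))            # retrieval for grounding
    prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    return call_llm(prompt)                               # LLM for reasoning

docs = ["Returns accepted within 30 days.", "Shipping takes 5 days."]
print(rag_answer("How long do I have to return an item?", docs))
```

The two steps map directly onto the trade-offs above: retrieval keeps it fast, cheap, and grounded; the LLM turns documents into a direct answer.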