Query Rewriting for RAG

A lot of RAG debugging starts in the wrong place.

Teams look at chunk size, embedding choice, vector databases, and rerankers while ignoring a simpler issue: the retrieval query itself may be poor. If the user’s phrasing is vague, incomplete, or conversational, the retriever may never see the clearest version of the information need.

That is where query rewriting helps.

What query rewriting is

Query rewriting means transforming the user’s original request into a better retrieval query before search runs.

Example:

User asks:

What changed with our remote policy after the office expansion?

A rewritten retrieval query might become:

remote work policy office expansion policy update relocation hybrid attendance requirements

The rewrite is not for the user. It is for the retriever.

Why it works

Users speak naturally. Retrieval systems perform better on clearer lexical or semantic signals.

Query rewriting can help when the original question contains:

pronouns with missing context
vague references to prior conversation
overly broad wording
conversational filler
missing synonyms or domain terms

Common rewriting strategies

Standalone question rewriting

Turn a context-dependent turn into a self-contained query. This is especially useful in multi-turn chat.

Expansion

Add related terminology, synonyms, product names, or domain vocabulary likely to appear in the documents.

Decomposition

Split one complex question into several smaller retrieval queries. This is helpful when one question actually contains multiple intents.

Constraint extraction

Pull out the exact filters hidden in the user’s language, such as timeframe, geography, customer tier, or document type.

Where teams overdo it

Query rewriting is helpful, but it is not free.

Poor rewrites can:

add noise
dilute the original intent
over-assume domain knowledge
cause the retriever to miss rare exact phrases

This is why many strong systems use both the original query and the rewritten query, then combine results.

A good production pattern

One reliable approach is:

Take the raw user question
Generate a concise standalone rewrite
Optionally generate one expanded keyword-style query
Run retrieval on both original and rewritten versions
Fuse and rerank results

This preserves the user’s wording while giving the search stack a cleaner angle of attack.

How to evaluate query rewriting

Do not judge the rewrite by how elegant it sounds.

Judge it by retrieval quality:

did relevant documents move up?
did recall improve?
did irrelevant matches increase?
did answer quality improve downstream?

The rewrite is infrastructure, not prose.

Bottom line

If your RAG system fails on real user language, query rewriting is one of the highest-leverage fixes to test.

Start with standalone-question rewriting and dual retrieval against both original and rewritten forms. Many retrieval problems are not embedding problems at all. They are translation problems between how people ask and how documents are written.