🟣 Technical 8 min read

Query Rewriting for RAG

Bad retrieval often starts with a weak query. Here's how query rewriting improves RAG systems, which strategies work, and how to avoid turning a simple question into a worse one.

View all rag depths β†’

A lot of RAG debugging starts in the wrong place.

Teams look at chunk size, embedding choice, vector databases, and rerankers while ignoring a simpler issue: the retrieval query itself may be poor. If the user’s phrasing is vague, incomplete, or conversational, the retriever may never see the clearest version of the information need.

That is where query rewriting helps.

What query rewriting is

Query rewriting means transforming the user’s original request into a better retrieval query before search runs.

Example:

User asks:

What changed with our remote policy after the office expansion?

A rewritten retrieval query might become:

remote work policy office expansion policy update relocation hybrid attendance requirements

The rewrite is not for the user. It is for the retriever.

Why it works

Users speak naturally. Retrieval systems perform better on clearer lexical or semantic signals.

Query rewriting can help when the original question contains:

  • pronouns with missing context
  • vague references to prior conversation
  • overly broad wording
  • conversational filler
  • missing synonyms or domain terms

Common rewriting strategies

Standalone question rewriting

Turn a context-dependent turn into a self-contained query. This is especially useful in multi-turn chat.

Expansion

Add related terminology, synonyms, product names, or domain vocabulary likely to appear in the documents.

Decomposition

Split one complex question into several smaller retrieval queries. This is helpful when one question actually contains multiple intents.

Constraint extraction

Pull out the exact filters hidden in the user’s language, such as timeframe, geography, customer tier, or document type.

Where teams overdo it

Query rewriting is helpful, but it is not free.

Poor rewrites can:

  • add noise
  • dilute the original intent
  • over-assume domain knowledge
  • cause the retriever to miss rare exact phrases

This is why many strong systems use both the original query and the rewritten query, then combine results.

A good production pattern

One reliable approach is:

  1. Take the raw user question
  2. Generate a concise standalone rewrite
  3. Optionally generate one expanded keyword-style query
  4. Run retrieval on both original and rewritten versions
  5. Fuse and rerank results

This preserves the user’s wording while giving the search stack a cleaner angle of attack.

How to evaluate query rewriting

Do not judge the rewrite by how elegant it sounds.

Judge it by retrieval quality:

  • did relevant documents move up?
  • did recall improve?
  • did irrelevant matches increase?
  • did answer quality improve downstream?

The rewrite is infrastructure, not prose.

Bottom line

If your RAG system fails on real user language, query rewriting is one of the highest-leverage fixes to test.

Start with standalone-question rewriting and dual retrieval against both original and rewritten forms. Many retrieval problems are not embedding problems at all. They are translation problems between how people ask and how documents are written.

Simplify

← Parent Document Retrieval: Solving RAG's Context Window Problem

Go deeper

Query Understanding for RAG: What Happens Before Retrieval β†’

Related reads

ragquery-rewritingretrievalsearchllm-systems

Stay ahead of the AI curve

Weekly insights on AI β€” explained at the level that's right for you. No hype, no jargon, just what matters.

No spam. Unsubscribe anytime. We respect your inbox.