Does question type determine the right retrieval strategy?
Explores whether different non-factoid question types require distinct retrieval and decomposition approaches. Matters because standard RAG fails when applied uniformly to debate, comparison, and experience questions despite being effective for factoid queries.
Standard RAG treats all queries as factoid: retrieve relevant documents, extract the answer. This is appropriate when there is a definitive answer. It is inappropriate for non-factoid questions (NFQs) that lack definitive answers and require synthesizing multiple perspectives, balancing competing viewpoints, or integrating personal experience.
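The factoid default described above can be sketched as a single retrieve-then-read pass. This is a minimal toy sketch, assuming hypothetical `retrieve` and `answer_factoid` functions; a real system would use a vector store and an LLM in their place.

```python
def retrieve(query: str, corpus: dict[str, str], k: int = 2) -> list[str]:
    """Toy lexical retriever: rank documents by query-term overlap."""
    terms = set(query.lower().split())
    scored = sorted(
        corpus.items(),
        key=lambda kv: len(terms & set(kv[1].lower().split())),
        reverse=True,
    )
    return [doc for _, doc in scored[:k]]

def answer_factoid(query: str, corpus: dict[str, str]) -> str:
    """Retrieve-read: stuff the top-k documents into one generation call."""
    context = "\n".join(retrieve(query, corpus))
    return f"Answer({query!r}) grounded in: {context}"
```

The single pass is exactly what breaks for NFQs: there is no step where aspects are identified, separated, or balanced.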
Typed-RAG classifies NFQs into five types:
- Evidence-based: seeks definitions or characteristics of concepts. Single-aspect; standard retrieve-read works.
- Comparison: examines differences/similarities between targets. Multi-aspect; requires keyword extraction per comparison target, parallel retrieval, relevance-weighted aggregation.
- Experience: seeks advice from personal experience. Multi-aspect; requires experience-keyword extraction, similarity-based reranking, response aligned to question intent.
- Reason/Instruction: explains causes or procedures. Multi-aspect; decomposes into single-aspect sub-queries, individual retrieval and generation per sub-query, aggregation into structured response.
- Debate: explores multiple perspectives on a topic. Multi-aspect; extracts discussion topic and opposing opinions, generates per-opinion responses, debate mediator combines into balanced synthesis.
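The five-type taxonomy above maps naturally to a dispatch table. A minimal sketch, assuming an illustrative `NFQType` enum and `plan` helper (these names are not from the Typed-RAG paper); each strategy string paraphrases the per-type pipeline listed above.

```python
from enum import Enum

class NFQType(Enum):
    EVIDENCE = "evidence-based"
    COMPARISON = "comparison"
    EXPERIENCE = "experience"
    REASON_INSTRUCTION = "reason/instruction"
    DEBATE = "debate"

# Per-type pipelines, paraphrasing the strategies listed above.
STRATEGY = {
    NFQType.EVIDENCE: "single retrieve-read pass",
    NFQType.COMPARISON: "keywords per target -> parallel retrieval -> relevance-weighted aggregation",
    NFQType.EXPERIENCE: "experience keywords -> similarity reranking -> intent-aligned response",
    NFQType.REASON_INSTRUCTION: "decompose into sub-queries -> per-sub-query retrieve+generate -> structured aggregation",
    NFQType.DEBATE: "extract topic and opposing opinions -> per-opinion responses -> mediator synthesis",
}

def plan(qtype: NFQType) -> list[str]:
    """Expand a type's strategy string into an ordered list of pipeline stages."""
    return [stage.strip() for stage in STRATEGY[qtype].split("->")]
```

Only the evidence-based plan has a single stage; every other type expands into a multi-stage pipeline, which is the structural point of the taxonomy.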
The key insight: question type determines whether aspects are contrasting (high contrast, opposing directions: debate, comparison) or related (lower contrast, aligned direction: experience, reason/instruction). Contrasting aspects require distinct retrieval per aspect; related aspects allow shared retrieval with per-aspect filtering.
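The contrasting/related split can be expressed as a small router. A sketch under the assumption that type labels are already available as strings; the function name and return values are illustrative, not from the source.

```python
CONTRASTING = {"debate", "comparison"}            # aspects pull in opposing directions
RELATED = {"experience", "reason/instruction"}    # aspects point in an aligned direction

def retrieval_mode(qtype: str) -> str:
    """Route a classified NFQ type to a retrieval strategy:
    contrasting types get one retrieval pass per aspect,
    related types share one pass and filter per aspect."""
    if qtype in CONTRASTING:
        return "per-aspect retrieval"
    if qtype in RELATED:
        return "shared retrieval + per-aspect filtering"
    return "single-pass retrieve-read"  # evidence-based fallback
```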
Without type classification, RAG systems apply the same strategy to all queries. Evidence-based questions succeed because they fit standard RAG. The other types fail — not because retrieval is poor but because the generation architecture does not match the question structure.
Researchy Questions adds that real-world non-factoid questions involve "unknown unknowns" — the questioner doesn't know what information is missing. Characteristic formats include relationship questions ("how does X affect Y"), causal questions ("why does X happen"), comparative questions (pros/cons), and analytical questions ("to what extent does X lead to Y"). A good non-factoid question "can lead to interesting and in-depth analysis" with a "clear and refutable thesis, supported by evidence and analysis." The 8-dimension scoring rubric (ambiguity, incompleteness, assumptions, multi-facetedness, knowledge-intensity, subjectivity, reasoning-intensity, harmfulness) can inform question type classification beyond simple topic categories. Source: Arxiv/Agentic Research.
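The eight-dimension rubric could feed classification directly. A toy sketch, assuming scores normalized to 0-1 and illustrative thresholds; the `RubricScores` type, `suggest_type` heuristic, and all cutoffs are assumptions for illustration, not taken from Researchy Questions.

```python
from dataclasses import dataclass

@dataclass
class RubricScores:
    """The eight Researchy Questions dimensions, each scored 0-1
    (the 0-1 scale is an assumption for this sketch)."""
    ambiguity: float
    incompleteness: float
    assumptions: float
    multi_facetedness: float
    knowledge_intensity: float
    subjectivity: float
    reasoning_intensity: float
    harmfulness: float

def suggest_type(s: RubricScores) -> str:
    """Toy heuristic mapping rubric scores to an NFQ type;
    thresholds are illustrative assumptions, not from the paper."""
    if s.harmfulness > 0.5:
        return "filter"
    if s.subjectivity > 0.6 and s.multi_facetedness > 0.6:
        return "debate"
    if s.reasoning_intensity > 0.6:
        return "reason/instruction"
    if s.multi_facetedness > 0.6:
        return "comparison"
    return "evidence-based"
```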
Source: RAG
Related concepts in this collection
- Does medical AI need knowledge or reasoning more?
  Medical and mathematical domains may require fundamentally different AI training priorities. If medical accuracy depends primarily on factual knowledge while math depends on reasoning quality, should we build and evaluate these systems differently?
  Connection: question type parallels domain type: different queries have different structural requirements, not just different content requirements.
- How do readers track segments, purposes, and salience together?
  Can discourse processing actually happen in parallel rather than sequentially? This matters because understanding how readers coordinate multiple layers of meaning at once reveals where AI systems break down in comprehension.
  Connection: non-factoid responses require tracking multiple discourse segments (one per aspect) with different purposes (describe vs argue vs advise).
- How do logic units preserve procedural coherence better than chunks?
  Can structured retrieval units with prerequisites, headers, bodies, and linkers maintain step-by-step coherence in how-to answers where fixed-size chunks fail? This matters because procedural questions require sequential logic and conditional branching that chunk-based RAG cannot support.
  Connection: how-to questions are a specific NFQ type (reason/instruction) requiring procedural coherence: logic units' prerequisite-header-body-linker structure directly provides the sequential coherence that reason/instruction questions demand.
Original note title: non-factoid question answering requires question type classification because type determines retrieval and decomposition strategy