INQUIRING LINE

Which knowledge structure types best fit different query types?

This explores how the *shape* you give your knowledge — a table, a graph, a taxonomy, plain text chunks — should be matched to the *kind* of question being asked, rather than forcing everything through one retrieval pipeline.


The clearest statement of the idea in the corpus is StructRAG, which trains a router to pick among tables, graphs, algorithms, catalogues, and raw chunks depending on what the query demands. It grounds the choice in 'cognitive fit' theory: the right structure lowers the mental (and computational) load of reasoning, and the wrong one inflates it (see "Can routing queries to task-matched structures improve RAG reasoning?"). The headline isn't any single structure; it's that *matching* beats *uniform* retrieval.
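As a rough illustration of the routing idea, the sketch below picks a structure type from surface cues in the query. It is not StructRAG's router, which the source note describes as DPO-trained; the keyword cues, the `route` function, and the fallback to chunks are all assumptions invented for this example. Only the five structure names come from the note.

```python
from enum import Enum


class Structure(Enum):
    TABLE = "table"
    GRAPH = "graph"
    ALGORITHM = "algorithm"
    CATALOGUE = "catalogue"
    CHUNK = "chunk"


# Hypothetical surface cues; a trained router would learn its own signal.
CUES = {
    Structure.TABLE: ("compare", "versus", "highest", "lowest"),
    Structure.GRAPH: ("related to", "connected", "chain", "path"),
    Structure.ALGORITHM: ("steps", "procedure", "how do i"),
    Structure.CATALOGUE: ("list all", "overview", "summarize"),
}


def route(query: str) -> Structure:
    """Return the structure whose cues appear most often in the query."""
    q = query.lower()
    scores = {s: sum(cue in q for cue in cues) for s, cues in CUES.items()}
    best = max(scores, key=scores.get)
    # Assumed fallback: when no cue matches, use plain text chunks.
    return best if scores[best] > 0 else Structure.CHUNK
```

A keyword heuristic like this would misroute many real queries; it is only meant to make the "one router, several structures" shape concrete.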

What counts as the 'kind' of question? One note breaks non-factoid questions into five types and shows each wants a different handling: evidence-seeking questions are happy with standard RAG, while comparison and debate questions need aspect-by-aspect retrieval, and 'why' or experience questions need to be decomposed before retrieval even helps (see "Does question type determine the right retrieval strategy?"). So the routing question has two halves: classify the query, then pick the structure. The corpus has material on both.
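The second half of that split, going from a question type to a handling, can be sketched as a lookup table. The five type names follow the note's summary; the strategy labels, the `retrieval_queries` function, and the idea that aspects or sub-questions arrive from some upstream step (for example an LLM call) are assumptions made for illustration.

```python
from enum import Enum


class QuestionType(Enum):
    EVIDENCE = "evidence"
    COMPARISON = "comparison"
    DEBATE = "debate"
    REASON = "reason"
    EXPERIENCE = "experience"


# Hypothetical labels for the handling the note describes per type.
STRATEGY = {
    QuestionType.EVIDENCE: "standard_rag",
    QuestionType.COMPARISON: "aspect_retrieval",
    QuestionType.DEBATE: "aspect_retrieval",
    QuestionType.REASON: "decompose_then_retrieve",
    QuestionType.EXPERIENCE: "decompose_then_retrieve",
}


def retrieval_queries(question, qtype, parts=()):
    """Expand one question into the retrieval queries its type calls for.

    `parts` holds aspects (comparison/debate) or sub-questions
    (reason/experience) produced upstream; it is ignored for evidence.
    """
    strategy = STRATEGY[qtype]
    if strategy == "aspect_retrieval":
        # One query per aspect so every facet gets its own evidence.
        return [f"{question} (aspect: {p})" for p in parts] or [question]
    if strategy == "decompose_then_retrieve":
        # Retrieve for the sub-questions rather than the original wording.
        return list(parts) or [question]
    return [question]
```

For example, `retrieval_queries("Is remote work good?", QuestionType.DEBATE, ["productivity", "wellbeing"])` yields two aspect-tagged queries, while an evidence question passes through unchanged.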

On the structure side, graphs are the recurring answer for anything multi-hop or global. Knowledge-graph triples let small models reason transparently and cheaply by externalizing the steps (see "Can structuring reasoning as knowledge graphs help smaller models solve complex tasks?"); symbolic rules read off a graph's topology give an agent a navigational plan that plain semantic similarity can't (see "Can symbolic rules from knowledge graphs guide complex reasoning?"); and hierarchical multimodal graphs answer cross-chapter, whole-book questions that flat chunk retrieval cannot reach (see "Can multimodal knowledge graphs answer questions that flat retrieval cannot?"). But graphs aren't free. LogicRAG argues you don't need a pre-built corpus-wide graph at all: you can construct a small query-specific logic graph at inference time and avoid the staleness and build cost (see "Can query-time graph construction replace pre-built knowledge graphs?"). For training-time knowledge, the analog is taxonomy: StructTuning reaches 50% of full-corpus performance on 0.3% of the training data by teaching the model *where* a fact sits in a conceptual structure, the way a textbook does (see "Can organizing knowledge structures beat raw training data volume?").
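To make the query-time graph idea concrete, here is a minimal sketch that orders the sub-questions of one multi-hop query by their dependencies. It is not LogicRAG's implementation: the sub-questions and the `deps` mapping are hand-written stand-ins for whatever decomposition step a real system would run, and `retrieval_order` is a hypothetical helper. The only library call is Python's standard `graphlib.TopologicalSorter`.

```python
from graphlib import TopologicalSorter

# Hypothetical decomposition of one multi-hop query. Each sub-question maps
# to the set of sub-questions whose answers it needs first.
deps = {
    "Who directed the film?": set(),
    "Where was that director born?": {"Who directed the film?"},
    "What is the population of that city?": {"Where was that director born?"},
}


def retrieval_order(dependencies):
    """Return sub-questions in an order that respects their dependencies.

    The mapping must describe a directed acyclic graph; graphlib raises
    CycleError if it does not.
    """
    return list(TopologicalSorter(dependencies).static_order())


for step, sub_question in enumerate(retrieval_order(deps), start=1):
    print(step, sub_question)
```

Because the graph is built per query and discarded afterwards, there is no corpus-wide index to keep fresh, which is the trade-off the paragraph above describes.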

There's a subtler thread worth knowing: more structure isn't always better. Partial symbolic abstraction beats *full* formalization, because converting everything to logic throws away the semantic richness that natural language carries. The sweet spot is language enriched with selective symbolic scaffolding (see "Why does partial formalization outperform full symbolic logic?"). That's the same lesson as query-type routing, one level down: the goal is fit, not maximal formality.

Underneath all of this is *why* a single structure fails. Flat embedding retrieval has hard limits: embeddings measure association rather than task relevance, and the dimension of the vector caps which document sets are even representable, so compositional and global queries break in ways no amount of tuning fixes (see "Where do retrieval systems fail and why?"). That's the structural case for matching shape to query, and it pairs with the architectural one: separating query planning from answer synthesis, and coupling retrieval tightly to reasoning, is what lets multi-hop questions resolve at all (see "Do hierarchical retrieval architectures outperform flat ones on complex queries?" and "How should retrieval and reasoning integrate in RAG systems?"). The thing you didn't know you wanted to know: the choice of knowledge structure is really a choice about how much reasoning you're willing to push *out* of the model and *into* the data's shape.


Sources (11 notes)

Can routing queries to task-matched structures improve RAG reasoning?

StructRAG demonstrates that selecting knowledge structure type based on query demands—via DPO-trained router choosing among tables, graphs, algorithms, catalogues, and chunks—improves knowledge-intensive reasoning over standard retrieval. The approach grounds this in cognitive load and cognitive fit theory from cognitive science.

Does question type determine the right retrieval strategy?

Research shows non-factoid questions split into five types, each requiring different retrieval and aggregation methods. Evidence-based questions suit standard RAG, while debate and comparison need aspect-specific retrieval, and experience/reason questions need decomposition or filtering strategies.

Can structuring reasoning as knowledge graphs help smaller models solve complex tasks?

Knowledge Graph of Thoughts (KGoT) achieves 29% improvement on GAIA Level 3 tasks using GPT-4o mini by externalizing reasoning into iteratively constructed KG triples. The approach improves transparency, reduces bias, and enables quality control over reasoning steps.

Can symbolic rules from knowledge graphs guide complex reasoning?

SymAgent derives symbolic rules from KG structure using LLM reasoning to create navigational plans that align natural language with graph topology. This approach captures structural reasoning patterns explicitly, outperforming retrieval methods that rely on semantic similarity alone.

Can multimodal knowledge graphs answer questions that flat retrieval cannot?

MegaRAG builds hierarchical multimodal knowledge graphs from text and visuals to answer cross-chapter, global questions that flat chunk retrieval cannot reach. The hierarchy supports abstraction levels from high-level summaries to page-specific details while treating images as first-class graph nodes.

Can query-time graph construction replace pre-built knowledge graphs?

LogicRAG constructs directed acyclic graphs from queries at inference time rather than pre-building corpus-wide graphs, eliminating construction overhead, avoiding staleness, and enabling query-specific retrieval logic without sacrificing multi-hop reasoning capability.

Can organizing knowledge structures beat raw training data volume?

StructTuning achieves 50% of full-corpus performance using only 0.3% of training data by organizing chunks into auto-generated domain taxonomies. The model learns knowledge position within conceptual structures rather than raw text patterns, matching how students learn from textbooks.

Why does partial formalization outperform full symbolic logic?

QuaSAR and Logic-of-Thought both achieve 4-8% accuracy gains by enriching natural language with selective symbolic elements rather than replacing it. Full formalization loses semantic information; pure language lacks structure. Augmentation preserves both.

Where do retrieval systems fail and why?

RAG systems fail at three structural levels: adaptive triggering (fixed intervals waste context), semantic-task mismatch (embeddings measure association, not relevance), and mathematical limits (embedding dimension constrains representable document sets). These require fundamentally different retrieval approaches, not tuning.

Do hierarchical retrieval architectures outperform flat ones on complex queries?

Separating query planning from answer synthesis into distinct components reduces interference and improves multi-hop query performance. This architectural principle mirrors documented benefits of separating planning from execution in agent design.

How should retrieval and reasoning integrate in RAG systems?

Research shows that tight coupling between retrieval and reasoning—via Markov Decision Processes and step-level feedback—substantially improves accuracy and efficiency. Graph-based retrieval and metacognitive monitoring address limitations of vector embeddings and prevent retrieval failures on compositional tasks.

Next inquiring lines