You Don't Need Pre-built Graphs for RAG: Retrieval Augmented Generation with Adaptive Reasoning Structures
Large language models (LLMs) often suffer from hallucination, generating factually incorrect statements when handling questions beyond their knowledge and perception. Retrieval-augmented generation (RAG) addresses this by retrieving query-relevant contexts from knowledge bases to support LLM reasoning. Recent advances leverage pre-constructed graphs to capture the relational connections among distributed documents, showing remarkable performance in complex tasks. However, existing Graph-based RAG (GraphRAG) methods rely on a costly process to transform the corpus into a graph, introducing overwhelming token cost and update latency. Moreover, real-world queries vary in type and complexity, requiring different logic structures for accurate reasoning. The pre-built graph may not align with these required structures, resulting in ineffective knowledge retrieval. To this end, we propose a Logic-aware Retrieval-Augmented Generation framework (LogicRAG) that dynamically extracts reasoning structures at inference time to guide adaptive retrieval without any pre-built graph. LogicRAG begins by decomposing the input query into a set of subproblems and constructing a directed acyclic graph (DAG) to model the logical dependencies among them. To support coherent multi-step reasoning, LogicRAG then linearizes the graph using topological sort, so that subproblems can be addressed in a logically consistent order. In addition, LogicRAG applies graph pruning to reduce redundant retrieval and context pruning to filter irrelevant context, significantly reducing the overall token cost. Extensive experiments demonstrate that LogicRAG achieves both superior performance and efficiency compared to state-of-the-art baselines.
Real-world RAG systems often face significant challenges when handling large-scale, unstructured domain corpora (Peng et al. 2024; Zhang et al. 2025b; Su et al. 2025). Documents sourced from research papers, textbooks, or technical reports vary widely in reliability and completeness (Guo et al. 2025; Shen et al. 2025; Zhong et al. 2024; Wu et al. 2025), and the retrieved information is often complex and disorganized, as domain knowledge is typically scattered across multiple sources without clear dependencies (Sun et al. 2024; Ma et al. 2024; Hong et al. 2024). To manage this complexity, RAG systems commonly segment documents into smaller chunks for indexing (Borgeaud et al. 2022; Izacard et al. 2023; Jiang et al. 2023). This approach, however, sacrifices critical contextual information, leading to reduced retrieval accuracy and limited capability for complex reasoning tasks, particularly those requiring multi-hop reasoning across interconnected concepts.
To address this, recent advances (Zhang et al. 2024b; Procko and Ochoa 2024; Xiang et al. 2025; Bi et al. 2024c; Xiao et al. 2025a; Li et al. 2023a; Zhou et al. 2025b) leverage pre-constructed graphs to capture the relational connections among distributed documents, showing remarkable performance in complex tasks.
Despite recent advances, GraphRAG systems still face critical limitations in real-world scenarios. (i) Efficiency issues. Existing GraphRAG models rely on a costly process to transform the corpus into a graph, introducing overwhelming token cost and update latency as shown in Figure 1. This makes it hard to generalize to practical scenarios where knowledge bases are large-scale or dynamically evolving (Edge et al. 2024). (ii) Low quality of the pre-built graph. Existing methods leverage LLMs to automatically build the graph without any guidance, which may introduce irrelevant or redundant information, leading to inefficiencies in both retrieval and reasoning (Guo et al. 2024). (iii) Lack of flexibility. Real-world queries vary in type and complexity, requiring different logic structures for accurate reasoning (Peng et al. 2024). The pre-built graph may not align with these required structures, resulting in ineffective knowledge retrieval. These challenges highlight the need for a more adaptive and efficient approach.
To this end, we propose LogicRAG, which dynamically extracts reasoning structures at inference time to guide adaptive retrieval without any pre-constructed graph. Specifically, LogicRAG begins by decomposing the input query into a set of subproblems and constructing a directed acyclic graph (DAG) to model the logical dependencies among them. This structured representation enables adaptive planning of the retrieval process by identifying which evidence chunks are logically connected to each subproblem. To support coherent multi-step reasoning, LogicRAG then linearizes the graph using topological sort, so that subproblems can be addressed in a logically consistent order. To further improve efficiency without compromising performance, the model applies graph pruning to reduce redundant retrieval and uses context pruning to filter irrelevant context, significantly reducing the overall token cost.
To handle complex queries in RAG, we propose a structured inference framework that decomposes complex queries into interdependent subproblems and resolves them via a logic-guided retrieval and generation process. The core of our method is the construction and utilization of a Query Logic Dependency Graph, a directed acyclic graph (DAG) that models the logical structure underlying the query. Each node in the DAG represents a subproblem, while edges encode the directional dependencies required for reasoning.
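The Query Logic Dependency Graph can be sketched as a small data structure. This is a minimal illustration, not the paper's implementation: the `Subproblem` and `LogicDAG` names are our own, and the hard-coded two-step decomposition stands in for the LLM decomposition prompt a real system would use.

```python
from dataclasses import dataclass, field

@dataclass
class Subproblem:
    sid: int
    text: str

@dataclass
class LogicDAG:
    """Minimal Query Logic Dependency Graph: nodes are subproblems; an
    edge (u, v) means u must be resolved before v."""
    nodes: dict = field(default_factory=dict)   # sid -> Subproblem
    edges: dict = field(default_factory=dict)   # sid -> set of dependent sids

    def add_node(self, sp: Subproblem) -> None:
        self.nodes[sp.sid] = sp
        self.edges.setdefault(sp.sid, set())

    def add_edge(self, src: int, dst: int) -> None:
        # src is a logical prerequisite of dst
        self.edges[src].add(dst)

# Hypothetical decomposition of a multi-hop query; in LogicRAG this
# structure would come from an LLM decomposition step.
dag = LogicDAG()
q1 = Subproblem(0, "Who directed the film Inception?")
q2 = Subproblem(1, "What other films has that director made?")
for sp in (q1, q2):
    dag.add_node(sp)
dag.add_edge(0, 1)   # q2 depends on q1's answer
```

Keeping edges directional (prerequisite to dependent) makes the later topological-sort stage a direct traversal of this structure.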
The framework operates in three sequential stages. First, the input query is decomposed into subproblems, and a DAG is constructed to capture their logical relationships. This graph is dynamically adapted during inference to reflect evolving retrieval needs. Second, the DAG is topologically sorted to produce a linear execution order that respects the dependencies among subproblems. Each subproblem is then resolved in a greedy, forward-pass manner, wherein retrieval is conditioned on the outputs of previously resolved subproblems.
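The second stage, linearization followed by greedy forward-pass resolution, can be sketched with the standard-library topological sorter. The `retrieve` and `answer` stubs are placeholders for the dense retriever and LLM calls; the three-node dependency map is a hypothetical decomposition, not one from the paper.

```python
from graphlib import TopologicalSorter

# Dependencies as {subproblem: set of prerequisites}: a hypothetical
# decomposition where "c" needs the answers to both "a" and "b".
deps = {"a": set(), "b": {"a"}, "c": {"a", "b"}}

def retrieve(subproblem: str, resolved: dict) -> str:
    # Placeholder retriever; a real system would query an index,
    # conditioning the query on answers to already-resolved subproblems.
    return f"evidence({subproblem} | {sorted(resolved)})"

def answer(subproblem: str, evidence: str) -> str:
    # Placeholder generator standing in for an LLM call.
    return f"ans({subproblem})"

# Topological sort yields an order that respects all dependencies, so
# each subproblem sees the answers of its prerequisites.
resolved = {}
for sp in TopologicalSorter(deps).static_order():
    evidence = retrieve(sp, resolved)
    resolved[sp] = answer(sp, evidence)
```

Because each subproblem is visited exactly once in dependency order, retrieval never has to revisit an earlier node, which is what avoids the recursive dependencies mentioned above.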
This process ensures context-aware retrieval and avoids recursive dependencies that hinder efficiency. Finally, to enhance scalability, we apply a two-dimensional pruning strategy that reduces context redundancy and merges semantically similar subproblems. Context pruning uses LLM-based summarization to maintain a rolling memory of relevant information, while graph pruning consolidates loosely coupled subproblems for unified resolution. This logic-aware RAG pipeline transforms the traditionally flat retrieval paradigm into a dependency-sensitive inference mechanism. By aligning retrieval operations with the query’s internal reasoning structure, the framework enables efficient, accurate, and scalable multi-step reasoning over complex information needs.
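The two pruning dimensions can be sketched as follows. This is a simplified stand-in under stated assumptions: token-overlap (Jaccard) similarity substitutes for the embedding similarity a real system would use, and simple truncation substitutes for the LLM-based summarization that maintains the rolling memory; the function names are ours.

```python
def jaccard(a: str, b: str) -> float:
    """Token-overlap proxy for semantic similarity between subproblems."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb)

def graph_prune(subproblems: list, threshold: float = 0.6) -> list:
    """Greedily group subproblems whose similarity exceeds the threshold,
    so one retrieval call can resolve each merged group."""
    merged: list[list[str]] = []
    for sp in subproblems:
        for group in merged:
            if jaccard(sp, group[0]) >= threshold:
                group.append(sp)
                break
        else:
            merged.append([sp])
    return merged

def context_prune(memory: list, new_context: str, max_items: int = 3) -> list:
    """Rolling memory of evidence: keep only the most recent snippets.
    LogicRAG summarizes with an LLM here; truncation is a stand-in."""
    return (memory + [new_context])[-max_items:]
```

Both prunes cut token cost at different points: graph pruning shrinks the number of retrieval calls, while context pruning bounds the context carried between steps.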