LongRAG: A Dual-Perspective Retrieval-Augmented Generation Paradigm for Long-Context Question Answering
Long-context question answering (LCQA) (Caciularu et al., 2022), which has recently been advanced significantly by LLMs, is a complex task that requires reasoning over a long document, or multiple documents, to provide accurate answers to questions.
However, as shown in Figure 1, LLMs frequently encounter the “lost in the middle” issue (Liu et al., 2024): when the relevant context lies in the middle of the document, rather than at the beginning or end, they are prone to sub-optimal or even incorrect responses.
First, as depicted in Figure 1, Vanilla RAG retrieves only "Griffin" as the performer of "I’ll say it" but misses the university from which "Griffin" graduated; although the university is mentioned in the same paragraph, the system ultimately produces an incorrect response. Second, the low evidence density of long-context documents can degrade retrieval quality: the considerable noise they contain impairs LLMs’ capacity to accurately identify key information (factual details), resulting in the retrieval of low-quality chunks and ultimately in erroneous answers (Zhang et al., 2023; Chen et al., 2024).

Recently, several advanced RAG systems have attempted to mitigate these issues. Self-RAG (Asai et al., 2023) employs self-reflection tokens to facilitate the autonomous exploration of global information in a corpus; however, its reliance on the accuracy of those reflection tokens may lead to the deletion of valid retrieved chunks that contain factual details. CRAG (Yan et al., 2024) evaluates the question relevance of each chunk individually to enhance the identification of factual details.
We propose LongRAG, a general, dual-perspective, and robust RAG paradigm that effectively addresses the above issues for LCQA. It comprises four plug-and-play components, each with multiple strategies: a hybrid retriever, an LLM-augmented information extractor, a CoT-guided filter, and an LLM-augmented generator. LongRAG enhances the RAG system’s ability both to mine global long-context information and to identify factual details. Specifically, the information extractor employs a mapping strategy to orderly extend the semantic space of the retrieved chunks into a higher-dimensional long-context semantic space, and then refines the global information and contextual structure among chunks.
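To make the dual-perspective flow concrete, the following is a minimal, hypothetical sketch of the four-stage pipeline named above. All function bodies are illustrative stand-ins (naive word-overlap scoring in place of real retrievers and LLM calls), not the authors' implementation; only the component names and the chunk-to-source mapping idea come from the text.

```python
# Hypothetical sketch of the LongRAG four-stage pipeline.
# Retrieval and filtering are approximated with word overlap;
# a real system would use sparse+dense retrieval and LLM prompting.
from dataclasses import dataclass

@dataclass
class Chunk:
    text: str
    doc_id: int    # which source document the chunk came from
    position: int  # order of the chunk within that document

def hybrid_retrieve(question, corpus, k=4):
    # Stand-in for the hybrid retriever: rank chunks by naive
    # word overlap with the question.
    q = set(question.lower().split())
    score = lambda c: len(q & set(c.text.lower().split()))
    return sorted(corpus, key=score, reverse=True)[:k]

def extract_global_info(retrieved, corpus):
    # Mapping strategy (one plausible reading): map each retrieved
    # chunk back to its source document and re-assemble the chunks
    # of those documents in order, so the extractor sees global
    # context rather than isolated snippets.
    doc_ids = {c.doc_id for c in retrieved}
    extended = sorted(
        (c for c in corpus if c.doc_id in doc_ids),
        key=lambda c: (c.doc_id, c.position),
    )
    return " ".join(c.text for c in extended)

def cot_filter(question, chunks):
    # Stand-in for the CoT-guided filter: keep only chunks judged
    # relevant; here approximated by requiring any word overlap.
    q = set(question.lower().split())
    return [c for c in chunks if q & set(c.text.lower().split())]

def generate(question, global_info, details):
    # Stand-in for the LLM-augmented generator: combine the global
    # view with the filtered factual details into one prompt/answer.
    evidence = " ".join(c.text for c in details)
    return f"Q: {question}\nGlobal: {global_info}\nDetails: {evidence}"

def longrag_answer(question, corpus):
    retrieved = hybrid_retrieve(question, corpus)
    global_info = extract_global_info(retrieved, corpus)
    details = cot_filter(question, retrieved)
    return generate(question, global_info, details)
```

The key design point the sketch illustrates is the dual perspective: `extract_global_info` widens each hit back into its ordered source context (global structure), while `cot_filter` narrows the same hits down to question-relevant factual details; the generator then consumes both views.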