RichRAG: Crafting Rich Responses for Multi-faceted Queries in Retrieval-Augmented Generation

Paper · arXiv 2406.12566 · Published June 18, 2024

Existing studies mostly focus on question scenarios with clear user intents and concise answers. However, users often issue broad, open-ended queries with diverse sub-intents, for which they desire rich, long-form answers covering multiple relevant aspects. To tackle this important yet underexplored problem, we propose a novel RAG framework named RichRAG. It includes a sub-aspect explorer to identify potential sub-aspects of input questions, a multi-faceted retriever to build a candidate pool of diverse external documents related to these sub-aspects, and a generative list-wise ranker, a key module that provides the top-k most valuable documents to the final generator. These ranked documents sufficiently cover the various query aspects and reflect the generator's preferences, thereby incentivizing it to produce rich and comprehensive responses for users. The ranker is trained in two stages: a supervised fine-tuning stage that ensures basic coverage of documents, and a reinforcement learning stage that aligns the ranking of documents with the downstream LLM's preferences.
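The three-stage pipeline can be sketched as follows. This is an illustrative toy, not the paper's implementation: every function (`explore_sub_aspects`, `retrieve_candidates`, `rich_rag`) is a hypothetical stand-in for a learned component, with word overlap substituting for a trained retriever and simple truncation and concatenation substituting for the ranker and generator.

```python
def explore_sub_aspects(query):
    # Sub-aspect explorer stand-in: decompose a broad query into
    # sub-intents. A real system would prompt an LLM; this stub uses
    # a fixed mapping for illustration.
    mapping = {
        "rapping": ["origins of rapping",
                    "characteristics of rapping",
                    "evolution of rapping"],
    }
    return mapping.get(query, [query])

def retrieve_candidates(sub_aspects, corpus, per_aspect=2):
    # Multi-faceted retriever stand-in: pool candidates retrieved per
    # sub-aspect; word overlap substitutes for a dense retriever.
    pool = []
    for aspect in sub_aspects:
        words = set(aspect.lower().split())
        scored = sorted(corpus,
                        key=lambda d: -len(words & set(d.lower().split())))
        for doc in scored[:per_aspect]:
            if doc not in pool:
                pool.append(doc)
    return pool

def rich_rag(query, corpus, k=3):
    # End-to-end flow: explore sub-aspects, retrieve a diverse pool,
    # then rank and generate (both reduced to trivial stand-ins here).
    sub_aspects = explore_sub_aspects(query)
    candidates = retrieve_candidates(sub_aspects, corpus)
    top_k = candidates[:k]       # list-wise ranker stand-in
    return " ".join(top_k)       # generator stand-in: stitch references
```

The point of the sketch is the data flow: the candidate pool is built per sub-aspect, so documents relevant to different facets of the query all reach the ranking stage.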

Though some advanced techniques (Jiang et al., 2023; Asai et al., 2024; Wang et al., 2023c; Li et al., 2024) have been proposed, existing studies primarily focus on specific questions that require concise and definitive answers. However, user intents are often complex and multi-faceted, necessitating rich and comprehensive answers. As Figure 1 shows, when a user inquires about rapping-related information, a rich response covering various aspects of rapping, such as its origins, characteristics, and evolution, leads to a more satisfying user experience than a superficial description.

Furthermore, we argue that a promising top-k ranking should have the following desirable features: (1) Comprehensiveness. Incentivizing the LLM to generate rich and reliable responses requires the external documents to comprehensively cover the various query aspects. The ranking module must therefore model relationships among documents so as to maximize the coverage of the entire reference list. (2) Alignment with the LLM's preferences. In RAG systems, the consumers of IR results are LLMs rather than humans. The reference order should therefore be LLM-friendly, helping the generator produce satisfying responses.
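To make the comprehensiveness criterion concrete, one simple baseline is greedy set-cover-style selection: pick, at each step, the document that adds the most not-yet-covered sub-aspects. This is only an illustrative heuristic for what "maximizing coverage of the reference list" means; RichRAG itself learns the ranking with a generative model rather than a fixed rule.

```python
def greedy_coverage_ranking(candidates, k):
    """Greedily select k documents to maximize sub-aspect coverage.

    `candidates` maps a document ID to the set of sub-aspects that
    document covers. Illustrative baseline only, not the paper's
    learned ranker.
    """
    covered, ranking = set(), []
    remaining = dict(candidates)
    for _ in range(min(k, len(remaining))):
        # Choose the document contributing the most new sub-aspects.
        best = max(remaining, key=lambda d: len(remaining[d] - covered))
        ranking.append(best)
        covered |= remaining.pop(best)
    return ranking, covered
```

Note that this objective explicitly depends on relationships among documents: a document's value is its marginal coverage given what is already selected, not its standalone relevance.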

To achieve this, we devise a generative list-wise ranker based on an encoder-decoder architecture. It takes as input the user query, its identified sub-aspects, and all candidate documents, then directly generates the top-k document IDs as the final ranking list. This structure offers two key advantages: (1) Global Document Modeling. The sequence-to-sequence structure equips the ranker to effectively model global interactions among candidate documents, queries, and sub-aspects, thereby capturing the overall utility of the generated ranking list in covering the query's multiple aspects. (2) Efficiency. Following the FiD structure (Izacard and Grave, 2021), we parallelize the encoding of each candidate and further introduce pooling and reuse operations in the decoder module.
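The encode-pool-decode flow can be sketched in miniature. Assumptions to be clear about: the "encoder" below is a bag-of-words stand-in (each candidate is encoded independently, which is what makes FiD-style encoding parallelizable), and the "decoder" emits document IDs one at a time by scoring candidates against the query encoding; a trained decoder would also condition on the IDs generated so far and on the sub-aspects.

```python
def build_vocab(texts):
    # Shared vocabulary so all encodings live in the same vector space.
    return sorted({tok for t in texts for tok in t.lower().split()})

def encode(text, vocab):
    # Encoder stand-in: bag-of-words counts. Each call is independent
    # of the other candidates, so encoding parallelizes as in FiD.
    toks = text.lower().split()
    return [float(toks.count(w)) for w in vocab]

def decode_ranking(query_vec, cand_vecs, k):
    # Decoder stand-in: autoregressively emit k document IDs, scoring
    # the remaining candidates against the query encoding each step.
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    remaining = dict(cand_vecs)
    ranking = []
    for _ in range(min(k, len(remaining))):
        best = max(remaining, key=lambda d: dot(query_vec, remaining[d]))
        ranking.append(best)
        remaining.pop(best)
    return ranking
```

Generating IDs rather than scoring each document in isolation is what makes the ranker list-wise: the output is a single sequence whose utility can be judged (and rewarded, in the RL stage) as a whole.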