A Hybrid RAG System with Comprehensive Enhancement on Complex Reasoning

Paper · arXiv 2408.05141 · Published August 9, 2024
Tags: RAG · Knowledge Graphs · Domain Specialization

We introduce a hybrid RAG system enhanced through a comprehensive suite of optimizations that significantly improve retrieval quality, augment reasoning capabilities, and refine numerical computation ability. We refined the text chunks and tables in web pages, added attribute predictors to reduce hallucinations, built an LLM Knowledge Extractor and a Knowledge Graph Extractor, and finally designed a reasoning strategy that draws on all the references. We evaluated our system on the CRAG dataset through the Meta CRAG KDD Cup 2024 Competition. Both local and online evaluations demonstrate that our system significantly enhances complex reasoning capabilities.

A basic RAG system consists of two parts: a retriever and a generator. Relevant text documents are extracted from external knowledge sources and used as conditioning inputs alongside the query during generation [20]. The retriever computes the similarity between the user query and external factual knowledge using metrics such as cosine similarity. The most relevant passages are extracted from external databases and combined with the input to the generator. This process enables a general LLM to acquire domain-specific knowledge from a corresponding domain database without sacrificing its generalization capabilities. Additionally, by combining retrieved facts with input queries, the hallucination problem is mitigated, leading to more accurate and informed responses. Furthermore, by maintaining an up-to-date database, time-dependent information can be seamlessly integrated into the LLM.
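The retrieval step described above can be sketched as a top-k cosine-similarity search. This is a minimal illustration, not the paper's implementation: the embeddings below are toy 2-d vectors, and the function name `cosine_top_k` is our own.

```python
import numpy as np

def cosine_top_k(query_vec, doc_vecs, docs, k=3):
    """Return the k documents whose embeddings are most similar to the query."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    scores = d @ q                      # cosine similarity of each doc with the query
    top = np.argsort(scores)[::-1][:k]  # indices of the k highest scores
    return [docs[i] for i in top]

docs = ["Paris is the capital of France.",
        "The Nile is a river in Africa.",
        "France is in Europe."]
# Toy 2-d "embeddings" for illustration only; a real system would use a sentence encoder.
doc_vecs = np.array([[0.9, 0.1], [0.1, 0.9], [0.8, 0.3]])
query_vec = np.array([1.0, 0.0])
retrieved = cosine_top_k(query_vec, doc_vecs, docs, k=2)
```

The retrieved passages are then prepended to the query before it is handed to the generator.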

After all the previously introduced processing steps, we obtain text chunks, tables, triplets from the KG, and knowledge from the LLM's weights as references. We carefully designed a prompt template that instructs the LLM to reason over all these references and produce the final answer. We control the reasoning process with an output-format demonstration and zero-shot CoT, which is useful for multi-hop questions. Leveraging the strong instruction-following capabilities of Llama3-70B-Instruct, we can reliably steer the reasoning process. We designed several rules to constrain the reasoning path and output format, including that the output should be precise, and we guide the model's reasoning by asking intermediate questions in the prompt.

For the domain attribute, we perform well in areas such as movies, music, and open topics, but our performance is lacking in finance and sports. This is because these two domains require the model to answer with dynamic information that changes over time. To prevent hallucinations, our system opts to refuse to answer such queries. The results for the dynamism attribute reach the same conclusion: as the dynamism of the question increases, the effectiveness of our system gradually declines.

These behaviors are driven by dedicated system prompts. The static/dynamic question classifier uses:

  You will be provided with a question. Your task is to identify whether this question is a static question or a dynamic question. A static question is one whose answer is fixed and will not change over time. A dynamic question is one whose answer will change over time or needs time information. You MUST choose from one of the following choices: ["static", "dynamic"]. You MUST give the question type succinctly, using the fewest words possible.
  Here are some examples:

The answer-generation prompt enforces reasoning before the answer:

  You are provided with a question. Your task is to answer the question with your reasoning process. If you can't answer it directly based on your knowledge, respond with 'I don't know'. If you think the premise of the question is wrong, for example, the question asks information about a person's husband, but you are sure that the person doesn't have one, you should answer with "Invalid question" without any other words. You MUST think if the question has a false premise, then think the final answer. You MUST generate the reasoning process before the answer. You MUST generate your output with the following format:
  ===START===
  ## Reasoning:
  - Does it have a false premise? YOUR REASONING
  - What is the final answer?
  ------
  ## Answer: YOUR FINAL ANSWER
  ===END===
  IMPORTANT RULES:
  - If you can't answer it directly based on your knowledge, respond with 'I don't know'.
  - Your generation MUST start with "===START===" and end with "===END===".
  - YOUR FINAL ANSWER should be succinct, and use as few words as possible.
  - YOUR REASONING should be a detailed reasoning process that explains how you arrived at your answer.
  - If you think the premise of the question is wrong, for example, the question asks information about a person's husband, but you are sure that the person doesn't have one, you should answer with "Invalid question" without any other words.
  Let's think step by step now!

The function-calling prompt for the knowledge graph is:

  You are a helpful assistant in function calling. I have a knowledge graph and a set of functions that can be called. You will be given a question and the query time. Your task is to generate several function calls that can help me answer the question. Here are the functions and their descriptions: {TOOLS}
  Remember your rules:

  1. You MUST follow the function signature.
  2. You MUST output JSON that can be read by json.loads. Return an empty list if no useful function calls can be found.
  3. For each function call, you should output its function name and corresponding arguments. Here are examples:
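On the consuming side, the rules above imply a parser that accepts the model's JSON and falls back to an empty list on malformed output. A minimal sketch, assuming a `function_name`/`arguments` shape for each call (the exact field names are our assumption, not stated in the source):

```python
import json

def parse_function_calls(model_output: str) -> list:
    """Parse the model's JSON output into a list of function-call dicts.
    Returns an empty list when the output is not valid JSON or not a list (rule 2)."""
    try:
        calls = json.loads(model_output)
    except json.JSONDecodeError:
        return []
    return calls if isinstance(calls, list) else []

# Hypothetical model output following the described format.
output = '[{"function_name": "get_price", "arguments": {"ticker": "AAPL"}}]'
calls = parse_function_calls(output)
```

Each parsed call can then be dispatched to the corresponding knowledge-graph API.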