Metacognitive Retrieval-Augmented Large Language Models
Retrieval-augmented generation has become central in natural language processing due to its efficacy in generating factual content. While traditional methods employ single-time retrieval, more recent approaches have shifted toward multi-time retrieval for multi-hop reasoning tasks. However, these strategies are bound to predefined reasoning steps, potentially leading to inaccuracies in response generation. This paper introduces MetaRAG, an approach that combines the retrieval-augmented generation process with metacognition. Drawing from cognitive psychology, metacognition allows an entity to self-reflect and critically evaluate its cognitive processes. By integrating this capability, MetaRAG enables the model to monitor, evaluate, and plan its response strategies, enhancing its introspective reasoning abilities. Through a three-step metacognitive regulation pipeline, the model can identify inadequacies in its initial cognitive responses and fix them.
Although previous methods have made strides in improving the quality of generated answers, they strictly adhere to predefined reasoning steps across all questions. Such inflexible approaches cannot diagnose specific errors in their responses and consequently lack mechanisms to improve their performance. We argue that this limitation might stem from the model's lack of awareness of its own reasoning processes. When humans confront complex issues, they often reflect on their thought patterns, gradually adjusting and optimizing their strategies. This ability comes from our innate metacognition, which enables introspection, self-assessment, and self-regulation. Inspired by this, we aim to integrate metacognitive ability into large language models (LLMs) to enhance retrieval-augmented generation (RAG). With this approach, the model can identify its own inaccuracies and dynamically adjust its reasoning strategies, leading to more precise answer generation.
We introduce the Metacognitive Retrieval-Augmented Generation framework (MetaRAG). As illustrated in Figure 1(b), MetaRAG features a “cognition-metacognition” collaborative framework. The cognition component is responsible for deriving answers from the given question and references, while the metacognitive component, acting as a critic model, probes for potential mistakes in the reasoning process. An analysis of model performance under different knowledge conditions (detailed in Sec. 3.2) reveals three main reasons why the model fails to infer the correct answer: insufficient knowledge, conflicting knowledge, and erroneous reasoning. Endowed with this metacognitive mechanism, the model is expected to be aware of its own cognitive process in RAG tasks from two aspects: (1) the sufficiency and harmonization of externally retrieved knowledge and the LLM's intrinsic knowledge, and (2) the reliability and accuracy of multi-hop reasoning. The model can thereby identify issues in knowledge integration and answer reasoning, enabling targeted improvements.
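To make the collaboration between the two components concrete, the loop below is a minimal sketch of a monitor-evaluate-plan cycle wrapped around a cognition step. All function names and the toy hop-based task are illustrative assumptions, not the paper's actual implementation; in MetaRAG each step would be realized by LLM prompting over real retrieved passages.

```python
# Hypothetical sketch of a cognition-metacognition loop (illustrative only;
# component names and the toy task are assumptions, not MetaRAG's actual code).

def cognition(question, references):
    """Cognition: derive an answer from the question and retrieved references."""
    # Toy stand-in: the answer is recoverable only if every reasoning hop
    # is covered by the references.
    if all(hop in references for hop in question["hops"]):
        return question["gold"]
    return "unknown"

def monitor(answer):
    """Monitoring: decide whether metacognitive regulation is needed."""
    return answer == "unknown"

def evaluate(question, references):
    """Evaluating: diagnose the failure cause (here: insufficient knowledge,
    i.e. which reasoning hops lack supporting evidence)."""
    return [hop for hop in question["hops"] if hop not in references]

def plan(references, missing, knowledge_base):
    """Planning: retrieve the missing evidence before retrying cognition."""
    return references + [hop for hop in missing if hop in knowledge_base]

def metarag(question, references, knowledge_base, max_rounds=3):
    for _ in range(max_rounds):
        answer = cognition(question, references)
        if not monitor(answer):  # cognition looks adequate: accept the answer
            return answer
        missing = evaluate(question, references)
        references = plan(references, missing, knowledge_base)
    return cognition(question, references)

question = {"hops": ["born_in", "capital_of"], "gold": "Paris"}
print(metarag(question, references=["born_in"],
              knowledge_base=["born_in", "capital_of"]))  # → Paris
```

In this toy run, the first cognition pass fails (only one hop is supported), monitoring flags the response, evaluating identifies the missing hop, and planning retrieves it so the second pass succeeds. A real system would also need branches for the other two diagnosed failure causes, conflicting knowledge and erroneous reasoning.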