Understanding Before Reasoning: Enhancing Chain-of-Thought with Iterative Summarization Pre-Prompting

Paper · arXiv 2501.04341 · Published January 8, 2025
Tags: Reading & Summarizing · Self-Refinement · Self-Consistency · Feedback

CoT encounters difficulties when key information required for the reasoning process is implicit or missing. This limitation stems primarily from the fact that CoT emphasizes the stages of reasoning while neglecting the critical early-stage task of gathering and extracting essential core information. In this paper, we propose a pre-prompting methodology called Iterative Summarization Pre-Prompting (ISP2), which effectively refines the reasoning ability of LLMs when key information is not explicitly presented. First, entities and their corresponding descriptions are extracted from the question to form potential key information pairs. Next, we introduce a reliability rating to assess these information pairs. Then, the two information pairs ranked lowest by the reliability rating are merged into a new potential information pair, consisting of a new entity and its corresponding description. This process is applied iteratively until a single key information pair remains. Finally, the resulting key information pair, together with the original question, is fed into the LLM for reasoning, producing the final answer.

ISP2 coordinates three key LLM steps: adaptive extraction of candidate information, reliability rating of information pairs, and iterative summarization for knowledge understanding. These steps summarize and integrate relevant information and formulate strategies before tackling intricate real-world reasoning tasks. By engaging in these pre-prompting steps, LLMs can better explore and understand the nuances of complex problems, thereby improving their ability to perform sophisticated reasoning.
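To make the pipeline concrete, the following is a minimal Python sketch of the three steps, assuming a generic `llm(prompt) -> str` completion call; all function names and prompt wordings here are illustrative assumptions, not the paper's released code.

```python
# Illustrative sketch of the ISP2 pre-prompting loop described above.
# `llm` is a placeholder for any text-completion API; prompts are paraphrased.

def llm(prompt: str) -> str:
    """Placeholder for a chat/completion API call."""
    raise NotImplementedError

def extract_pairs(question: str) -> list[tuple[str, str]]:
    """Step 1: adaptive extraction of (entity, description) pairs."""
    reply = llm(f"List the entities in the question and a short description "
                f"of each, one 'entity: description' per line.\n\n{question}")
    pairs = []
    for line in reply.splitlines():
        if ":" in line:
            entity, desc = line.split(":", 1)
            pairs.append((entity.strip(), desc.strip()))
    return pairs

def rate_reliability(question: str, pair: tuple[str, str]) -> float:
    """Step 2: ask the model to score how reliable an information pair is."""
    reply = llm(f"Question: {question}\nInformation: {pair[0]} - {pair[1]}\n"
                f"Rate the reliability of this information from 0 to 1. "
                f"Reply with a number only.")
    try:
        return float(reply.strip())
    except ValueError:
        return 0.0  # treat unparsable ratings as least reliable

def merge_pairs(a: tuple[str, str], b: tuple[str, str]) -> tuple[str, str]:
    """Step 3: summarize the two least reliable pairs into one new pair."""
    reply = llm(f"Combine these two pieces of information into a single new "
                f"'entity: description' line:\n"
                f"{a[0]}: {a[1]}\n{b[0]}: {b[1]}")
    entity, _, desc = reply.partition(":")
    return entity.strip(), desc.strip()

def isp2(question: str) -> str:
    pairs = extract_pairs(question)
    # Iteratively merge the two lowest-rated pairs until one pair remains.
    while len(pairs) > 1:
        pairs.sort(key=lambda p: rate_reliability(question, p))
        merged = merge_pairs(pairs[0], pairs[1])
        pairs = pairs[2:] + [merged]
    key_entity, key_desc = pairs[0]
    # Final step: feed the key information pair plus the question to the LLM.
    return llm(f"Key information: {key_entity} - {key_desc}\n"
               f"Question: {question}\nLet's think step by step.")
```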

2.1 Chain-of-Thought Prompting

Wei et al. (2022) introduce the concept of Chain-of-Thought (CoT) reasoning, emphasizing the importance of deriving conclusive answers through multi-step logical pathways. The method demonstrates that reasoning abilities can be elicited through a series of thoughtful steps. Kojima et al. (2022) discover that simply adding the phrase "let's think step by step" to prompts allows LLMs to perform zero-shot logical reasoning without any additional human-crafted prompts. Subsequently, Wang et al. (2023) introduce Self-Consistency (SC) to replace the greedy decoding strategy. Zhang et al. (2023) construct an automatic CoT framework that builds demonstrations from the problem itself, eliminating the instability of manual prompts. Fu et al. (2023) employ complexity-based multi-step reasoning estimation to execute CoT. Yao et al. (2024) propose Tree-of-Thoughts (ToT), which introduces deliberation into decision-making by considering multiple reasoning paths. Xu et al. (2024) enhance the model's understanding by re-reading the question. These studies underscore the importance of CoT in enhancing the reasoning and planning capabilities of LLMs in complex scenarios. Nevertheless, CoT still requires further refinement for scenarios involving more complex problems.
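As a concrete illustration of the zero-shot trigger from Kojima et al. (2022), the entire intervention is a one-line suffix appended to the question; the arithmetic question below is a made-up example, and `llm` is the placeholder from the sketch above.

```python
# Zero-shot CoT (Kojima et al. 2022): a single trigger phrase appended to the
# question elicits step-by-step reasoning without any demonstrations.
question = ("A store had 120 apples. It sold 45 in the morning and 30 in the "
            "afternoon. How many apples are left?")
prompt = f"Q: {question}\nA: Let's think step by step."
# response = llm(prompt)  # the model emits intermediate steps, then the answer
```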

2.2 In-Context Learning

In-context learning (ICL) enables LLMs to make predictions based on input examples without updating model parameters. Brown et al. (2020) introduce this concept with GPT-3, demonstrating that LLMs can generalize to tasks from a small number of examples embedded in the input context. Min et al. (2022) propose Meta-training for In-Context Learning (MetaICL), which significantly enhances ICL capabilities through continued training on various tasks using demonstrations. Additionally, the concept of supervised context training (Chen et al. 2022) is proposed to bridge the gap between pre-training and downstream ICL tasks. LLMs refine their prior knowledge through ICL, thereby improving performance across multiple tasks (Krishnamurthy et al. 2024). ICL allows a single model to perform various tasks universally, helping it better align its predictions with the semantic requirements of the prompts.
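A minimal illustration of few-shot ICL, using a made-up sentiment-classification task and the same `llm` placeholder as above: the task is specified purely through demonstrations in the prompt, with no parameter updates.

```python
# Few-shot ICL: the model infers the labeling task from the demonstrations
# embedded in the context; reviews and labels here are invented examples.
demonstrations = [
    ("great movie, loved every minute", "positive"),
    ("a complete waste of two hours", "negative"),
]
query = "the plot was thin but the acting saved it"
prompt = "\n\n".join(f"Review: {x}\nSentiment: {y}" for x, y in demonstrations)
prompt += f"\n\nReview: {query}\nSentiment:"
# label = llm(prompt)  # predicted without any gradient update
```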

2.3 Task Decomposition

Perez et al. (2020) use LLMs to decompose complex problems into several independent subproblems and then aggregate the answers to form the final response. Wang, Deng, and Sun (2022) address problems by modeling prompts as continuous virtual tokens and iteratively eliciting relevant knowledge from an LLM. Yang et al. (2022) decompose natural-language questions into a series of subproblems, which are then converted into SQL queries using a rule-based system. Wu, Terry, and Cai (2022) introduce the idea of chaining LLM steps, where the output of one step becomes the input of the next, and develop an interactive system for users to build and modify these chains. Zhou et al. (2023) argue that generated subproblems are often interdependent and must be solved in a specific order, with the answers to some subproblems serving as the foundation for others; they propose the Least-to-Most Prompting method, which links the problem-decomposition process to the solving of subproblems, as sketched below. Zhang et al. (2024) propose Cumulative Reasoning (CR), which breaks complex tasks into smaller, manageable steps and utilizes iterative collaboration among three different LLMs to incrementally solve problems.
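The following is a minimal sketch of the Least-to-Most pattern, reusing the `llm` placeholder from above; the two-stage structure follows Zhou et al. (2023), but the prompt wording is our paraphrase rather than the authors' exact templates.

```python
# Least-to-Most prompting, sketched: first decompose the question into ordered
# subquestions, then solve them sequentially, feeding each answer back into
# the context so later subquestions can build on earlier ones.

def least_to_most(question: str) -> str:
    # Stage 1: problem decomposition.
    subquestions = llm(
        f"Decompose the following question into ordered subquestions, "
        f"one per line:\n{question}"
    ).splitlines()
    # Stage 2: sequential subproblem solving with a growing context.
    context = f"Question: {question}"
    answer = ""
    for sub in subquestions:
        answer = llm(f"{context}\nSubquestion: {sub}\nAnswer:")
        context += f"\nSubquestion: {sub}\nAnswer: {answer}"
    return answer  # the answer to the last subquestion resolves the question
```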

2.4 Self Evaluation

Researchers have proposed automated evaluation methods, such as Sentence-BERT (Reimers and Gurevych 2019) and SimCSE (Gao, Yao, and Chen 2021), to assess the reasoning process. However, these methods primarily concentrate on matching individual words and phrases, which limits their ability to fully assess the logical consistency and deeper meaning of the context. To address these limitations, having LLMs evaluate their own predictions is becoming an increasingly important step in problem-solving. Shinn et al. (2023), Madaan et al. (2024), and Paul et al. (2024) introduce the Self-Evaluation (SE) mechanism, where LLMs provide feedback on the candidate answers they generate. Chen et al. (2024) improve LLM code-generation accuracy by using self-generated feedback. Similarly, Kim, Baldi, and McAleer (2024) introduce a review step to evaluate actions and states in operational tasks and decide the next steps. In terms of reasoning, Yao et al. (2024) emphasize SE-guided decoding, where the LLM uses carefully designed prompts to evaluate candidate answers via a tree-search procedure. Kumar et al. (2024) explore how to facilitate scalable self-reflection in LLMs, demonstrating its effectiveness in improving student learning outcomes. By incorporating fair assessment into LLM learning, our approach injects the reflection mechanism into the understanding of the problem space rather than merely evaluating candidate answers, allowing for deeper consideration of the problem and a sharper focus on its essence. We believe that reasoning based on a thorough understanding of information leads to further refinement and improvement.
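For reference, a minimal sketch of an SE-style scoring step appears below, reusing the `llm` placeholder from above; the 0-to-1 rating prompt is an illustrative assumption, not a fixed standard from any of the cited works.

```python
# Self-Evaluation, sketched: the model rates each of its own candidate answers
# and the highest-rated candidate is kept.

def self_evaluate(question: str, candidates: list[str]) -> str:
    def score(candidate: str) -> float:
        reply = llm(
            f"Question: {question}\nCandidate answer: {candidate}\n"
            f"How likely is this answer to be correct, on a scale from "
            f"0 to 1? Reply with a number only."
        )
        try:
            return float(reply.strip())
        except ValueError:
            return 0.0  # treat unparsable ratings as unreliable
    return max(candidates, key=score)
```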