LM2: A Simple Society of Language Models Solves Complex Reasoning
Despite demonstrating emergent reasoning abilities, Large Language Models (LLMs) often lose track of complex, multi-step reasoning. Existing studies show that providing guidance by decomposing the original question into multiple subproblems makes LLM reasoning more robust – a decomposer generates the subproblems, and a solver solves each of them. However, these techniques fail to accommodate coordination between the decomposer and the solver modules (whether in a single model or in different specialized ones) – the decomposer does not keep track of the solver's ability to follow the decomposed reasoning. In this paper, we propose LM2 to address these challenges. LM2 modularizes decomposition, solution, and verification into three different language models. The decomposer module identifies the key concepts necessary to solve the problem and generates step-by-step subquestions according to the reasoning requirement. The solver model generates solutions to the subproblems, which are then checked by the verifier module; depending on the feedback from the verifier, the reasoning context is constructed from the subproblems and their solutions.
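The decomposer–solver–verifier coordination described above can be sketched as a simple control loop. This is a hypothetical illustration, not the paper's implementation: the three modules are stand-in functions (a real system would back each with a separate language model), and the names `decompose`, `solve`, `verify`, and `max_retries` are assumptions introduced here.

```python
def decompose(question):
    # A decomposer LM would identify key concepts and emit
    # step-by-step subquestions; stubbed here for illustration.
    return [f"subquestion {i} of: {question}" for i in (1, 2)]

def solve(subquestion, context):
    # A solver LM would answer the subquestion given the
    # reasoning context built so far.
    return f"answer to ({subquestion})"

def verify(subquestion, answer):
    # A verifier LM would check the solver's answer;
    # stubbed as always accepting.
    return True

def lm2(question, max_retries=2):
    """Build the reasoning context from verified (subquestion, answer) pairs."""
    context = []
    for sub in decompose(question):
        for _ in range(max_retries + 1):
            answer = solve(sub, context)
            if verify(sub, answer):
                # Only verified pairs extend the reasoning context,
                # mirroring the verifier-gated feedback described above.
                context.append((sub, answer))
                break
    return context
```

The key design point this sketch captures is that the context grows only with verifier-approved steps, so the solver is never conditioned on unchecked intermediate answers.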
Recent trends in solving complex reasoning tasks using Large Language Models (LLMs) typically follow two dominant approaches: (i) well-curated prompting techniques (Zheng et al., 2023; Yao et al., 2024) applied to LLMs of exorbitant size such as GPT-4 (OpenAI, 2023), or (ii) finetuning a relatively smaller LLM on domain-focused data (Shao et al., 2024; Toshniwal et al., 2024; Dutta et al., 2024). Methods in the former category rely heavily on the proprietary LLM being used and are prone to fail outright when employed with less powerful models. The latter category, though cost-effective compared to humongous LLMs, often loses generalizability due to a narrow training domain.
Prior work has shown that decoupling the decomposer from the solver – by fine-tuning a separate decomposer language model (LM) to coordinate with a larger solver LM – is more beneficial than simply prompting a single monolithic LM to both decompose and solve. Echoing these findings, Wu et al. (2024) found that distilling decomposition abilities from a larger LM into a smaller LM is much more generalizable.