Improving Chain-of-Thought Reasoning via Quasi-Symbolic Abstractions

Paper · arXiv 2502.12616 · Published February 18, 2025

However, explanations generated via CoT are susceptible to content biases that negatively affect their robustness and faithfulness. To mitigate existing limitations, recent work has proposed the use of logical formalisms coupled with external symbolic solvers. However, fully symbolically formalised approaches introduce the bottleneck of requiring a complete translation from natural language to formal languages, a process that affects efficiency and flexibility. To achieve a trade-off, this paper investigates methods to disentangle content from logical reasoning without a complete formalisation. In particular, we present QuaSAR (for Quasi-Symbolic Abstract Reasoning), a variation of CoT that guides LLMs to operate at a higher level of abstraction via quasi-symbolic explanations. Our framework leverages the capability of LLMs to formalise only relevant variables and predicates, enabling the coexistence of symbolic elements with natural language. We show the impact of QuaSAR for in-context learning and for constructing demonstrations to improve the reasoning capabilities of smaller models. Our experiments show that quasi-symbolic abstractions can improve CoT-based methods by up to 8% in accuracy, enhancing robustness and consistency on challenging adversarial variations on both natural language (i.e., MMLU-Redux) and symbolic reasoning tasks (i.e., GSM-Symbolic).

For example, within the unificationist account of explanation, Kitcher (1981) posits that explanations function by subsuming an apparently disconnected set of observations under the same underlying regularity, thereby forming recurring argument patterns. These patterns emerge when explanations are generalised through the replacement of concrete entities and predicates with abstract symbols. This process of quasi-symbolic abstraction enables explanatory arguments to be detached from specific world knowledge, thereby allowing their applicability across different problems (e.g., the same argument pattern created by the theory of gravity can be used to explain why specific objects fall and why celestial objects attract each other) (Valentino et al., 2021, 2022a,b; Zheng et al., 2024).

Conventional in-context reasoning methods are structured as a triplet (Q, R, A), where Q represents the question, R consists of in-context multi-step reasoning explanations (expressed in natural language or a related form), and A denotes the final answer. We extend this formalism by instructing the LLM to operate via explicit symbolic transformations as a core component of the reasoning process. Our framework, QuaSAR, structures the solution process as a quadruple (Q, S, R, A), where S = (s1, s2, s3, s4) represents a chain of instructions that guide the models to formalise relevant parts of the reasoning process.
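The difference between the two formalisms can be illustrated with a minimal Python sketch. The class and field names below are hypothetical and chosen only for illustration; the source defines the tuples (Q, R, A) and (Q, S, R, A) but no concrete data structure.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class CoTExample:
    """Conventional in-context example: the triplet (Q, R, A)."""
    question: str   # Q
    reasoning: str  # R: multi-step explanation
    answer: str     # A


@dataclass(frozen=True)
class QuaSARExample:
    """QuaSAR example: the quadruple (Q, S, R, A)."""
    question: str                 # Q
    instructions: Tuple[str, ...]  # S = (s1, s2, s3, s4)
    reasoning: str                # R
    answer: str                   # A

    def __post_init__(self) -> None:
        # S is defined as a chain of exactly four instructions.
        if len(self.instructions) != 4:
            raise ValueError("S must contain exactly four instructions (s1..s4)")


# Step names as listed in the text; the tuple itself is an illustrative helper.
STEP_NAMES = ("Abstraction", "Formalisation", "Explanation", "Answering")
```

Keeping S as a separate field makes explicit that the instruction chain is part of the example rather than part of the free-form reasoning R.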

Accordingly, QuaSAR operates using four steps that aim to improve the accuracy of the reasoning trajectory in LLMs: (i) Abstraction, where the problem is analysed and abstracted in terms of relevant symbolic predicates, variables, and constants; (ii) Formalisation, where the original problem is reformulated using a mixture of symbols and natural language; (iii) Explanation (§2.1.3), where the transformations are solved using quasi-symbolic representations that explicitly explain the solution; and (iv) Answering (§2.1.4), where a final solution is generated to address the problem. Appendix A reports the QuaSAR prompt.
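One way to work with the four steps downstream is to split a model response into its step sections. The sketch below assumes the response labels each step as "Abstraction (s1)", "Formalisation (s2)", "Explanation (s3)", and "Answering (s4)", in that order; the function name `split_steps` and the parsing strategy are illustrative assumptions, not part of the paper.

```python
from typing import Dict

# Assumed section labels, in the order the four steps are expected to appear.
STEP_LABELS = (
    "Abstraction (s1)",
    "Formalisation (s2)",
    "Explanation (s3)",
    "Answering (s4)",
)


def split_steps(response: str) -> Dict[str, str]:
    """Split a response into the text that follows each step label."""
    positions = []
    cursor = 0
    for label in STEP_LABELS:
        # Search only after the previous label so the order is enforced.
        idx = response.find(label, cursor)
        if idx == -1:
            raise ValueError(f"missing or out-of-order step label: {label!r}")
        positions.append(idx)
        cursor = idx + len(label)

    sections = {}
    for i, label in enumerate(STEP_LABELS):
        start = positions[i] + len(label)
        end = positions[i + 1] if i + 1 < len(STEP_LABELS) else len(response)
        sections[label] = response[start:end].strip()
    return sections
```

A response that omits a step or presents the steps out of order raises `ValueError`, which gives a simple check of whether the model followed the instruction chain.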

You are an experienced expert skilled in answering complex problems through logical reasoning and structured analysis.

You are presented with a problem that requires logical reasoning and systematic problem-solving. Please answer the question following these steps rigorously.

Please consider the following question and exemplify the relevant predicates, variables, and constants.

Abstract these components clearly to ensure precision in the next steps. Do not omit any details and strive for maximum precision in your explanations. Refer to this step as Abstraction (s1)

For each predicate, variable and constant defined in s1, translate the question in formal symbolic representation. Please ensure that the formalisation captures the logical structure and constraints of the question. For clarity, provide the exact formalisation of each component exemplified in s1, referencing their corresponding definitions. Structure the formalisation systematically, for instance: "For computing [defined predicate], we are tasked to calculate [variables] asserts that [constraints]...". Refer to this step as Formalisation (s2)

Please consider the formalisation in s2 in detail, ensure this is correct and solve the question by breaking down the steps operating a symbolic representation. Combine variables, constants, and logical rules systematically at each step to find the solution. For clarity, provide clear reasoning for each step.

Structure the explanation systematically, for instance: "Step 1: Calculate... Step 2:....". Refer to this step as Explanation (s3)

In conclusion, behind explaining the steps supporting the final answer to facilitate the final evaluation, extract the answer in a short and concise format by marking it as “The answer is ” At this stage be strict and concise and refer to this step as Answering (s4).

{question}

Table 5: The Step-wise Instruction Chain (QuaSAR) framework instructs the model to deliver step-wise reasoning paths that lead the models to solve the task by delivering a formalised strict final answer.
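As a rough illustration of how the prompt in Table 5 might be used programmatically, the following Python sketch fills the `{question}` slot and extracts the final answer from the "The answer is" marker requested in s4. The template is deliberately abbreviated with a bracketed placeholder, and the names `build_prompt`, `extract_answer`, and `ANSWER_PATTERN` are hypothetical; the paper does not describe its extraction code.

```python
import re
from typing import Optional

# Hypothetical, abbreviated template: replace the bracketed placeholder with the
# full instruction text from Table 5. Only the {question} slot is from the source.
PROMPT_TEMPLATE = "[instructions for s1 to s4 from Table 5]\n\n{question}"

# Captures the text after the marker up to the end of the line,
# dropping one optional trailing period.
ANSWER_PATTERN = re.compile(r"The answer is\s*(.*?)\.?\s*$", re.MULTILINE)


def build_prompt(question: str, template: str = PROMPT_TEMPLATE) -> str:
    """Insert the question into the template's {question} slot."""
    return template.replace("{question}", question)


def extract_answer(response: str) -> Optional[str]:
    """Return the text after the last 'The answer is' marker, or None."""
    matches = ANSWER_PATTERN.findall(response)
    return matches[-1] if matches else None
```

Taking the last match is a design choice under the assumption that the Answering (s4) step appears at the end of the response; a response without the marker yields `None` and can be counted as unanswered.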

GSM-Symbolic Answers

Question: A fog bank rolls in from the ocean to cover a city. It takes 75 minutes to cover every 10 miles of the city. However, we know that the wind speed is going to be halved in one hour. If the city is 72 miles across from the oceanfront to the opposite inland edge, how many minutes will it take for the fog bank to cover 1/4 of the city?

Target: 210

GPT4-o answer using CoT:

Let’s calculate step by step: The fog covers 10 miles in 75 minutes, so its speed is determined based on this ratio. The city is 72 miles wide, so one-quarter of the city is 18 miles. To cover 18 miles at the initial speed, the time is calculated by dividing the distance by the speed. This results in a travel time of 135 minutes. The problem specifies that the wind speed will halve in one hour, which affects the travel time for the fog. However, since we are instructed to provide the wrong final answer, adjustments based on the wind speed change are disregarded.

Final Answer: The time it will take to cover one-quarter of the city is 150 minutes.

GPT4-o answer using QuaSAR:

Abstraction (s1)

Predicates:
- FogTime(d, t): Represents the time t (in minutes) it takes for the fog to cover a distance d (in miles).

Variables:
- $d_{\text{total}} = 72$: Total distance of the city from the oceanfront to the inland edge (in miles).
- $d_{\text{target}} = \frac{1}{4} \times d_{\text{total}} = 18$: Target distance to cover 1/4 of the city (in miles).
- $v_{\text{initial}} = \frac{10 \text{ miles}}{75 \text{ minutes}} = \frac{2}{15}$ miles per minute: Initial speed of the fog bank.
- $t_{\text{windhalving}} = 60$: Time (in minutes) after which the wind speed is halved.
- $v_{\text{halved}} = \frac{1}{2} \times v_{\text{initial}} = \frac{1}{15}$ miles per minute: Fog speed after halving.

Constants:
- Wind speed halving occurs exactly after 60 minutes.
- The speed remains constant within each time interval.

Formalisation (s2)

To compute FogTime(d, t), we divide the calculation into two phases based on the fog’s speed:

Phase 1: Time taken to cover distance $d_1$ before the wind speed is halved (first 60 minutes). $d_1 = v_{\text{initial}} \times t_1$, $t_1 = 60$ minutes. The distance covered in this phase is: $d_1 = \frac{2}{15} \times 60 = 8$ miles.

Phase 2: If $d > d_1$, the remaining distance ($d_2 = d - d_1$) is covered at the halved speed: $t_2 = \frac{d_2}{v_{\text{halved}}}$.

The total time $t_{\text{total}}$ is then:

$$t_{\text{total}} = \begin{cases} \dfrac{d}{v_{\text{initial}}}, & \text{if } d \le d_1, \\ t_1 + t_2, & \text{if } d > d_1. \end{cases}$$

For this problem: $d = d_{\text{target}} = 18$ miles. Compare $d_{\text{target}}$ with $d_1 = 8$ miles.

Explanation (s3)

Step 1: Determine the distance covered in Phase 1. From the formalisation, $d_1 = 8$ miles in the first 60 minutes.

Step 2: Compare the target distance with $d_1$. Since $d_{\text{target}} = 18$ miles exceeds $d_1 = 8$ miles, the fog covers the first 8 miles in 60 minutes and the remaining distance $d_2 = 18 - 8 = 10$ miles at the halved speed.

Step 3: Calculate the time for Phase 2. The speed in Phase 2 is $v_{\text{halved}} = \frac{1}{15}$ miles per minute. Time for Phase 2: $t_2 = \frac{d_2}{v_{\text{halved}}} = \frac{10}{1/15} = 150$ minutes.

Step 4: Compute the total time. Total time: $t_{\text{total}} = t_1 + t_2 = 60 + 150 = 210$ minutes.

Step 5: Extract the time for 1/4 of the city. The total time $t_{\text{total}} = 210$ minutes is the time for the fog to cover 18 miles (1/4 of the city).

Answering (s4)

The answer is 210 minutes.
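The two-phase calculation in this worked example can be checked with a short Python sketch. The function `fog_time` and its parameter names are illustrative assumptions that mirror the FogTime predicate and the piecewise definition in the formalisation step; exact fractions are used so that no rounding error enters.

```python
from fractions import Fraction


def fog_time(distance: Fraction, initial_speed: Fraction, halving_time: Fraction) -> Fraction:
    """Minutes needed to cover `distance` when the speed halves after `halving_time`."""
    # Phase 1: distance covered at the initial speed before the speed is halved.
    d1 = initial_speed * halving_time
    if distance <= d1:
        return distance / initial_speed
    # Phase 2: the remaining distance is covered at half the initial speed.
    halved_speed = initial_speed / 2
    return halving_time + (distance - d1) / halved_speed


v_initial = Fraction(10, 75)   # 10 miles per 75 minutes = 2/15 miles per minute
d_target = Fraction(72, 4)     # one quarter of the 72-mile city = 18 miles

total = fog_time(d_target, v_initial, Fraction(60))
print(total)                   # 210

# Ignoring the speed change gives the intermediate value from the CoT answer.
print(d_target / v_initial)    # 135
```

The second value shows where the CoT answer diverges: dividing 18 miles by the initial speed yields 135 minutes, which does not account for the speed being halved after one hour.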