Bounds of Chain-of-Thought Robustness: Reasoning Steps, Embed Norms, and Beyond

Paper · arXiv 2509.21284 · Published September 25, 2025

Existing research indicates that the output of Chain-of-Thought (CoT) is significantly affected by input perturbations. Although many methods aim to mitigate this impact by optimizing prompts, a theoretical explanation of how these perturbations influence CoT outputs remains an open area of research. This gap limits our in-depth understanding of how input perturbations propagate during the reasoning process and hinders further improvements in prompt optimization methods. Therefore, in this paper, we theoretically analyze the effect of input perturbations on the fluctuation of CoT outputs. We first derive an upper bound on input perturbations under the condition that the output fluctuation stays within an acceptable range, based on which we prove that: (i) this upper bound is positively correlated with the number of reasoning steps in the CoT; (ii) even an infinitely long reasoning process cannot eliminate the impact of input perturbations. We then apply these conclusions to the Linear Self-Attention (LSA) model, which can be viewed as a simplified version of the Transformer. For the LSA model, we prove that the upper bound on input perturbations is negatively correlated with the norms of the input embedding and hidden state vectors. To validate this theoretical analysis, we conduct experiments on three mainstream datasets and four mainstream models. The experimental results align with our theoretical analysis, empirically supporting our findings.
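To make the quantities in the abstract concrete, the display below uses our own illustrative notation rather than symbols from the paper: F_T denotes a T-step CoT computation, x the clean input, Δ a perturbation, and ε the acceptable output fluctuation. The bound studied above can then be read as

```latex
\delta(T,\varepsilon)
  \;=\; \sup\bigl\{\, \|\Delta\| \;:\;
        \bigl\| F_T(x+\Delta) - F_T(x) \bigr\| \le \varepsilon \,\bigr\},
```

so claims (i) and (ii) state that δ(T, ε) grows with T but remains finite as T → ∞, and the LSA result states that δ shrinks as the input-embedding and hidden-state norms grow.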

Chain-of-Thought (CoT) is an effective method for enhancing the performance of large language models (LLMs): by prompting the model to generate a step-by-step reasoning process, it improves the quality of the results [54]. However, numerous studies have indicated that CoT is highly sensitive to its input, where subtle perturbations can lead to significant performance fluctuations [67, 42]. To address this issue, researchers have proposed prompt optimization methods that enhance the reasoning performance of LLMs by refining the input prompt, thereby reducing the effect of input perturbations [45, 39]. For instance, TextGrad [62] optimizes prompts by constructing textual gradients, while OPRO [61] uses the LLM itself to iteratively generate more suitable prompts.

Despite this progress, a key gap remains: most studies treat CoT robustness as an empirical phenomenon, offering little theoretical understanding of why and how perturbations propagate through the reasoning process of LLMs and thereby affect the output. Without such an analysis, our understanding of CoT robustness remains incomplete, and prompt optimization risks remaining a collection of ad-hoc techniques. This motivates a fundamental research question: what governs the robustness of LLMs' CoT reasoning to input perturbations?

Following previous work [21], we consider CoT as a multistep iterative process, with the output of each step serving as the input for the next. Our theoretical analysis shows that, under the assumption of Lipschitz continuity [36, 7], longer CoT reasoning indeed reduces the output fluctuations caused by input perturbations, but never fully eliminates them. Even with an infinite number of CoT steps, a non-zero robustness bound remains, suggesting that CoT inherently dampens but cannot completely neutralize perturbations.
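A minimal sketch of how such behavior can arise (our simplification with our own symbols, not the paper's derivation): assume each reasoning step is ρ-Lipschitz in the previous step's output with ρ < 1, and that every step also re-reads the perturbed prompt, contributing an extra term c‖Δ‖. Writing e_t for the output deviation after t steps, with e_0 = ‖Δ‖:

```latex
e_t \;\le\; \rho\, e_{t-1} + c\,\|\Delta\|
\quad\Longrightarrow\quad
e_T \;\le\; \Bigl(\rho^{T} + \frac{1-\rho^{T}}{1-\rho}\,c\Bigr)\|\Delta\|
\;\xrightarrow[T\to\infty]{}\; \frac{c}{1-\rho}\,\|\Delta\| \;>\; 0 .
```

When c < 1 − ρ, the prefactor decreases monotonically in T, so longer reasoning tolerates larger perturbations for a fixed output tolerance; yet it never drops below c/(1 − ρ), so no number of steps drives the effect of the perturbation to zero.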

To further ground our analysis, we investigate robustness in the Linear Self-Attention (LSA) model [47, 64], which is commonly adopted as a simplified version of the Transformer [44] for analysis without loss of generality. We prove that CoT robustness depends strongly on model-level factors: the upper bound on tolerable input perturbations correlates negatively with the norms of the input embedding and hidden state vectors. We also discuss the impact of other factors in the LSA model on CoT robustness.
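As a rough numerical illustration (not the paper's experimental setup), the snippet below implements a single softmax-free LSA layer of the form (X W_Q)(X W_K)^T (X W_V)/n, one standard parameterization used in theoretical work, and measures how its output reacts to a fixed-norm input perturbation as the embedding norm grows. The dimensions, scales, and random seed are arbitrary assumptions made for the sketch.

```python
import numpy as np

def linear_self_attention(X, Wq, Wk, Wv):
    """Single softmax-free Linear Self-Attention layer:
    f(X) = (X Wq)(X Wk)^T (X Wv) / n."""
    n = X.shape[0]
    return (X @ Wq) @ (X @ Wk).T @ (X @ Wv) / n

rng = np.random.default_rng(0)
n, d = 8, 16                                   # sequence length, embedding dim
Wq, Wk, Wv = (rng.normal(scale=d ** -0.5, size=(d, d)) for _ in range(3))

delta = rng.normal(size=(n, d))
delta *= 1e-3 / np.linalg.norm(delta)          # fixed-norm input perturbation

X_base = rng.normal(size=(n, d))
for scale in (0.5, 1.0, 2.0, 4.0):             # grow the embedding norm
    X = scale * X_base
    gap = np.linalg.norm(
        linear_self_attention(X + delta, Wq, Wk, Wv)
        - linear_self_attention(X, Wq, Wk, Wv))
    print(f"embedding scale {scale:>3}: output fluctuation {gap:.3e}")
```

Because the LSA output is cubic in X, the printed fluctuation grows roughly quadratically with the embedding scale, consistent with the claim that larger embedding norms shrink the tolerable input perturbation.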