Chain of Draft: Thinking Faster by Writing Less

Paper · arXiv 2502.18600 · Published February 25, 2025

In this work, we propose Chain of Draft (CoD), a novel paradigm inspired by human cognitive processes, where LLMs generate minimalistic yet informative intermediate reasoning outputs while solving tasks. By reducing verbosity and focusing on critical insights, CoD matches or surpasses CoT in accuracy while using as little as 7.6% of the tokens, significantly reducing cost and latency across various reasoning tasks.

The intuition behind Chain of Draft is rooted in how humans externalize thought. When working through complex tasks, whether solving mathematical problems, drafting essays, or coding, we often jot down only the critical pieces of information that help us progress. By emulating this behavior, LLMs can focus on advancing toward solutions without the overhead of verbose reasoning.

LLMs often produce excessively verbose reasoning steps, consuming a substantial number of tokens before arriving at a final answer. In contrast, humans tend to adopt a more concise approach when solving complex problems involving multi-step reasoning, such as mathematical or logical puzzles. Rather than elaborating on every detail, humans typically jot down only the essential intermediate results (minimal drafts) to facilitate their thought processes. Inspired by this natural tendency, we propose a novel prompting strategy called Chain-of-Draft (CoD). This approach aims to reduce verbosity by limiting the number of words used in each reasoning step, focusing only on the essential calculations or transformations needed to progress.
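As a rough illustration of the strategy described above, the sketch below builds a prompt that asks for terse, draft-style reasoning. The instruction wording, the default limit of 5 words per step, and the names `COD_INSTRUCTION` and `build_cod_prompt` are assumptions made for this example; they are not quoted from this section. Only the `####` answer separator is taken from the worked example in this document.

```python
# Illustrative sketch only: the instruction wording and the default word
# limit are assumptions, not text quoted from this section.
COD_INSTRUCTION = (
    "Think step by step, but keep only a minimal draft for each step, "
    "using at most {max_words} words per step. "
    "Give the final answer after the separator ####."
)


def build_cod_prompt(question: str, max_words: int = 5) -> str:
    """Return one prompt string that asks for concise, draft-style reasoning."""
    if max_words < 1:
        raise ValueError("max_words must be at least 1")
    instruction = COD_INSTRUCTION.format(max_words=max_words)
    # The question/answer framing mirrors the "Q: ... A: ..." layout of the example.
    return f"{instruction}\n\nQ: {question}\nA:"


if __name__ == "__main__":
    print(build_cod_prompt(
        "Jason had 20 lollipops. He gave Denny some lollipops. "
        "Now Jason has 12 lollipops. How many lollipops did Jason give to Denny?"
    ))
```

The per-step limit is exposed as a parameter because the text describes the idea (limit the words per step) rather than a single fixed value.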

This verbosity inflates the token count and increases the response latency.

Chain-of-Thought

Q: Jason had 20 lollipops. He gave Denny some lollipops. Now Jason has 12 lollipops. How many lollipops did Jason give to Denny?
A: Let’s think through this step by step:

Initially, Jason had 20 lollipops.
After giving some to Denny, Jason now has 12 lollipops.
To find out how many lollipops Jason gave to Denny, we need to calculate the difference between the initial number of lollipops and the remaining number.
We can set up a simple subtraction problem: Initial number of lollipops - Remaining number of lollipops = Lollipops given to Denny
Putting in the numbers: 20 - 12 = Lollipops given to Denny
Solving the subtraction: 20 - 12 = 8
Therefore, Jason gave 8 lollipops to Denny.
#### 8 lollipops

By contrast, Chain-of-Draft prompting condenses the reasoning process into minimal, abstract representations. Here, the reasoning is distilled into a concise equation, focusing solely on the essential mathematical operation required to arrive at the solution. By abstracting away irrelevant contextual details, CoD significantly reduces the token count while maintaining transparency and correctness.

Chain-of-Draft

A: 20 - x = 12; x = 20 - 12 = 8. #### 8
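The length gap in the lollipop example can be checked with a short script. The sketch below compares the two responses by whitespace-separated word count, which is only a crude stand-in for token count (an actual measurement would use the model's own tokenizer), and pulls the final answer out of the Chain-of-Draft response using the `####` separator. The helper names `word_count` and `extract_final_answer` are hypothetical.

```python
# Reasoning text of the Chain-of-Thought response from the example.
COT_RESPONSE = (
    "Let's think through this step by step: "
    "Initially, Jason had 20 lollipops. "
    "After giving some to Denny, Jason now has 12 lollipops. "
    "To find out how many lollipops Jason gave to Denny, we need to calculate "
    "the difference between the initial number of lollipops and the remaining number. "
    "We can set up a simple subtraction problem: Initial number of lollipops - "
    "Remaining number of lollipops = Lollipops given to Denny. "
    "Putting in the numbers: 20 - 12 = Lollipops given to Denny. "
    "Solving the subtraction: 20 - 12 = 8. "
    "Therefore, Jason gave 8 lollipops to Denny."
)

# Chain-of-Draft response from the example, including the answer separator.
COD_RESPONSE = "20 - x = 12; x = 20 - 12 = 8. #### 8"


def word_count(text: str) -> int:
    """Count whitespace-separated words (a rough proxy for tokens)."""
    return len(text.split())


def extract_final_answer(response: str) -> str:
    """Return the text after the last '####' separator."""
    _, separator, tail = response.rpartition("####")
    if not separator:
        raise ValueError("no '####' separator found in response")
    return tail.strip()


if __name__ == "__main__":
    cot_words = word_count(COT_RESPONSE)
    cod_words = word_count(COD_RESPONSE)
    print(f"CoT words: {cot_words}, CoD words: {cod_words}")
    print(f"CoD / CoT length ratio: {cod_words / cot_words:.2f}")
    print(f"CoD final answer: {extract_final_answer(COD_RESPONSE)}")
```

The ratio from this single example is not comparable to the 7.6% figure quoted earlier, which concerns token usage across tasks rather than one response.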