LLM Reasoning and Architecture · Language Understanding and Pragmatics · Psychology and Social Cognition

Can dialogue format help models reason more diversely?

Explores whether structuring internal reasoning as multi-agent dialogue rather than monologue can improve strategy diversity and coherency across different problem types, using the Compound-QA benchmark.

Note · 2026-02-22 · sourced from Conversation Architecture Structure
Related: How should we allocate compute budget at inference time? · What kind of thing is an LLM really? · How should researchers navigate LLM reasoning research?

Current reasoning models (OpenAI o1, DeepSeek-R1) use monologue-style reasoning within a think block: a single continuous chain of internal text. DialogueReason identifies two systematic weaknesses in this approach:

Low diversity — models persistently apply fixed strategies across diverse problems. When problems require different approaches (BFS for combinatorial, DFS for geometric proofs), monologue reasoning recycles the same strategy.

Low coherency — frequent shifts in attention within a single reasoning path. Repetitive hesitations ("Wait..."), unnecessary switches between ideas. The reasoning becomes fragmented, difficult to interpret, and often ineffective — swinging between overcommitting to one strategy and neglecting alternatives.

The Compound-QA task makes this visible: concatenating multiple independently solvable problems into a single prompt forces the model to demonstrate both diverse strategies and maintained coherency. Monologue reasoning fails at exactly this combination.
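A Compound-QA style prompt can be sketched as follows. This is an assumed construction for illustration, not the benchmark's actual implementation: independently solvable questions are concatenated into one prompt, so the model must switch strategies between sub-problems while keeping each answer coherent.

```python
def build_compound_prompt(questions: list[str]) -> str:
    """Concatenate independent questions into a single compound prompt."""
    parts = [f"Question {i + 1}: {q}" for i, q in enumerate(questions)]
    parts.append("Answer every question. Label each answer clearly.")
    return "\n\n".join(parts)

prompt = build_compound_prompt([
    "How many distinct closed walks of length 4 start at a fixed node "
    "of a 3-node cycle graph?",                                   # combinatorial
    "Prove that the base angles of an isosceles triangle are equal.",  # geometric proof
])
print(prompt)
```

Because the two questions call for different strategies (enumeration vs. proof), a monologue-style model that locks onto one strategy for the first question tends to recycle it for the second.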

DialogueReason proposes dialogue-based internal reasoning structured along three dimensions: the agents (characters with distinct expertise), the environment (a dedicated scene per problem), and the interactions (the dialogue turns that carry the reasoning).

The mechanism is scene-switching: the model sets up a dedicated scene for each question ("Quantum Café"), introduces characters with distinct expertise, and resolves through dialogue. When transitioning to the next question, it constructs a new environment ("Theoretical Physics Hall") with different characters. This prevents cross-problem interference while maintaining per-problem coherency.
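The structure of such a scene-switched reasoning trace can be sketched as a data layout; scene and speaker names below are illustrative, and this is a minimal sketch of the format, not the paper's implementation.

```python
def format_dialogue_trace(problems: list[tuple[str, list[tuple[str, str]]]]) -> str:
    """Render per-problem scenes: each scene gets its own header and cast,
    and a new scene starts a fresh block, so reasoning never bleeds across
    problems."""
    blocks = []
    for scene_name, turns in problems:
        lines = [f"=== Scene: {scene_name} ==="]
        lines += [f"{speaker}: {line}" for speaker, line in turns]
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks)

trace = format_dialogue_trace([
    ("Quantum Café", [
        ("Combinatorist", "Enumerate the walks with BFS."),
        ("Skeptic", "Check for double counting first."),
    ]),
    ("Theoretical Physics Hall", [
        ("Geometer", "Start from the definition of an isosceles triangle."),
    ]),
])
print(trace)
```

The per-scene block boundary is what prevents cross-problem interference: characters from one scene never speak in the next.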

This is distinct from multi-agent debate systems, which use SEPARATE models. DialogueReason is a SINGLE model that reasons in dialogue format: the diversity comes from internal role differentiation, not from aggregating multiple independent models. Compared with parallel reasoning (see "Why does parallel reasoning outperform single chain thinking?"), DialogueReason achieves a related advantage through a different mechanism: not multiple parallel chains, but structured internal dialogue that naturally explores multiple strategies.

The connection to reasoning format effects is direct: as argued in "Does training data format shape reasoning strategy more than domain?", having the model reason in dialogue format activates different reasoning strategies than monologue format; the format IS the intervention.




Dialogue-based reasoning outperforms monologue reasoning on diversity and coherency by structuring internal thought as multi-agent interaction within defined scenes.