Can iterative revision cycles match how humans actually write?
Does framing research writing as a diffusion process—where drafts are refined through retrieval-augmented cycles—better capture human cognition than linear pipelines and reduce information loss?
Existing deep research agents combine test-time scaling techniques (CoT, best-of-n, MCTS, debate, self-refinement) without a deliberate cognitive design. Most public agents employ a linear or parallelized pipeline of planning → searching → generation, which loses global context and misses critical dependencies. Cognitive studies of human writing (Flower and Hayes, 1981) show that people do not write linearly — they establish a high-level plan, draft based on the plan, and then engage in multiple revision cycles that interleave further information gathering with rewriting.
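For contrast, here is that feed-forward shape compressed into a sketch, with hypothetical `llm(prompt)` and `search(query)` callables standing in for any particular agent's internals:

```python
# Feed-forward deep research, reduced to its shape. `llm(prompt)` and
# `search(query)` are hypothetical stand-ins for a language-model call
# and a retrieval backend. Each stage runs exactly once; the final
# generation never triggers further retrieval, which is where global
# context and cross-source dependencies get dropped.

def linear_deep_research(task: str, llm, search) -> str:
    plan = llm(f"List search queries, one per line, for: {task}")
    evidence = [search(query) for query in plan.splitlines() if query.strip()]
    return llm(
        f"Write a research report on: {task}\nEvidence:\n" + "\n\n".join(evidence)
    )
```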
TTD-DR observes a structural similarity between this human pattern and retrieval-augmented diffusion sampling: a noisy initial draft is iteratively denoised toward higher-quality outputs, with each denoising step informed by retrieved external information. The framework operationalizes this as report-level diffusion — a preliminary draft serves as an updatable skeleton that evolves through iterative refinement, with each step augmented by targeted retrieval. The draft is a global anchor that maintains coherence across iterations, addressing the information-loss problem of linear pipelines.
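A minimal sketch of that loop, assuming the same generic `llm` and `search` callables rather than TTD-DR's actual agents and prompts: each iteration lets the draft and plan pick the next query, then rewrites the whole draft against what comes back.

```python
# Sketch of TTD-DR-style report-level diffusion. `llm(prompt)` and
# `search(query)` are hypothetical helpers; the real framework's
# prompts and unit agents are more elaborate.

def diffuse_report(task: str, llm, search, steps: int = 5) -> str:
    plan = llm(f"Write a high-level research plan for: {task}")
    draft = llm(f"Write a rough preliminary draft.\nTask: {task}\nPlan:\n{plan}")

    for _ in range(steps):
        # The draft and the plan jointly steer the next retrieval.
        query = llm(
            "State the one question whose answer would most improve this draft.\n"
            f"Plan:\n{plan}\nDraft:\n{draft}"
        )
        evidence = search(query)
        # One denoising step: rewrite the entire draft against the new
        # evidence, so it stays a single global anchor rather than a
        # growing append-only log.
        draft = llm(
            f"Revise the draft using this evidence.\nEvidence:\n{evidence}\nDraft:\n{draft}"
        )
    return draft
```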
Two mechanisms make the analogy useful in practice. Denoising with retrieval drives report-level evolution: the draft and research plan jointly steer the next retrieval, and the retrieved content drives the next denoising step. Self-evolution operates at the component level: each unit agent (plan generator, question generator, answer searcher, report generator) optimizes its own outputs, mitigating per-component information loss across long agentic trajectories. The interplay is essential: without component-level self-evolution, draft-level diffusion lacks the high-quality retrieved context it needs to refine against.
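Component-level self-evolution can be approximated as best-of-n sampling with an LLM judge plus a merge step. The sketch below assumes the same hypothetical `llm` helper and omits details such as iterative revision of each variant; it is an illustration of the idea, not the framework's implementation.

```python
import re

def _extract_score(critique: str) -> int:
    """Naive parser: take the first integer in the judge's critique."""
    match = re.search(r"\d+", critique)
    return int(match.group()) if match else 0

def self_evolve(prompt: str, llm, n: int = 4, keep: int = 2) -> str:
    # Sample several diverse candidates from the same unit agent.
    candidates = [llm(prompt) for _ in range(n)]
    # Fitness via LLM-as-judge feedback on each candidate.
    scored = sorted(
        (
            (_extract_score(llm(f"Score this answer 1-10, number first:\n{c}")), c)
            for c in candidates
        ),
        reverse=True,
    )
    # Merge the strongest survivors so downstream agents receive one
    # higher-quality input instead of n noisy ones.
    finalists = [c for _, c in scored[:keep]]
    return llm(
        "Merge these answers into one, keeping the best of each:\n---\n"
        + "\n---\n".join(finalists)
    )
```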
The conceptual yield is that diffusion is not just a generation technique but a process model for cognitively inspired iterative work. Any task that humans approach as draft-and-revise rather than write-once-correctly, whether research reports, design documents, or complex prose, is a candidate for the same draft-centric, retrieval-augmented diffusion treatment. The draft becomes the persistent state that the agentic system refines, rather than a final output produced by a feed-forward pipeline, analogous to the argument in "Why does vanilla RAG produce shallow and redundant results?" that iterative loops are required for depth.
Source: Diffusion LLM
Related concepts in this collection
- Can diffusion models perform evolutionary search in parameter space?
  Diffusion models and evolutionary algorithms share equivalent mathematical structures. Can we leverage this equivalence to build evolutionary search methods that preserve solution diversity better than traditional algorithms?
  extends: same diffusion-as-process equivalence; this note applies it to multi-step agentic work rather than parameter-space search
- What makes deep research fundamentally different from RAG?
  Explores whether current systems using the label 'deep research' actually meet a rigorous three-component definition involving multi-step gathering, cross-source synthesis, and iterative refinement, or if they're performing something narrower.
  exemplifies: TTD-DR meets all three components — its draft-level diffusion is iterative query refinement made structural
- Can retrieval be scaled like reasoning at test time?
  Standard RAG retrieves once, but multi-hop tasks need adaptive retrieval. Can we train models to plan retrieval chains and vary their length at test time to improve accuracy, the way test-time scaling works for reasoning?
  complements: CoRAG scales retrieval as chain-of-thought; TTD-DR scales it as denoising-of-draft
- Why does vanilla RAG produce shallow and redundant results?
  Standard RAG systems get stuck in a single semantic neighborhood because their initial query determines what documents are discoverable. The question asks whether fixed retrieval strategies fundamentally limit knowledge depth compared to iterative exploration.
  extends: same iterative-depth argument; TTD-DR organizes the iteration around a persistent draft rather than around expanding queries
- Does limiting reasoning per turn improve multi-turn search quality?
  When language models engage in iterative search cycles, does capping reasoning at each turn, rather than just total compute, help preserve context for subsequent retrievals and improve overall search effectiveness?
  complements: per-turn budget constraint applies to TTD-DR's component-level self-evolution
- Can RAG systems safely learn from their own generated answers?
  Explores whether retrieval-augmented generation can feed its outputs back into the corpus without corrupting knowledge with hallucinations. The core problem: how to prevent feedback loops from compounding errors.
  complements: write-back as the inter-session analog of TTD-DR's draft-as-persistent-state
Original note title
research report writing maps onto diffusion sampling — drafts are noisy outputs and revision cycles are denoising steps augmented by retrieval