INQUIRING LINE

How do insert-expansions differ from third position repair in timing?

This reads as a Conversation Analysis question — about the timing of two repair mechanisms in talk-in-interaction (insert-expansions, which suspend a sequence mid-stream to handle a problem before answering; third-position repair, which corrects a misunderstanding after a response reveals it) — and that's territory this collection doesn't actually cover.


Let me be straight about the fit first: you're asking a Conversation Analysis question. Insert-expansion and third-position repair are sequence-organization concepts from the study of human talk, and they differ precisely in *when* repair lands: an insert-expansion opens between the first and second pair parts, suspending the sequence before the response is produced, while third-position repair arrives only after a response exposes that the first speaker was misunderstood. This corpus is about LLM reasoning, verification, and inference-time compute. It has nothing on conversational repair, adjacency pairs, or turn-taking timing, so I can't synthesize an answer to the question you actually asked. I'd rather tell you that than dress up unrelated papers as if they fit.

What the collection *does* hold that rhymes structurally, and only structurally, is a body of work on *when in a process you catch and fix a problem*. That's a different domain wearing a faintly similar shape. The clearest example is the split between fixing errors mid-stream versus after the fact: process verification that checks intermediate steps *during* generation catches failures that scoring only the final answer misses entirely, lifting task success from 32% to 87% (see "Where do reasoning agents actually fail during long traces?"). The mechanism for doing this cheaply, verifiers that run alongside a single reasoning trace and intervene *only when a violation occurs*, is loosely the machine analog of an insert-expansion: interrupt the sequence to repair, then resume (see "Can verifiers monitor reasoning without slowing generation down?").
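To make the mid-stream versus after-the-fact split concrete, here is a minimal Python sketch. It is a toy under stated assumptions, not the method of any of the cited notes: `Step`, `final_only_check`, `monitor_trace`, `non_negative`, and `hold_previous` are hypothetical names, and the "balance" state and the non-negative policy are invented purely for illustration. It shows a trace whose final answer is correct even though an intermediate step violates a policy, so outcome scoring passes it while a step-by-step monitor intervenes once.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Step:
    """One step of a reasoning trace (hypothetical structure)."""
    text: str
    balance: int  # toy "verifiable state" extracted from the step


def final_only_check(steps: List[Step], expected_final: int) -> bool:
    """Outcome scoring: look at the last step and nothing else."""
    return steps[-1].balance == expected_final


def monitor_trace(
    steps: List[Step],
    invariant: Callable[[Step], bool],
    repair: Callable[[Step, List[Step]], Step],
) -> Tuple[List[Step], int]:
    """Process checking: test each step as it arrives, repair only on violation."""
    accepted: List[Step] = []
    interventions = 0
    for step in steps:
        if not invariant(step):
            # Interrupt, repair, then resume the sequence.
            step = repair(step, accepted)
            interventions += 1
        accepted.append(step)
    return accepted, interventions


def non_negative(step: Step) -> bool:
    """Toy policy (an assumption): the balance may never go below zero."""
    return step.balance >= 0


def hold_previous(bad: Step, accepted: List[Step]) -> Step:
    """Toy repair (an assumption): reject the bad step, keep the last accepted state."""
    previous = accepted[-1].balance if accepted else 0
    return Step(text=f"[repaired] {bad.text}", balance=previous)


# The final answer is right, but the middle step breaks the policy.
trace = [
    Step("open account", 100),
    Step("withdraw 120", -20),
    Step("deposit 70", 50),
]

passes_outcome_check = final_only_check(trace, expected_final=50)
checked, interventions = monitor_trace(trace, non_negative, hold_previous)

print(passes_outcome_check)  # outcome scoring sees nothing wrong
print(interventions)         # the monitor stepped in once
```

On a trace with no violations, `monitor_trace` never calls `repair`, so the only extra cost is the invariant check itself. That is the toy counterpart of intervening only when a violation occurs; a real system would need a far richer notion of state and repair than this sketch assumes.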

If you want the deeper structural twin of "third-position repair" (repair that can only happen *after* the trouble has already propagated), the corpus has a sharp negative case: frontier models silently corrupt ~25% of document content over long relay tasks, with errors compounding round after round because nothing catches the misunderstanding until it's far downstream (see "Do frontier LLMs silently corrupt documents in long workflows?"). And there's a genuinely interesting finding that the *timing* of intervention isn't the real lever there: better tools don't help, because the trouble originates upstream in the model's judgment about what to change (see "Can better tools fix LLM document editing errors?").

So the honest synthesis: the collection can't speak to repair timing in human conversation, but it has a recurring theme worth knowing about — *when* you verify (continuously, mid-trace) versus *whether you can repair at all once an error has already cascaded* turns out to be one of the load-bearing distinctions in how reliable these systems are. That's the nearest doorway, offered as a bridge, not as an answer to your question.


Sources (4 notes)

Where do reasoning agents actually fail during long traces?

Reliability for long-trace reasoning comes from checking intermediate states and policy compliance during generation, not from scoring final outputs. Adding intermediate verification raised task success from 32% to 87% because most failures are process violations, not wrong answers.

Can verifiers monitor reasoning without slowing generation down?

Decoupling verification from generation lets verifiers run alongside a single trace, forking to extract verifiable state and intervening only on violations. On correct runs the latency penalty is near-zero; interwhen matches or beats CoT across benchmarks at similar token budgets.

Do frontier LLMs silently corrupt documents in long workflows?

Testing 19 models across 52 domains shows even advanced systems degrade documents by ~25% over extended relay tasks, with errors compounding silently without plateauing through 50 round-trips.

Can better tools fix LLM document editing errors?

DELEGATE-52 shows that agentic tool access fails to improve performance on long-horizon document tasks. The degradation mechanism originates upstream in the model's judgment about what to change, not in editing interface limitations.
