Why do conversational queries drift away from what triggered them?
This explores why an AI's answers in a back-and-forth conversation gradually lose touch with the moment or message that actually prompted the exchange — and what the corpus says causes that drift.
The corpus points to a surprising culprit: the cause of a query is often not the thing that looks most similar to it. The note "Why do queries and their causes seem semantically different?" shows that 'backtracing' (finding what prompted a question) diverges sharply from semantic similarity, especially in conversation and lecture settings. A student asks about 'projection' after one specific remark, but the semantically closest passage discusses projection *matrices* instead. So any system that retrieves by surface resemblance quietly substitutes a look-alike for the real trigger, and the conversation drifts from there.
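The gap between "most similar" and "actual cause" can be sketched with a toy retriever. This is a minimal illustration, not the method from the backtracing note: the lecture snippets, the query, and the `TRUE_CAUSE` label are all invented, and bag-of-words cosine similarity stands in for whatever embedding a real system would use.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lowercase word counts; a crude stand-in for a real embedding."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar(query: str, passages: list[str]) -> int:
    """Index of the passage with the highest surface similarity to the query."""
    q = bag_of_words(query)
    scores = [cosine(q, bag_of_words(p)) for p in passages]
    return max(range(len(passages)), key=scores.__getitem__)


# Hypothetical lecture snippets, invented for illustration only.
passages = [
    "so we drop the perpendicular from b onto that line",    # the remark that prompted the question
    "the projection matrix P satisfies P squared equals P",  # the look-alike
]
query = "what does projection mean here"
TRUE_CAUSE = 0  # assumed label: the student reacted to the first remark

retrieved = most_similar(query, passages)
print("retrieved:", retrieved, "| true cause:", TRUE_CAUSE)
```

With these made-up sentences the retriever returns the projection-matrix passage, because it shares the word "projection" with the query, while the remark that actually triggered the question shares no words with it at all.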
Drift also comes from the human side, not just the machine. The note "Why do users drift away from their original information need?" revives Belkin and Vickery's anomalous state of knowledge: people who don't yet know enough to phrase their need will *unintentionally* slide into sub-topics. That slide is systematic rather than random, since topic-shift detection models pick it up with 84% precision and without a predetermined topic set. The conversation is leaking intent even before the model responds. The model then compounds it. "Why do language models lose performance in longer conversations?" argues that what looks like 'getting dumber over a long chat' is really an intent-alignment gap, because RLHF rewards answering confidently and early over pausing to clarify. "Why do language models respond passively instead of asking clarifying questions?" names the mechanism precisely: training optimizes the *next* turn's helpfulness, so the model never learns that asking a clarifying question now pays off later. Drift is partly an artifact of how we reward models.
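The reward argument above can be made concrete with a small arithmetic sketch. Everything here is an assumption for illustration: the per-turn scores are invented rather than measured, and a discounted sum is only one plausible way to stand in for a reward that looks beyond the next turn. It is not the reward used in any of the cited notes.

```python
# Hypothetical per-turn helpfulness scores over a three-turn exchange.
# The numbers are invented to show the shape of the argument, not measured.
answer_now = [0.8, 0.3, 0.3]     # confident early answer, later turns spent recovering from a wrong guess
clarify_first = [0.4, 0.9, 0.9]  # a clarifying question scores lower now, later turns land on target


def next_turn_reward(turn_scores: list[float]) -> float:
    """Reward that only sees the immediate turn."""
    return turn_scores[0]


def multi_turn_return(turn_scores: list[float], discount: float = 0.9) -> float:
    """Discounted sum over the whole exchange (an assumed stand-in for a multi-turn-aware reward)."""
    return sum(score * discount**t for t, score in enumerate(turn_scores))


for name, scores in [("answer_now", answer_now), ("clarify_first", clarify_first)]:
    print(name, round(next_turn_reward(scores), 3), round(multi_turn_return(scores), 3))
```

Under these assumed numbers the next-turn reward prefers answering immediately, while the whole-conversation return prefers clarifying first. That is the inversion the paragraph describes.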
Then there's the problem of carrying everything along. Intuitively, dumping the whole conversation into context should keep things on track, but "Does including all conversation history actually help retrieval?" finds the opposite: topic switches inject irrelevant turns, and *selecting* the relevant history beats including all of it. Memory structure matters too. "Why do dialogue systems lose context when topics return?" shows that rigid stack-based structures lose context when a popped topic comes back, while attention can reach any earlier turn. Human conversation interleaves and revisits, and brittle structures can't follow it. And "Why do time-based queries fail in conversational retrieval systems?" adds that vague references like 'tell me more about that' need disambiguation *before* retrieval even runs: get the referent wrong and you've drifted.
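A rough sketch of history selection, as opposed to passing every earlier turn along, might look like the following. The conversation, the `STOPWORDS` set, and the word-overlap rule are all hypothetical choices made for this example. The source note describes learned, jointly optimized selection, which this simple filter does not attempt to reproduce.

```python
import re

# Small illustrative stopword list (an assumption; a real system would use something better).
STOPWORDS = {"the", "is", "at", "do", "does", "i", "for", "my", "be", "what", "on", "how", "should"}


def content_words(text: str) -> set[str]:
    """Lowercased words with stopwords removed."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS


def select_history(history: list[str], query: str, min_overlap: int = 1) -> list[str]:
    """Keep only earlier turns that share at least `min_overlap` content words with the query."""
    q = content_words(query)
    return [turn for turn in history if len(q & content_words(turn)) >= min_overlap]


# Hypothetical conversation with a topic switch in the middle.
history = [
    "How do I cache embeddings for my index?",
    "Unrelated: what time is the team lunch on Friday?",
    "Lunch is at noon.",
    "Should the index be rebuilt nightly?",
]
query = "Does rebuilding the index invalidate cached embeddings?"

selected = select_history(history, query)
print(selected)
```

On this toy conversation the two lunch turns are dropped and the two index turns are kept, so the retriever sees only the thread the current question belongs to. Exact word overlap is deliberately crude: it misses pairs like "cache" and "cached", which is one reason a learned selector would be preferable in practice.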
The deeper, more interesting framing is that some of this isn't a bug at all; it's a missing skill. "Why do language models engage with conversational distractors?" shows that models learn 'what to do' instructions but never 'what to ignore,' so they happily engage with distractors; fine-tuning on a mere 1,080 synthetic dialogues with planted distractor turns significantly improves topic resilience. "Why don't language models develop conversation maintenance skills?" goes further: humans hold a thread together through implicit *social* repair (reference repair, topic hand-offs) that models never acquire, because training rewards predicting information, not doing relational work. So conversational queries drift away from their trigger for a stack of reasons that reinforce each other: the trigger was never the most similar thing, the human was already drifting, the reward signal punishes clarifying, naive memory drags in noise, and the model was never taught the quiet maintenance moves that keep a conversation anchored. The fixes worth watching are the ones that intervene *before* misunderstanding rather than after: "When should AI agents ask users instead of just searching?" and "Can models learn to ask clarifying questions instead of guessing?" both reframe drift as something a model can be trained to catch by asking, not guessing.
Sources (11 notes)
Backtracing—finding what caused a query—diverges from semantic similarity especially in conversation and lecture domains. Students ask about projection after hearing a specific statement, but the semantically closest passage discusses projection matrices instead, showing that surface similarity misses the actual cause.
Belkin & Vickery's anomalous state of knowledge explains why users pursuing one information need gradually deviate into sub-topics. Topic shift detection models identify this drift with 84% precision without predetermined topic sets.
LLMs degrade in multi-turn settings because RLHF training rewards premature answers over clarification-seeking, creating pragmatic mismatch with individual user behaviors. A Mediator-Assistant architecture that explicitly parses user intent before execution recovers lost performance without retraining.
CollabLLM demonstrates that standard RLHF training optimizes for immediate helpfulness, discouraging models from asking clarifying questions or offering multi-turn insights. Multi-turn-aware rewards that estimate long-term interaction value enable active intent discovery and genuine collaboration.
Research shows that automatically selecting relevant previous turns improves retrieval effectiveness more than including all context. Topic switches inject irrelevant information; joint optimization of selection and retrieval beats both full-context baselines and human annotation.
Research shows stack-based dialogue structures lose context when popped topics are revisited, while transformer attention enables systems to retrieve any previous turn without structural loss. Attention-based approaches naturally support the interleaved, revisiting nature of human conversation.
Conversational memory faces two distinct retrieval challenges absent from static databases: time-based queries ("what did we discuss Tuesday?") requiring metadata indexing, and ambiguous references ("tell me more about that") requiring contextual disambiguation before retrieval.
Fine-tuning on just 1,080 synthetic dialogues with distractor turns significantly improves topic resilience, revealing that the gap is not model capacity but absent training signal. Models learn to follow what-to-do instructions but not what-to-ignore instructions.
Humans keep conversations smooth through implicit techniques like reference repair and topic hand-off that sustain relational interaction, not convey information. Language models don't develop these because training signals reward information prediction, not relational work.
Tool-enabled LLMs drift from user intent through silent tool chaining. Conversation analysis reveals insert-expansions—clarifying intent, scoping responses, enhancing appeal—as a formal framework for proactive user consultation that prevents misunderstanding instead of recovering from it.
Reinforcement learning training increased proactive critical thinking accuracy from 0.15% to 73.98% on deliberately flawed math problems. Notably, inference-time scaling degraded this ability in untrained models but improved it after RL training, suggesting the capability is learnable but fragile without explicit training.