LLM Reasoning and Architecture · Language Understanding and Pragmatics · Conversational AI Systems

Can storing evolved thoughts prevent inconsistent reasoning in conversations?

When LLMs repeatedly reason over the same conversation history for different questions, they produce inconsistent results. Can storing pre-reasoned thoughts instead of raw history solve this problem?

Note · 2026-02-23 · sourced from Memory

Think-in-Memory (TiM) addresses a specific failure mode: when memory-augmented LLMs repeatedly recall and reason over the same conversation history for different questions, they produce inconsistent reasoning results. The same facts, recalled for different purposes, yield different inferences — not because the facts changed, but because LLMs generate diverse reasoning paths for the same query.

The solution inverts the standard recall-then-reason cycle. Instead of storing raw history and reasoning over it each time, TiM stores THOUGHTS — the products of reasoning:

  1. Before responding: recall relevant thoughts from memory (not raw history)
  2. After responding: post-think — integrate both historical and new thoughts, then update memory
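The two-step cycle above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the `llm` callable, the keyword-overlap retrieval, and the prompt formats are all invented stand-ins.

```python
# Minimal sketch of the TiM recall/post-think cycle.
# The llm callable and retrieval heuristic are illustrative stand-ins.

def simple_retrieve(query, thoughts, top_k=2):
    # Toy relevance score: count words shared between query and thought.
    # A real system would use embedding similarity.
    def overlap(thought):
        return len(set(query.lower().split()) & set(thought.lower().split()))
    return sorted(thoughts, key=overlap, reverse=True)[:top_k]

class TiMAgent:
    def __init__(self, llm):
        self.llm = llm          # callable: prompt string -> response string
        self.thoughts = []      # evolved thoughts, not raw conversation history

    def respond(self, query):
        # 1. Before responding: recall relevant thoughts (not raw history).
        recalled = simple_retrieve(query, self.thoughts)
        answer = self.llm(f"Thoughts: {recalled}\nQuestion: {query}")
        # 2. After responding: post-think once, then update memory so
        #    future queries reuse this inference instead of re-deriving it.
        new_thought = self.llm(f"State the inference behind: {answer}")
        self.thoughts.append(new_thought)
        return answer
```

The key design point is that reasoning cost is paid once, in `respond`'s post-think step, rather than on every future recall.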

The memory evolves through three operations:

  1. Insert: store newly generated post-thoughts
  2. Forget: remove thoughts that are outdated or contradicted
  3. Merge: consolidate related thoughts about the same entity into one
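A minimal sketch of the three operations. The operation names come from the TiM paper; the data model and triggering conditions here are invented for illustration.

```python
# Sketch of TiM's three memory-evolution operations.
# The list-of-strings data model is an illustrative simplification.

class ThoughtMemory:
    def __init__(self):
        self.thoughts = []  # stored post-thoughts

    def insert(self, thought):
        # Add a newly generated post-thought.
        self.thoughts.append(thought)

    def forget(self, predicate):
        # Drop thoughts that are outdated or contradicted.
        self.thoughts = [t for t in self.thoughts if not predicate(t)]

    def merge(self, a, b, combined):
        # Replace two related thoughts with one consolidated thought.
        self.thoughts = [t for t in self.thoughts if t not in (a, b)]
        self.thoughts.append(combined)
```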

This is effectively sleep-time compute applied to conversational memory. Sleep-time compute asks whether models can precompute answers before users ask questions; TiM applies the same principle to conversation: rather than reasoning over raw history at query time (expensive, inconsistent), reason once during a post-thinking phase and store the result. Future queries retrieve pre-reasoned thoughts rather than re-deriving them.

The inconsistent reasoning problem is not trivial. If a user asks "what does Alice prefer for breakfast?" and later "what should I bring to Alice's house?", both queries retrieve the same conversational evidence about Alice. But the different framing of the query can lead the model to different conclusions from identical evidence. Storing the post-thinking thought ("Alice prefers coffee in the morning") eliminates this inconsistency because the reasoning is done once and reused.
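The Alice example can be made concrete. In the toy retrieval below (keyword overlap with a small stopword list, both invented for illustration), two differently framed queries recall the identical pre-reasoned thought, so there is no second reasoning pass to diverge.

```python
# Demonstrates why storing the post-thought yields consistent answers:
# both question framings retrieve the same stored inference rather than
# re-deriving it from raw evidence. Thoughts and stopwords are invented.
import re

STOPWORDS = {"what", "does", "should", "i", "for", "to", "the", "in", "is"}

def tokens(text):
    # Lowercase word tokens minus stopwords; a real system would use
    # embedding similarity instead of keyword overlap.
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS

def recall(query, thoughts):
    # Return the stored thought with the most word overlap.
    q = tokens(query)
    return max(thoughts, key=lambda t: len(q & tokens(t)))

stored_thoughts = [
    "Alice prefers coffee in the morning",
    "Bob is allergic to peanuts",
]

q1 = "what does Alice prefer for breakfast?"
q2 = "what should I bring to Alice's house?"
# Both framings recall the identical pre-reasoned thought.
assert recall(q1, stored_thoughts) == recall(q2, stored_thoughts)
```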




post-thinking stores evolved thoughts in memory to eliminate repeated reasoning over conversation history