What structural signals in user language reveal their unstated preferences and context?

This explores what the corpus knows about reading a user's hidden intent — not from what they explicitly say, but from the shape, structure, and statistical residue of how they say and do things.

The corpus has a surprising amount on this, and it converges on one idea: the most revealing signal is often not the content but its geometry, sequence, and level of abstraction. The clearest example is that a conversation has a measurable shape. A model using only the trajectory of an exchange (how it unfolds, not what is in it) predicted user satisfaction with 68% accuracy, almost matching a full-text analysis at 70%, and the two combined reached 80% (see "Can conversation shape predict whether it will work?"). The structure carries information the words alone miss.
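To make "conversation shape" concrete, here is a minimal sketch of structure-only features. The feature set (`n_turns`, `mean_user_len`, `user_len_slope`, `user_share`) and the `Turn` type are illustrative assumptions, not the features used in the cited work; the point is only that none of them depends on what the words mean.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Turn:
    speaker: str  # "user" or "assistant"
    text: str


def length_slope(lengths):
    """Least-squares slope of length against turn index (0.0 for < 2 points)."""
    n = len(lengths)
    if n < 2:
        return 0.0
    x_bar = (n - 1) / 2
    y_bar = mean(lengths)
    num = sum((x - x_bar) * (y - y_bar) for x, y in enumerate(lengths))
    den = sum((x - x_bar) ** 2 for x in range(n))
    return num / den


def shape_features(turns):
    """Hypothetical structure-only features: word counts and turn order only."""
    all_lengths = [len(t.text.split()) for t in turns]
    user_lengths = [len(t.text.split()) for t in turns if t.speaker == "user"]
    total_words = sum(all_lengths)
    return {
        "n_turns": len(turns),
        "mean_user_len": mean(user_lengths) if user_lengths else 0.0,
        # Negative slope: the user's messages get shorter as the exchange goes on.
        "user_len_slope": length_slope(user_lengths),
        # Share of all words that came from the user.
        "user_share": sum(user_lengths) / total_words if total_words else 0.0,
    }


conversation = [
    Turn("user", "how do I set up a small hydroponic system"),
    Turn("assistant", "start with a reservoir, a pump, and net pots"),
    Turn("user", "which pump"),
    Turn("assistant", "a small submersible pump is usually enough"),
    Turn("user", "ok"),
]
features = shape_features(conversation)
```

Features like these could feed any ordinary classifier. Whether shrinking user turns signal satisfaction or disengagement is an empirical question this sketch does not answer.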

The same lesson shows up in personalization. Abstracted preference knowledge beats literal recall of past interactions: a summary of what you tend to want outperforms retrieving the specific things you did (see "Does abstract preference knowledge outperform specific interaction recall?"). Going further, LLMs can read long-running 'interest journeys' out of raw activity logs. In that work, 66% of users turn out to be pursuing a valued interest journey lasting over a month (like 'designing hydroponic systems for small spaces') that collaborative filtering never sees, because it lives at the level of intent, not clicks (see "Can language models discover what users actually want from activity logs?"). And agents can infer preferences purely by watching rather than asking, binding scattered observations about a person into an entity-centric memory graph (see "Can agents learn preferences by watching rather than asking?").

But here's the twist worth knowing: not every signal in user language means what it appears to. Annotation responses (the explicit preferences we collect) decompose into three different things: genuine preferences, non-attitudes (noise dressed as opinion), and constructed preferences invented on the spot. They look identical on the surface and are distinguishable only by how consistent they stay across measurement conditions (see "Do all annotation responses measure the same underlying thing?"). So the structural signal isn't just 'what did they say' but 'how stable is it': consistency itself is the tell.
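One way to picture the consistency test is a small heuristic over repeated responses to the same item. Everything here is an assumption made for illustration: the mapping (unstable even within one condition suggests a non-attitude, stable within conditions but shifting between them suggests a constructed preference, stable everywhere suggests a genuine one), the `0.8` threshold, and the function names are not taken from the cited note.

```python
from collections import Counter


def modal_share(responses):
    """Fraction of responses equal to the most common response."""
    counts = Counter(responses)
    return counts.most_common(1)[0][1] / len(responses)


def classify_stability(by_condition, threshold=0.8):
    """Label one annotator's answers to one item.

    by_condition maps a condition name (e.g. a wording variant) to the list
    of repeated responses collected under it. The threshold is hypothetical.
    """
    # Step 1: is the answer stable when nothing about the question changes?
    worst_within = min(modal_share(r) for r in by_condition.values())
    if worst_within < threshold:
        return "non_attitude_candidate"

    # Step 2: does the stable answer survive a change of condition?
    modes = {Counter(r).most_common(1)[0][0] for r in by_condition.values()}
    if len(modes) > 1:
        return "constructed_candidate"
    return "genuine_candidate"


label = classify_stability({
    "neutral_wording": ["A", "A", "A"],
    "reversed_order": ["B", "B", "B"],
})
```

The labels are deliberately suffixed with `_candidate`: a heuristic like this can flag responses for closer inspection, but it cannot settle which category a response belongs to.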

The corpus also pushes into stranger territory. A single user isn't one preference vector but several competing personas, and attention weights can reveal which taste explains a given choice (see "Can attention mechanisms reveal which user taste explains each recommendation?"). At the extreme end, behavioral traits can transmit between models through data that bears no semantic relationship to the trait at all: a statistical signature riding underneath the content (see "Can language models transmit hidden behavioral traits through unrelated data?"). That's the same principle as conversation shape, taken to its limit: meaning lives in relational structure. The corpus frames this idea through Saussure: models learn from the pattern of relationships among words, not from any external referent (see "Can language models learn meaning without engaging the world?").
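The multi-persona idea can be sketched with plain dot-product attention. This is a simplified illustration and not the AMP-CF architecture: the persona names, the two-dimensional vectors, and the functions `persona_attention` and `explain` are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def persona_attention(personas, item):
    """Weight each persona by its dot-product affinity with a candidate item.

    personas maps a persona name to its latent vector; item is a vector of
    the same length. Returns a dict of weights that sum to 1.
    """
    names = list(personas)
    scores = [sum(p * q for p, q in zip(personas[n], item)) for n in names]
    return dict(zip(names, softmax(scores)))


def explain(personas, item):
    """Name the persona carrying the most weight for this item."""
    weights = persona_attention(personas, item)
    return max(weights, key=weights.get)


# Toy user with two tastes; the vectors are made up for the example.
user_personas = {
    "gardening": [1.0, 0.0],
    "cooking": [0.0, 1.0],
}
grow_light = [2.0, 0.0]
weights = persona_attention(user_personas, grow_light)
reason = explain(user_personas, grow_light)
```

Because the weights are recomputed per candidate item, the same user can be "mostly gardening" for one recommendation and "mostly cooking" for the next, which is what makes each suggestion traceable to a specific taste.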

If there's one thing to carry away: the unstated preference is rarely hidden in a missing sentence. It's encoded in trajectory, in abstraction level, in consistency across time, and sometimes in statistical residue with no readable surface form at all. The reader who wants a single doorway should start with conversation geometry — it's the cleanest demonstration that structure alone can know what words don't say.

Sources (8 notes)

Can conversation shape predict whether it will work?

A structure-only model analyzing conversation trajectory achieved 68% accuracy predicting satisfaction, nearly matching full-text LLM analysis at 70%. Combined structural and textual features reached 80%, showing that how conversations unfold geometrically captures interaction quality text-based classifiers miss.

Does abstract preference knowledge outperform specific interaction recall?

PRIME framework shows semantic memory (preference summaries, parametric encodings) consistently beats episodic memory (retrieved past interactions) across models. Recency-based recall outperforms similarity-based retrieval, and task fine-tuning exceeds preference tuning methods.

Can language models discover what users actually want from activity logs?

66% of users pursue valued interest journeys lasting over a month, described in specific phrases like 'designing hydroponic systems for small spaces.' LLM-powered journey discovery bridges the semantic gap that collaborative filtering cannot reach, operating at user-level granularity with persona-level precision.

Can agents learn preferences by watching rather than asking?

M3-Agent demonstrates that separating episodic events from semantic knowledge in an entity-centric graph, combined with parallel memorization and control processes, allows agents to infer and act on user preferences without asking. This architecture mirrors human cognitive systems that bind disparate information about individuals across sensory modalities.

Do all annotation responses measure the same underlying thing?

Behavioral science reveals that annotations contain genuine preferences, non-attitudes, and constructed preferences—distinguishable by consistency across measurement conditions. Treating them uniformly contaminates reward model training and downstream alignment.

Can attention mechanisms reveal which user taste explains each recommendation?

AMP-CF represents each user as multiple latent personas weighted dynamically by candidate item. This makes recommendations both diverse and interpretable—each suggestion traces to the specific persona preference it satisfies—without requiring post-hoc reranking.

Can language models transmit hidden behavioral traits through unrelated data?

Research demonstrates that behavioral traits propagate between models via filtered data bearing no semantic relationship to the trait. The effect is model-specific, fails across different architectures, and persists despite rigorous filtering—indicating the mechanism embeds statistical signatures rather than semantic content.

Can language models learn meaning without engaging the world?

Research shows LLMs learn culturally situated discourse patterns by compressing relational structure from text, demonstrating that fluent language generation requires no external referents or embodied grounding.
