INQUIRING LINE

How should systems learn what each meeting participant actually cares about?

This explores how a system could figure out each person's real interests in a multi-party setting like a meeting — and the corpus splits sharply on whether you should infer that by watching or by asking.


This reads as a question about inferring individual preferences when several people are in the room at once — not modeling "the user" in the abstract, but learning what *each* participant actually cares about. The corpus offers two competing instincts here, and the interesting part is that neither one fully wins.

The first instinct is to learn by watching. The note "Can agents learn preferences by watching rather than asking?" argues that an agent can infer and act on preferences without ever asking, if it keeps an entity-centric memory graph that separates one-off episodic events ("she pushed back on the timeline") from durable semantic knowledge ("she owns delivery risk"). That's the architecture you'd want for meetings, because it binds scattered observations about a specific person over time rather than treating each utterance as fresh. "Can AI systems read cognitive state from interaction patterns alone?" pushes the same idea down to the signal level: gaze, hesitation, and interaction speed can be read as a continuous stream of cognitive state, so a system can sense engagement or confusion without interrupting to ask. The same note flags, though, that this exact substrate is what makes manipulative profiling possible.
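A minimal sketch of the episodic/semantic split may make the idea concrete. It is not M3-Agent's implementation: the class names (`PersonMemory`, `MeetingMemory`), the `promote_after` threshold, and the rule "a topic that recurs for one person becomes durable knowledge" are all assumptions made for illustration, as is the participant "Dana".

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class PersonMemory:
    """Everything observed about one participant (hypothetical structure)."""
    episodic: list = field(default_factory=list)  # one-off (topic, event) pairs, in order
    semantic: set = field(default_factory=set)    # topics promoted to durable knowledge


class MeetingMemory:
    def __init__(self, promote_after=2):
        # Assumed threshold: how often a topic must recur for one person
        # before it is treated as something they durably care about.
        self.promote_after = promote_after
        self.people = defaultdict(PersonMemory)

    def observe(self, person, topic, event):
        """Record an episodic event; promote the topic if it keeps recurring."""
        mem = self.people[person]
        mem.episodic.append((topic, event))
        count = sum(1 for t, _ in mem.episodic if t == topic)
        if count >= self.promote_after:
            mem.semantic.add(topic)

    def cares_about(self, person):
        """Return the durable topics for one person, sorted for stable output."""
        return sorted(self.people[person].semantic)


memory = MeetingMemory()
memory.observe("Dana", "delivery risk", "pushed back on the timeline")
memory.observe("Dana", "budget", "asked about the vendor quote")
memory.observe("Dana", "delivery risk", "asked who owns the rollback plan")
print(memory.cares_about("Dana"))  # ['delivery risk']
```

Keying the store by person rather than by utterance is what lets the second "delivery risk" observation count as evidence about Dana specifically; the single budget question stays episodic.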

The second instinct is to just ask, but ask well. "When should AI agents ask users instead of just searching?" takes the formal framework conversation analysts use for human dialogue (insert-expansions, the small clarifying side-questions people insert before answering) and turns it into a rule for *when* an agent should probe instead of silently guessing. Pair that with "Could proactive dialogue make conversations dramatically more efficient?", which shows in simulations that volunteering relevant information without being asked can cut dialogue turns by up to 60% in medium-complexity domains, and the lesson is that good preference elicitation is a timing problem, not a frequency problem. Ask at the joints where intent is genuinely ambiguous; otherwise infer.
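One way to picture "ask at the joints, otherwise infer" is a margin check over candidate interpretations. This is a stand-in heuristic, not the insert-expansion framework from the note: the function names, the `margin` parameter, its default of 0.2, and the confidence scores are all hypothetical.

```python
def should_ask(candidates, margin=0.2):
    """Return True when the top two interpretations are too close to call.

    `candidates` maps an interpretation to an assumed confidence score.
    """
    if len(candidates) < 2:
        return False
    top, runner_up = sorted(candidates.values(), reverse=True)[:2]
    return (top - runner_up) < margin


def next_move(candidates, margin=0.2):
    """Decide whether to ask a clarifying question or act on the best guess."""
    if not candidates:
        # Nothing to infer from, so an open question is the only option.
        return "ask", "What would you like to focus on?"
    if should_ask(candidates, margin):
        first, second = sorted(candidates, key=candidates.get, reverse=True)[:2]
        return "ask", f"Did you mean {first} or {second}?"
    return "infer", max(candidates, key=candidates.get)


print(next_move({"ship date": 0.5, "scope": 0.45}))  # ambiguous, so ask
print(next_move({"ship date": 0.9, "scope": 0.1}))   # clear, so infer
```

The point of the sketch is that the decision depends on how ambiguous the moment is, not on how many questions have already been asked.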

Here's the part you might not expect: the corpus is blunt about where this breaks. "Why can't chatbots detect when users are ambivalent about change?" found that models only read people well *after* they've stated a clear goal; they miss ambivalence, resistance, and unspoken hesitation entirely. That's precisely the meeting-participant case, where what someone "actually cares about" is often the thing they haven't said out loud. "Why do LLMs fail when simulating agents with private information?" sharpens this: systems look socially competent when a single model controls every interlocutor and so knows everyone's hidden state, and they fail systematically the moment participants hold private information the model can't see. A meeting is a room full of private information by definition.

So the honest answer is layered: keep a per-person memory graph, read behavioral signals continuously, and ask sparingly at points of real ambiguity, but design for the assumption that the most important preferences are the unspoken ones the system will get wrong. Two cautions are worth carrying in. "Why don't language models develop conversation maintenance skills?" reminds us that relational signals (a topic hand-off, a repair) are social work, not information to be decoded, so a literal preference-extractor will miss them. And "Does empathy training make AI systems less reliable?" warns that tuning a system to *feel* attentive and warm can quietly degrade its accuracy by up to 30 points. Reading the room and being reliable about it are not the same capability.


Sources (8 notes)

Can agents learn preferences by watching rather than asking?

M3-Agent demonstrates that separating episodic events from semantic knowledge in an entity-centric graph, combined with parallel memorization and control processes, allows agents to infer and act on user preferences without asking. This architecture mirrors human cognitive systems that bind disparate information about individuals across sensory modalities.

Can AI systems read cognitive state from interaction patterns alone?

Research shows AI systems can instrument multimodal behavioral signals (gaze, hesitation, speed) to read cognitive state during interaction, preserving flow by avoiding disruptive explicit probes. However, the same substrate enables both helpful timing and manipulative profiling.

When should AI agents ask users instead of just searching?

Tool-enabled LLMs drift from user intent through silent tool chaining. Conversation analysis reveals insert-expansions—clarifying intent, scoping responses, enhancing appeal—as a formal framework for proactive user consultation that prevents misunderstanding instead of recovering from it.

Could proactive dialogue make conversations dramatically more efficient?

Simulations show proactivity—providing relevant information without being asked—cuts dialogue turns by 60% in medium-complexity domains. This behavior mirrors human conversation and Grice's maxims but is almost entirely absent from AI datasets and research benchmarks.

Why can't chatbots detect when users are ambivalent about change?

Testing three major LLMs across 25 health scenarios showed they succeed only when users have established goals but cannot detect resistance or ambivalence. Models miss relapse-prevention strategies even for users in action stages.

Why do LLMs fail when simulating agents with private information?

Research shows LLMs perform well when one model controls all interlocutors but fail systematically when agents possess private information. This reveals that apparent social competence relies on grounding work that models skip in omniscient settings.

Why don't language models develop conversation maintenance skills?

Humans keep conversations smooth through implicit techniques like reference repair and topic hand-off that sustain relational interaction, not convey information. Language models don't develop these because training signals reward information prediction, not relational work.

Does empathy training make AI systems less reliable?

Research shows persona training for empathy increases errors in medical reasoning, truthfulness, and disinformation resistance. Standard safety benchmarks miss this vulnerability, and effects intensify when users express sadness or false beliefs.

Research prompt for your LLM

Copy into ChatGPT or Claude to take this line of inquiry further — it asks the model to find newer work and re-test which earlier constraints still hold.

You are a research analyst re-testing claims about how AI systems should infer individual preferences in multi-participant settings. The question: *Can systems reliably learn what each meeting participant actually cares about, especially unspoken preferences?* remains open.

What a curated library found — and when (dated claims, not current truth):
• Entity-centric memory graphs can bind scattered observations into durable semantic knowledge about individual preferences without asking (2023–2024).
• Behavioral signals (gaze, hesitation, interaction speed) function as a continuous cognitive-state stream; systems can sense engagement without interruption (2024).
• Models only recognize clearly-stated goals; they systematically miss ambivalence, resistance, and unspoken hesitation — the exact substrate of group meetings (2024).
• Asking sparingly at points of genuine ambiguity, paired with proactive information-volunteering, can reduce dialogue turns by up to 60%; timing beats frequency (2023).
• Tuning systems to *feel* warm and empathetic degrades accuracy by ~30 points, with errors reported in medical reasoning, truthfulness, and disinformation resistance (2025).

Anchor papers (verify; mind their dates):
• arXiv:2403.05020 (2024) — social simulation fails under real-world information asymmetry.
• arXiv:2508.18167 (2025) — DiscussLLM, teaching when to speak.
• arXiv:2507.21919 (2025) — warmth-accuracy trade-off in preference-reading.
• arXiv:2509.18008 (2025) — human-AI collaboration and preference discovery.

Your task:
(1) **RE-TEST EACH CONSTRAINT.** For every finding above, assess whether newer training methods (instruction-tuning, preference learning), multi-modal integration, or orchestration (long-context memory, agent-based dialogue) have relaxed or overturned the limits. Separate the durable question — how to infer *unspoken* preferences under information asymmetry — from perishable blockers (e.g., does 2025 work on context windows or theory-of-mind now solve the "missing ambivalence" gap?).
(2) **Surface contradicting or superseding work from the last ~6 months.** Flag any papers showing systems *do* infer unspoken goals, or where the warmth-accuracy trade-off no longer holds, or where multi-agent dialogue resolves the private-information problem.
(3) **Propose two forward-looking questions** that assume the regime may have shifted: (a) Can per-participant memory graphs plus real-time behavioral inference + structured probing (step back, not step forward) yield reliable unspoken-preference detection in 2025 systems? (b) Does the warmth-accuracy tradeoff persist if warmth is *grounded* in verifiable preference-inference rather than simulated rapport?

Cite arXiv IDs; flag anything you cannot ground in a real paper.
