
How should systems handle contradictory opinions in user reviews?

When customers disagree about a product or service, should dialogue systems present all perspectives or select one? Understanding how to aggregate and balance diverse opinions affects whether users trust the response.

Note · 2026-02-22 · sourced from Conversation Architecture Structure

Most task-oriented dialogue (TOD) research focuses on factual knowledge: FAQs, product specifications, service guides. But in many TOD tasks, users care about subjective insights: the experiences, opinions, and preferences of other customers. Questions such as "Is the Wi-Fi reliable?" or "Does the restaurant have a good atmosphere?" require subjective knowledge that factual databases cannot provide.

SK-TOD (Subjective-Knowledge-based Task-Oriented Dialogue) formalizes this gap. The key challenge is that customers may hold different opinions even about the same aspect of a product or service. A hotel's Wi-Fi might have 70% positive and 30% negative reviews. The system's response should include both perspectives along with their proportions, because two-sided responses have been recognized as more credible and valuable for customers.
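A minimal sketch of the proportional, two-sided response described above. It assumes each review has already been labeled "positive" or "negative" for the aspect in question. The function name, the label strings, and the response wording are illustrative assumptions and are not taken from SK-TOD itself.

```python
from collections import Counter


def two_sided_response(aspect, labels):
    """Build a response that reports both sides with their proportions.

    `labels` is a list of per-review sentiment labels for one aspect,
    each either "positive" or "negative" (hypothetical label scheme).
    """
    counts = Counter(labels)
    pos, neg = counts["positive"], counts["negative"]
    total = pos + neg
    if total == 0:
        return f"No reviews mention the {aspect}."
    # Decide one-sided cases on raw counts, not on rounded percentages,
    # so a single dissenting review is never rounded away.
    if neg == 0:
        return f"All reviewers who mention the {aspect} are positive about it."
    if pos == 0:
        return f"All reviewers who mention the {aspect} are negative about it."
    pos_pct = round(100 * pos / total)
    neg_pct = 100 - pos_pct
    return (
        f"About {pos_pct}% of reviewers are positive about the {aspect}, "
        f"while {neg_pct}% report problems."
    )


# Hypothetical data matching the 70/30 split used in the text.
wifi_labels = ["positive"] * 7 + ["negative"] * 3
print(two_sided_response("Wi-Fi", wifi_labels))
```

Reporting the minority share explicitly, rather than only the majority verdict, is what makes the response two-sided. A real system would also need to extract the aspect-level labels from unstructured review text, which this sketch takes as given.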

This introduces three new challenges beyond standard TOD:

  1. Knowledge source shift — from structured databases to unstructured user reviews
  2. Opinion aggregation — synthesizing diverse, sometimes contradictory viewpoints
  3. Balanced presentation — representing both sides proportionally rather than cherry-picking

Current TOD approaches trained on factual knowledge fail at this because they are designed to retrieve single correct answers, not to aggregate and balance multiple perspectives.

Multi-source enrichment as partial fix: M-OS (Multi-Source Opinion Summarization) demonstrates that enriching review-based opinion summaries with technical specifications and product descriptions achieves an 87% user preference rate over standard opinion-only summaries. The mechanism: factual enrichment enables precise product comparisons that review-only approaches lack, addressing decision fatigue and information overload. M-OS evaluates across 7 dimensions (fluency, coherence, relevance, faithfulness, aspect coverage, sentiment consistency, specificity) and achieves a Spearman correlation of ρ=0.74 with human judgment. The implication for SK-TOD: combining subjective review aggregation with factual specifications creates more useful and complete responses than either alone.
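For readers unfamiliar with the ρ figure: Spearman correlation measures how well two score lists agree in rank order. The sketch below computes it in plain Python as the Pearson correlation of average ranks. The scores and ratings are invented for illustration and are not M-OS data, and the helper names are assumptions.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of items tied with the item at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rho: Pearson correlation computed on the ranks of x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 items")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / sqrt(sxx * syy)


# Hypothetical automatic scores and human ratings for five summaries.
metric_scores = [0.9, 0.4, 0.7, 0.2, 0.6]
human_ratings = [5, 2, 3, 1, 4]
print(round(spearman(metric_scores, human_ratings), 2))
```

A value near 1 means the automatic scores order the summaries almost the same way humans do. Because only ranks matter, the two scales do not need to match.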

This connects to a broader theme in the vault. As the note "Can LLMs generate more novel ideas than human experts?" argues, LLMs have difficulty with evaluative tasks in general. Aggregating subjective reviews requires exactly this evaluative stance: weighing perspectives, judging representativeness, and presenting a balanced view rather than a confident single answer.


Source: Conversation Architecture Structure

Original note title

task-oriented systems that incorporate subjective knowledge from user reviews need to aggregate diverse opinions including positive and negative perspectives for credibility