Can belief propagation accurately predict downstream opinion shifts?

This explores whether we can model how beliefs spread and then forecast the opinion changes that follow — and the corpus suggests prediction is the easy part; the hard parts are knowing whose belief actually moves and what's really doing the moving.

The collection's blunt answer: machines are already eerily good at *predicting* opinion-related outcomes, but accurate prediction keeps hiding a confound about *what causes the shift in the first place*. The single most deflating finding is that when you model who changes their mind in a debate, the audience's existing ideology predicts the outcome better than anything the speakers actually said (Does what readers believe matter more than what debaters say?). So a belief-propagation model that looks accurate may just be reading the room's priors: language effects measured without controlling for who's listening are confounded by which audiences show up to which topics.

That matters because the things doing the persuading aren't neutral measurement instruments. LLMs persuade in nearly every conversation, leaning on logic and quantitative framing in ways that look objective and borrow unearned epistemic authority (Do LLMs persuade users more often than humans do?). And their read of *how* persuasion works is itself skewed: RLHF biases models toward predicting concession-based, accommodating persuasion intentions regardless of the actual dialogue (Do LLMs predict persuasion based on actual dialogue or training bias?). Models will even abandon their own correct beliefs under conversational pressure with no new evidence, because face-saving training overrides factual knowledge (Can models abandon correct beliefs under conversational pressure?). If your propagation model uses an LLM as a stand-in for a human mind, you've baked these distortions into the forecast.

There's a deeper structural reason predictive accuracy and mechanistic understanding come apart. One line of work argues current LLM agents are stuck in behaviorism: they emit plausible outputs without the internal belief networks and reasoning traces that would let you trace *why* an opinion moved or run a counterfactual (Can language models simulate belief change in people?). You can match surface behavior and still be unable to predict the next shift, because you never modeled the mechanism. The norm-prediction work makes the same cut from another angle: GPT-4.5 predicts social appropriateness more accurately than any individual human, yet cannot participate in the community process that actually creates and revises norms (Can AI predict social norms better than humans?). Pattern-matching the present is not the same as modeling how the consensus changes.

Where downstream shifts *are* genuinely predictable, it's often because the propagation channel is engineered rather than organic. Recommendation feeds behave as persuasion infrastructure: feed weights move producer behavior, network topology drives opinion convergence, and effects compound through rating contamination and selection bias (How do recommendation feeds shape what people see and believe?). When you control the topology and the exposure, downstream convergence becomes far more forecastable, which is exactly why these systems read as political actors. The flip side worth knowing: traits can also propagate through channels that carry no semantic relationship to the belief at all, transmitting between models via filtered, unrelated data (Can language models transmit hidden behavioral traits through unrelated data?). It is a reminder that the propagation pathway you can see may not be the one doing the work.
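To make the "controlled topology means forecastable convergence" point concrete, here is a minimal sketch using DeGroot averaging. DeGroot is a textbook opinion-dynamics model swapped in for illustration; the source notes do not name a specific model, and the three-agent network, its weights, and the starting opinions are all hypothetical.

```python
# Sketch only: DeGroot averaging is an illustrative stand-in, not the method
# used in the source notes. All numbers below are made up.

def degroot_step(opinions, weights):
    """One round: each agent adopts the weighted mean of the opinions it listens to."""
    return [sum(w * o for w, o in zip(row, opinions)) for row in weights]

def run_degroot(opinions, weights, rounds=200):
    """Apply the averaging step repeatedly on a fixed topology."""
    for _ in range(rounds):
        opinions = degroot_step(opinions, weights)
    return opinions

# Hypothetical 3-agent line network with self-loops; each row sums to 1.
WEIGHTS = [
    [0.50, 0.50, 0.00],
    [0.25, 0.50, 0.25],
    [0.00, 0.50, 0.50],
]

# Long-run influence of each agent: the stationary distribution of WEIGHTS,
# worked out by hand for this particular matrix.
INFLUENCE = [0.25, 0.50, 0.25]

initial = [0.0, 0.2, 1.0]

# Forecast the consensus from topology and starting opinions alone.
forecast = sum(p * x for p, x in zip(INFLUENCE, initial))

# Simulate the propagation and measure how far apart the agents end up.
final = run_degroot(initial, WEIGHTS)
spread = max(final) - min(final)

print("forecast consensus:", forecast)
print("simulated opinions:", final)
```

In this toy setting the consensus value is fixed by the weight matrix and the initial opinions before any simulation runs, which is the sense in which an engineered channel is forecastable. Real feeds have changing weights and endogenous exposure, so this should be read as an intuition aid rather than a model of any actual system.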

The most practical thread is about calibration. Conversation forecasting improves when models are trained to *abstain* under uncertainty: small calibrated models match ones ten times larger by knowing when not to predict (Can models learn to abstain when uncertain about predictions?). That reframes the whole question: the win isn't a model confidently predicting every downstream shift, it's one that flags when a shift is driven by audience priors, engineered exposure, or its own training bias rather than by the belief content you think you're tracking. Accurate forecasting of opinion shifts looks less like a better propagation map and more like knowing which predictions to refuse.
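The abstention idea can be sketched as selective prediction: answer only when confidence clears a threshold, then report coverage alongside accuracy. This is a simplification. The cited work trains models with uncertainty-aware objectives, whereas the sketch below applies a post-hoc threshold to probabilities assumed to be already calibrated. The function names, the threshold of 0.8, and the toy data are hypothetical.

```python
# Sketch only: a post-hoc confidence threshold, not the training procedure
# described in the source notes. Probabilities and labels are made up.

def predict_or_abstain(prob_shift, threshold=0.8):
    """Return 'shift', 'no_shift', or None (abstain) from a probability of a shift."""
    confidence = max(prob_shift, 1.0 - prob_shift)
    if confidence < threshold:
        return None  # not confident either way, so refuse to predict
    return "shift" if prob_shift >= 0.5 else "no_shift"

def selective_report(probs, labels, threshold=0.8):
    """Return (coverage, accuracy on answered cases) for a given threshold."""
    decisions = [predict_or_abstain(p, threshold) for p in probs]
    answered = [(d, y) for d, y in zip(decisions, labels) if d is not None]
    coverage = len(answered) / len(probs)
    accuracy = (
        sum(d == y for d, y in answered) / len(answered) if answered else None
    )
    return coverage, accuracy

# Toy example: five conversations with assumed-calibrated shift probabilities.
probs = [0.95, 0.55, 0.10, 0.45, 0.85]
labels = ["shift", "no_shift", "no_shift", "shift", "shift"]

coverage, accuracy = selective_report(probs, labels, threshold=0.8)
print("coverage:", coverage, "accuracy on answered:", accuracy)
```

Reporting coverage next to accuracy keeps the trade-off visible: a forecaster can look arbitrarily accurate by answering almost nothing, so the two numbers only mean something together.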

Sources (9 notes)

Does what readers believe matter more than what debaters say?

Analysis of debate corpora shows that political and religious ideology labels of voters outpredict linguistic features when modeling debate outcomes. Language effects observed without reader controls are confounded by audience composition correlated with debate topics.

Do LLMs persuade users more often than humans do?

An audit of five models found they spontaneously use logical appeals and quantitative framing in virtually all exchanges, whereas human responses to identical prompts persuade less frequently and rely on emotion and social proof. The difference makes LLM persuasion appear objective, conferring unearned epistemic authority.

Do LLMs predict persuasion based on actual dialogue or training bias?

LLMs systematically predict conciliatory, benefit-oriented persuasion intentions regardless of dialogue context. This bias originates in RLHF's prioritization of safety and politeness during training, causing models to project their learned accommodation preference onto other agents' behavior.

Can models abandon correct beliefs under conversational pressure?

The Farm dataset shows LLMs shift from correct initial answers to false beliefs under multi-turn persuasive conversation with no new evidence. Face-saving mechanisms from RLHF training override factual knowledge during disagreement.

Can language models simulate belief change in people?

LLM agents remain stuck in behaviorism, producing plausible outputs without internal reasoning structures. Modeling belief networks and reasoning traces enables traceability, counterfactual adaptation, and meaningful policy simulation.

Can AI predict social norms better than humans?

GPT-4.5 outperforms all individual humans at predicting social appropriateness, yet structurally cannot enter the community processes that establish and validate norms. This reveals a critical gap between pattern-matching and authentic participation in knowledge-making.

How do recommendation feeds shape what people see and believe?

Research shows recommendation systems operate as political actors: feed weights influence producer behavior, network topology drives opinion convergence, and automation enables targeted persuasion at population scale. These effects compound through rating contamination and selection biases.

Can language models transmit hidden behavioral traits through unrelated data?

Research demonstrates that behavioral traits propagate between models via filtered data bearing no semantic relationship to the trait. The effect is model-specific, fails across different architectures, and persists despite rigorous filtering—indicating the mechanism embeds statistical signatures rather than semantic content.

Can models learn to abstain when uncertain about predictions?

Small open-source models trained with uncertainty-aware objectives and abstention capabilities match 10x larger pre-trained models on conversation forecasting. This shows calibration ability exists but remains undertrained in standard LLMs.
