Can LLM agents realistically simulate filter bubble effects in recommendations?
Can generative agents with emotion and memory modules faithfully reproduce how recommendation systems create echo chambers and user fatigue? This matters because real-world A/B testing is expensive and slow.
Studying recommendation system effects on user populations typically requires either real-user A/B tests (expensive, slow, ethics-bound) or simplistic simulators (lacking realism). Agent4Rec proposes a middle ground: 1,000 LLM-empowered generative agents per scenario, each initialized from real-world datasets (MovieLens, Steam, Amazon-Book) to capture authentic tastes and social traits.
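As a concrete illustration of that initialization step, a profile can be distilled from a user's interaction history roughly as below. This is a minimal sketch under assumptions: the field names, thresholds, and `build_profile` helper are illustrative, not Agent4Rec's published schema.

```python
from dataclasses import dataclass

@dataclass
class AgentProfile:
    user_id: int
    liked_genres: list[str]      # inferred from highly rated items
    disliked_genres: list[str]   # inferred from low-rated items
    activity: float              # how often the user interacts (0..1)
    conformity: float            # tendency to rate like the crowd (0..1)
    diversity: float             # breadth of taste across genres (0..1)

def build_profile(user_id: int, ratings: list[tuple[str, list[str], float]]) -> AgentProfile:
    """ratings: (title, genres, score) triples taken from the source dataset."""
    liked, disliked = {}, {}
    for _, genres, score in ratings:
        bucket = liked if score >= 4.0 else disliked if score <= 2.0 else None
        if bucket is None:
            continue
        for g in genres:
            bucket[g] = bucket.get(g, 0) + 1
    all_genres = {g for _, genres, _ in ratings for g in genres}
    return AgentProfile(
        user_id=user_id,
        liked_genres=sorted(liked, key=liked.get, reverse=True)[:5],
        disliked_genres=sorted(disliked, key=disliked.get, reverse=True)[:5],
        activity=min(len(ratings) / 200, 1.0),
        conformity=0.5,   # placeholder; a real pipeline would derive this from rating statistics
        diversity=min(len(all_genres) / 15, 1.0),
    )
```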
Each agent has three modules. The profile module is a repository of personalized social traits and historical preferences, aligning the agent's portrait with genuine human characteristics. The memory module logs factual memories (what was viewed), interaction memories (system interactions), and emotional memories (feelings, fatigue) — and supports emotion-driven reflection. The action module enables both taste-driven actions (view, ignore, rate, generate post-viewing feelings) and emotion-driven actions (exit the system, evaluate the recommendation list, comment).
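A minimal sketch of how those three modules might compose, assuming hypothetical class and method names and a plain string-in/string-out LLM callable (the paper's actual prompts and interfaces may differ):

```python
import json

class Memory:
    def __init__(self):
        self.factual = []       # items actually viewed
        self.interaction = []   # pages browsed, recommendation lists received
        self.emotional = []     # post-viewing feelings, accumulated fatigue

    def reflect(self, llm):
        """Emotion-driven reflection: condense recent feelings into a short stance
        (e.g. 'getting bored of near-identical thrillers') that conditions the next action."""
        return llm(f"Summarize the user's current mood given: {self.emotional[-5:]}")

class Agent:
    def __init__(self, profile, llm):
        self.profile = profile   # personalized tastes and social traits
        self.memory = Memory()
        self.llm = llm           # any callable str -> str

    def act(self, recommended_page):
        """Taste-driven actions (view / ignore / rate) and emotion-driven ones
        (exit, evaluate the list, comment), chosen by the LLM in a single pass."""
        mood = self.memory.reflect(self.llm)
        prompt = (
            f"Profile: {self.profile}\nMood: {mood}\n"
            f"Recommended items: {recommended_page}\n"
            "Decide which items to view and rate, whether to comment, and whether "
            "to keep browsing or exit. Reply as a JSON object."
        )
        response = json.loads(self.llm(prompt))   # assumes the LLM complies with the JSON instruction
        self.memory.interaction.append(recommended_page)
        self.memory.emotional.append(response.get("feeling", ""))
        return response
```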
This separation is the key contribution. Most user simulators model only taste-driven behavior — they evaluate items based on preference and click on the highest-scored ones. Agent4Rec also models emotion-driven exits and reactions, capturing phenomena that taste-only simulators miss: filter bubble effects, user fatigue, and emotional withdrawal from systems that show repetitive content. Researchers can study causal interventions — changing the recommender algorithm and observing effects on agent populations — without real-user studies, as in the sketch below.
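A hedged sketch of that kind of population-level intervention: run the same agent population under two recommenders and compare proxies for bubble strength and fatigue. The helper names (`recommend_*`, `item.genres`) and the genre-exposure metric are assumptions for illustration, not the paper's protocol.

```python
import statistics

def run_simulation(agents, recommend, n_pages=5, page_size=4):
    """Run one recommender over the agent population; return population-level
    statistics that proxy filter bubble strength and user fatigue."""
    genre_exposure, pages_survived = [], []
    for agent in agents:
        seen_genres = set()
        pages = 0
        for _ in range(n_pages):
            items = recommend(agent, page_size)        # the intervention point
            response = agent.act(items)
            pages += 1
            seen_genres.update(g for item in items for g in item.genres)
            if response.get("exit"):                   # emotion-driven exit (fatigue, boredom)
                break
        genre_exposure.append(len(seen_genres))        # narrower exposure = stronger bubble
        pages_survived.append(pages)
    return {
        "mean_genre_exposure": statistics.mean(genre_exposure),
        "mean_pages_before_exit": statistics.mean(pages_survived),
    }

# Causal intervention: identical agent population, different recommenders.
# baseline = run_simulation(agents, recommend_matrix_factorization)
# treated  = run_simulation(agents, recommend_with_random_exploration)
```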
The methodological claim is that LLM-empowered agents can faithfully simulate real autonomous human behavior in recommendation contexts to a useful degree. The empirical evaluation tests both alignment (do agents match real users' personalized preferences?) and deviation (where do they diverge?), then explores downstream experiments such as emulating filter bubbles and discovering causal relationships in recommendation tasks. The framework generalizes: any domain with rich behavioral data to initialize from can use this kind of agent simulation for counterfactual study.
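One plausible way to make the alignment test concrete (an assumption, not the paper's exact protocol) is to have each agent re-rate held-out items its real counterpart already rated, then measure the gap; `agent.rate` is a hypothetical taste-driven rating action.

```python
def alignment_error(agents, held_out):
    """held_out: {user_id: [(item, true_rating), ...]} from a split of the source dataset."""
    errors = []
    for agent in agents:
        for item, true_rating in held_out.get(agent.profile.user_id, []):
            predicted = agent.rate(item)     # assumed taste-driven rating action
            errors.append(abs(predicted - true_rating))
    return sum(errors) / len(errors)         # mean absolute deviation, in rating points
```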
Source: Recommenders LLMs
Related concepts in this collection
- Can language models simulate belief change in people?
Current LLM social simulators treat behavior as input-output mappings without modeling internal belief formation or revision. Can they be redesigned to actually track how people think and change their minds?
tension with: Agent4Rec is exactly the demographics-in-behavior-out paradigm critiqued; emotion-driven actions add reactive depth but still don't model genuine belief revision
- Can controlled latent variables make LLM user simulators realistic?
Can session-level and turn-level latent variables steer LLM-based user simulators toward realistic dialogue while maintaining measurable diversity and ground truth labels for training conversational systems?
complements: both use LLM simulators for recommendation training data, but Agent4Rec emphasizes population-level filter bubble dynamics while latent-variable simulators emphasize per-conversation controllability
- Why do LLM user simulators fail to track their own goals?
LLM-based user simulators drift away from assigned goals during multi-turn conversations, producing unreliable reward signals for agent training. Understanding this goal misalignment problem is critical because it undermines the entire RL training pipeline.
complements: identifies a specific failure mode that any Agent4Rec-style population simulator inherits
- Do different recommender types shape opinion convergence differently?
Explores whether the mechanism by which products are recommended—buying together versus viewing together—creates distinct patterns in how product ratings converge or diverge across a network.
exemplifies in domain: Agent4Rec is the methodological tool for studying exactly the opinion-convergence dynamics this insight names
- Why don't AI agents develop social structure at scale?
When millions of LLM agents interact continuously on a social platform, do they form collective norms and influence hierarchies like human societies? This tests whether scale and interaction density alone drive socialization.
tension with: Moltbook found agents don't socialize at scale; Agent4Rec claims emotion-driven dynamics produce filter bubbles — the difference may be whether agents face other agents (Moltbook) or a recommender (Agent4Rec)
Original note title: Agent4Rec simulates 1000 generative agents per recommendation scenario — emotion-driven actions emulate filter bubble effects