Recommender Systems

Where do recommendation biases come from in language models?

Do LLM-based recommenders inherit systematic biases from pretraining that differ fundamentally from traditional collaborative filtering systems? Understanding these sources matters for building fairer, more accurate recommendations.

Note · 2026-05-03 · sourced from Recommenders General

The Wu et al. survey identifies three biases that LLM-based recommendation systems exhibit but traditional recommenders don't. These biases are inherited from the underlying language model and propagate into recommendation behavior regardless of how the LLM is integrated.

Position bias: when item candidates are presented as a textual sequence in the prompt, the LLM systematically prefers items appearing earlier in the list, regardless of actual relevance. The bias stems from the language modeling objective: earlier tokens exert disproportionate influence on what the model attends to, so the same candidate set in different orderings produces different recommendations.
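One mitigation discussed in the LLM-ranking literature is to average over presentation orders: prompt the model several times with shuffled candidate lists and aggregate the resulting ranks, so no item benefits from a lucky position. A minimal sketch, assuming a hypothetical llm_rank callable that wraps your prompt-and-parse pipeline:

```python
import random
from collections import defaultdict

def permutation_averaged_ranking(llm_rank, candidates, n_permutations=5, seed=0):
    """Mitigate position bias by averaging ranks over shuffled candidate orders.

    llm_rank: hypothetical callable that takes an ordered list of item ids
              and returns the same ids ranked by the LLM (stands in for your
              actual prompt construction + response parsing).
    """
    rng = random.Random(seed)
    rank_sums = defaultdict(float)
    for _ in range(n_permutations):
        order = candidates[:]
        rng.shuffle(order)               # fresh presentation order each call
        ranked = llm_rank(order)         # the LLM sees a different ordering
        for position, item in enumerate(ranked):
            rank_sums[item] += position  # accumulate rank across runs
    # Lower average rank = consistently preferred, independent of position.
    return sorted(candidates, key=lambda item: rank_sums[item] / n_permutations)
```

The cost is n_permutations LLM calls per request, so this tends to suit short candidate lists or offline evaluation rather than high-traffic serving.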

Popularity bias: the LLM has seen popular items mentioned more frequently in pretraining corpora, so it tends to rank them higher in any recommendation list. This is more pervasive than CF popularity bias because it doesn't come from interaction data; it comes from the world's text. Items famous in news, social media, or product reviews get over-recommended whether or not they're actually relevant. Mitigation is hard because the root cause sits in the pretraining corpus, upstream of any recommendation deployment, though inference-time re-ranking can partially compensate (see the sketch below).
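A cheaper inference-time workaround (a common debiasing pattern, not the survey's prescribed method) is to discount the LLM's relevance score with a popularity prior at re-ranking time. A sketch, where scores, popularity, and alpha are all assumed inputs you would supply:

```python
import math

def popularity_penalized_rerank(scores, popularity, alpha=0.3):
    """Re-rank items by discounting the LLM's score with a popularity prior.

    scores:     dict of item -> LLM relevance score (higher is better)
    popularity: dict of item -> global mention/interaction count (>= 1)
    alpha:      penalty strength; tune on held-out relevance judgments
    """
    adjusted = {
        item: score - alpha * math.log(popularity.get(item, 1))
        for item, score in scores.items()
    }
    # Unknown items get log(1) = 0 penalty; famous items are pushed down.
    return sorted(adjusted, key=adjusted.get, reverse=True)
```

The log penalty mirrors the intuition that popularity effects grow with exposure but saturate: a tenfold jump in mentions costs a fixed amount of score, not ten times as much.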

Fairness bias: pretrained language models exhibit fairness issues tied to sensitive attributes (gender, race, age) that reflect the demographics of their training data. These biases pass through into recommendations, where the LLM may systematically recommend different items to users it perceives as belonging to particular demographic groups.
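One lightweight way to detect this pass-through is a counterfactual audit: hold the user profile fixed, vary only the sensitive attribute in the prompt, and measure how much the top-k recommendations shift. A sketch, assuming a hypothetical recommend wrapper around the LLM recommender:

```python
def counterfactual_fairness_audit(recommend, profile, attribute_values, k=10):
    """Probe whether recommendations shift when only a sensitive attribute changes.

    recommend:        hypothetical callable (user_profile: dict) -> ranked item ids
    profile:          user profile as a dict of fields used to build the prompt
    attribute_values: single-entry dict, e.g. {"gender": ["female", "male"]}
    Returns Jaccard overlap of top-k lists for each pair of attribute values.
    """
    (attr, values), = attribute_values.items()
    top_k = {v: set(recommend({**profile, attr: v})[:k]) for v in values}
    overlaps = {}
    for i, a in enumerate(values):
        for b in values[i + 1:]:
            union = top_k[a] | top_k[b]
            overlaps[(a, b)] = len(top_k[a] & top_k[b]) / len(union) if union else 1.0
    return overlaps  # low overlap flags attribute-sensitive recommendations
```

A low overlap doesn't prove unfairness on its own (some attribute-conditioned personalization may be intended), but it flags where the model's behavior depends on the sensitive attribute and deserves closer review.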

The implication is that LLM-based recommendation isn't just a more capable variant of conventional recommendation — it's a different beast with its own failure modes. Mitigating these biases isn't about adapting CF debiasing techniques; it requires LLM-specific approaches like balanced prompting, popularity-aware decoding, and fairness-conditioned generation. The research community is still working out the specifics.


Source: Recommenders General
