Recommender Systems

Why do the same users rate items differently each time?

User ratings are assumed to be clean preference signals, but do they actually hold steady over time? This matters because recommender systems treat ratings as ground truth, yet temporal inconsistency and individual rating styles may contaminate that signal.

Note · 2026-05-03 · sourced from Recommenders General

The conventional reason recommender systems prefer explicit ratings (star ratings, thumbs up/down) over implicit feedback (clicks, watch time) is that explicit ratings are clean preference data. The user is directly stating "I like this." Amatriain, Pujol, and Oliver's experimental study evaluates this assumption and finds it doesn't hold.

The study has users rate the same items multiple times across spaced sessions. The same user gives substantially different ratings to the same item depending on when they rate. The variation is not just at the noise margin — users sometimes shift by multiple stars on the same item across sessions. The number of stars on a 5-star scale is not a stable property of the user's preference; it depends on mood, context, recently consumed alternatives, and the user's rating style at that moment.

The noise comes from multiple sources. Temporal inconsistency: the user's true preference may have shifted, but more often the rating itself fluctuates around a stable preference. Rater-specific style: some users use the full scale, some use only the top half, and these styles drift. Anchoring effects: a rating depends on what other items the user has recently rated.
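The rater-style source above is the most mechanically correctable of the three. A common remedy (a standard normalization technique, not something the paper prescribes) is per-user z-scoring, which removes each user's mean and spread so a top-half-only rater and a full-scale rater become comparable:

```python
from statistics import mean, pstdev

def normalize_user_ratings(ratings):
    """Z-score one user's ratings: subtract their mean and divide by their
    spread, so only relative preference remains, not scale usage."""
    mu = mean(ratings)
    sigma = pstdev(ratings)
    if sigma == 0:
        return [0.0 for _ in ratings]  # user rates everything identically
    return [(r - mu) / sigma for r in ratings]

# A "top-half" rater and a "full-scale" rater with the same relative ordering
# normalize to the same signal:
print(normalize_user_ratings([4, 5, 4, 5]))  # [-1.0, 1.0, -1.0, 1.0]
print(normalize_user_ratings([1, 5, 1, 5]))  # [-1.0, 1.0, -1.0, 1.0]
```

Note that this only removes stable style; it cannot fix style that drifts between sessions, which is why the temporal-inconsistency source is harder to correct.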

The implication for recommender systems: rating data is preference data plus rating-noise plus rater-style, and conflating them produces biased models. Treating "5 stars" as a categorical labeling of "liked" understates the noise; treating the difference between 4 and 5 stars as meaningful overstates user precision. The paper undermines the cleanliness assumption that justified the field's preference for explicit ratings, which combined with the implicit-feedback availability and self-selection issues elsewhere in the literature, suggests the choice between explicit and implicit signals is more nuanced than the methodological canon admits.
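The preference-plus-style-plus-noise decomposition can be made concrete with a baseline-predictor model — a standard recommender-systems technique, not one taken from the paper — where a rating is modeled as a global mean plus a user bias (absorbing rating style) plus an item bias, and the residual is treated as noise:

```python
def fit_baseline(ratings):
    """Fit r_ui ≈ mu + b_u + b_i by sequential averaging.
    ratings: list of (user, item, rating) triples. The residual
    r_ui - (mu + b_u + b_i) is treated as rating noise."""
    mu = sum(r for _, _, r in ratings) / len(ratings)

    user_devs, item_devs = {}, {}
    for u, _, r in ratings:
        user_devs.setdefault(u, []).append(r - mu)
    b_u = {u: sum(d) / len(d) for u, d in user_devs.items()}  # rating style

    for u, i, r in ratings:
        item_devs.setdefault(i, []).append(r - mu - b_u[u])
    b_i = {i: sum(d) / len(d) for i, d in item_devs.items()}  # item quality

    return mu, b_u, b_i

# Toy data: u1 rates generously, u2 harshly; i1 is the better item.
data = [("u1", "i1", 5), ("u1", "i2", 4), ("u2", "i1", 3), ("u2", "i2", 1)]
mu, b_u, b_i = fit_baseline(data)
print(mu, b_u, b_i)  # b_u separates style from the item signal in b_i
```

Under this model, comparing raw stars across users conflates b_u with b_i; the decomposition is what lets a model avoid that bias.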


Related concepts in this collection

explicit user ratings are noisy — temporal inconsistency and rater idiosyncrasy contaminate the supposed ground truth