How do rating anchors shift meaning within short temporal windows for individual users?

This explores why a single user's rating scale isn't fixed — how 'what a 4 means to me' drifts depending on what I just saw, rated, or was shown, even over hours or days.

A rating *anchor* is the reference point behind a number: '4 stars' is not a stable readout of preference but a comparison against whatever the user just encountered, and that comparison point moves within short windows. The corpus's clearest evidence comes from work showing that the same user gives the same item substantially different ratings across sessions, sometimes shifting by multiple stars (see "Why do the same users rate items differently each time?"). The takeaway worth sitting with: a rating reflects both preference *and* rating behavior at that moment. If you rated three mediocre films last night, tonight's average film looks better. The anchor moved, not the film.
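The film example can be made concrete with a toy model. This is only a sketch under loud assumptions, not a model taken from any of the cited notes: it assumes item quality and ratings share a 1 to 5 scale, that the anchor is the plain mean of recently seen items, and that a rating is the scale midpoint plus the gap between the item and the anchor. The function name `anchored_rating` and its parameters are hypothetical.

```python
from statistics import mean


def anchored_rating(quality, recent_qualities, neutral=3.0, scale=(1, 5)):
    """Toy model: rate an item relative to the mean of recently seen items.

    Assumption (illustrative only): the anchor is the average quality of
    whatever the user encountered recently; with no history it defaults
    to the scale midpoint.
    """
    anchor = mean(recent_qualities) if recent_qualities else neutral
    # The rating encodes the gap between this item and the anchor,
    # not the item's quality on its own.
    raw = neutral + (quality - anchor)
    low, high = scale
    return max(low, min(high, round(raw)))


# The same average film (quality 3.0) after two different evenings.
after_mediocre_films = anchored_rating(3.0, [2.0, 2.0, 2.0])
after_great_films = anchored_rating(3.0, [4.0, 4.0, 4.0])
print(after_mediocre_films, after_great_films)
```

Under these assumptions the film's quality never changes between the two calls; only the history passed in as `recent_qualities` does, which is the sense in which the anchor moved and the film did not.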

The collection lets you trace where that moving anchor comes from. One source is internal and idiosyncratic: temporal inconsistency and personal rater style. Another is social and external: prior ratings visibly shape the ratings that come after them (see "Do online ratings actually reflect independent customer opinions?"). Once you see a 4.5 average, your own '4' carries a different meaning than it would on a blank slate. So the anchor is not only 'what I recently rated' but also 'what I was recently shown,' which the platform controls.

A deeper reframing comes from work that decomposes responses into genuine preferences, non-attitudes, and *constructed* preferences, distinguishable by how consistent they stay across measurement conditions (see "Do all annotation responses measure the same underlying thing?"). This is the conceptual engine behind anchor shift: a constructed preference is, by definition, built in the moment from available cues. The meaning of an anchor moves within a short window because, for many items, the user never had a stable preference to begin with; they assemble one on the spot, and the spot keeps changing.

Laterally, the corpus suggests these per-user shifts don't stay private. Because recommendation feeds decide what is shown together, they shape both who rates a product and what they compare it against (see "Do different recommender types shape opinion convergence differently?"). Frequently-bought-together networks pull different audiences with different baselines than co-viewed ones, so the platform is effectively setting different anchors for different people. At scale, these individual anchor drifts compound into feed-level effects, where ratings become persuasion infrastructure rather than neutral measurement (see "How do recommendation feeds shape what people see and believe?").

The thing you may not have known you wanted to know: the unreliability of a single user's ratings over short windows is not noise to be averaged away. It is a signal that preference is partly *manufactured at rating time*, by recent memory and by what the system chose to surface. Systems that model users as multiple context-dependent personas rather than one fixed taste vector (see "Can attention mechanisms reveal which user taste explains each recommendation?") are, in effect, building for exactly this: the same person means different things by the same number depending on which version of themselves is rating.
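To make the persona idea concrete, here is a minimal sketch of weighting several persona vectors by the candidate item. It is a generic softmax-attention illustration, not the AMP-CF architecture itself: the helper names (`softmax`, `dot`, `persona_score`), the two-dimensional vectors, and the use of dot-product affinity are all assumptions made for the example.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def persona_score(personas, item):
    """Score an item for a user represented as several persona vectors.

    Assumption (illustrative only): each persona's affinity is its dot
    product with the item, and the attention weights are a softmax over
    those affinities, so the weights depend on the candidate item.
    """
    affinities = [dot(p, item) for p in personas]
    weights = softmax(affinities)
    score = sum(w * a for w, a in zip(weights, affinities))
    return score, weights


# Hypothetical user with two personas, one per taste dimension.
personas = [[1.0, 0.0], [0.0, 1.0]]
score_a, weights_a = persona_score(personas, [2.0, 0.0])
score_b, weights_b = persona_score(personas, [0.0, 2.0])
print(weights_a, weights_b)
```

In this sketch the same user is dominated by a different persona depending on the item, which mirrors the claim that the same number can come from a different version of the rater.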

Sources (6 notes)

Why do the same users rate items differently each time?

Amatriain et al. found that the same user gives substantially different ratings to the same item across sessions, shifting by multiple stars. This noise stems from temporal inconsistency, rater-specific biases, and anchoring effects—making ratings reflect both preference and rating-behavior rather than stable preference alone.

Do online ratings actually reflect independent customer opinions?

Moe and Trusov decomposed ratings into baseline quality, social-dynamics influence, and error, finding that prior ratings meaningfully affect subsequent ones. These effects have both immediate sales impact and long-term compounding effects through future ratings, though high opinion variance can eventually dampen the distortion.

Do all annotation responses measure the same underlying thing?

Behavioral science reveals that annotations contain genuine preferences, non-attitudes, and constructed preferences—distinguishable by consistency across measurement conditions. Treating them uniformly contaminates reward model training and downstream alignment.

Do different recommender types shape opinion convergence differently?

Research shows that frequently-bought-together and co-viewed recommendation networks produce different opinion convergence patterns. The mechanism: each recommender type attracts different audience segments with different prior expectations, shaping both who sees products together and how they rate them.

How do recommendation feeds shape what people see and believe?

Research shows recommendation systems operate as political actors: feed weights influence producer behavior, network topology drives opinion convergence, and automation enables targeted persuasion at population scale. These effects compound through rating contamination and selection biases.

Can attention mechanisms reveal which user taste explains each recommendation?

AMP-CF represents each user as multiple latent personas weighted dynamically by candidate item. This makes recommendations both diverse and interpretable—each suggestion traces to the specific persona preference it satisfies—without requiring post-hoc reranking.
