Do different recommendation datasets converge toward the same popular items over time?

This reads the question as being about the popularity feedback loop — whether recommenders, whatever data they start from, tend to drift toward the same small set of popular items over time, and what in the machinery causes (or counters) that pull.

This explores convergence toward popularity as a *force* rather than a property of any one dataset: the corpus doesn't directly race datasets against each other, but it does pin down why systems built on different data so often end up surfacing the same crowd-pleasers. The short version: convergence isn't inevitable — it's a side effect of specific design choices, and several of them are quietly baked into standard recommenders.

The most direct culprit is dimensionality. When user and item embeddings are too small, a model can't represent niche taste cheaply, so it overfits toward popular items to keep ranking scores high. This compounds, because under-exposed niche items gather even less interaction data next round, feeding the next training cycle (see "Does embedding dimensionality secretly drive popularity bias in recommenders?"). A related amplifier lives in the embedding tables themselves: real interaction data is power-law distributed, so hash collisions and fixed-size tables land hardest on the highest-frequency users and items, and the distortion accumulates as new IDs keep arriving (see "Why do hash collisions hurt recommendation models so much?"). Different datasets share the same power-law shape, which is exactly why they tend to drift in the same direction.
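One way to see the collision effect is a toy simulation. The sketch below assumes a Zipf-like popularity curve and a uniformly random hash into a fixed-size table; the function name `collided_traffic_share`, the exponent, and all sizes are hypothetical choices for illustration, not taken from Monolith. A random hash collides head and tail IDs at the same rate, but head IDs carry most of the traffic, so the quantity worth tracking is the share of interactions that flow through a slot shared by more than one ID.

```python
import random
from collections import Counter

def collided_traffic_share(n_ids, table_size, exponent=1.0, seed=0):
    """Share of interaction traffic that hits an embedding slot shared by 2+ IDs.

    Assumptions (illustrative only): the ID of popularity rank r receives
    traffic proportional to 1 / r**exponent, and each ID is hashed to a
    uniformly random slot of a fixed-size table.
    """
    rng = random.Random(seed)
    # Hypothetical hash: one random slot per ID.
    slots = [rng.randrange(table_size) for _ in range(n_ids)]
    load = Counter(slots)  # how many IDs landed in each slot
    weights = [1.0 / (rank ** exponent) for rank in range(1, n_ids + 1)]
    # Traffic belonging to IDs whose slot is shared with at least one other ID.
    collided = sum(w for w, s in zip(weights, slots) if load[s] > 1)
    return collided / sum(weights)

# Keep the table fixed while the catalog of IDs keeps growing.
for n_ids in (1_000, 10_000, 50_000):
    share = collided_traffic_share(n_ids, table_size=10_000)
    print(f"{n_ids:>6} IDs -> {share:.3f} of traffic touches a shared slot")
```

Under these assumptions the shared-slot traffic share climbs as the catalog outgrows the table, which mirrors the claim that the distortion accumulates as new IDs arrive.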

But convergence-to-popular is also an artifact of the objective, not just the data. Steck's calibration work shows that simply ranking by per-item relevance naturally produces lists dominated by a user's *primary* interest even when their history clearly documents secondary ones: accuracy optimization crowds out minority interests unless you explicitly enforce proportional representation (see "Do accuracy-optimized recommendations preserve user interest diversity?"). So the pull toward a narrow set isn't only about which items are globally popular; it's the same greedy-ranking instinct operating at the level of each individual user.
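The calibration idea can be sketched as a greedy reranker. What follows is a simplified illustration in the spirit of Steck's calibrated recommendations, not a reproduction of the published method: it assumes each item belongs to exactly one genre, trades relevance against a smoothed KL divergence between the user's historical genre mix and the list's genre mix, and uses hypothetical names (`calibrated_rerank`, `lam`, `alpha`) and made-up scores.

```python
import math

def kl_divergence(target, actual, alpha=0.01):
    """KL(target || actual), with actual smoothed toward target to avoid log(0)."""
    total = 0.0
    for genre, p in target.items():
        if p > 0:
            q = (1 - alpha) * actual.get(genre, 0.0) + alpha * p
            total += p * math.log(p / q)
    return total

def genre_distribution(items, genre_of):
    """Fraction of the (non-empty) list that falls in each genre."""
    counts = {}
    for item in items:
        counts[genre_of[item]] = counts.get(genre_of[item], 0) + 1
    return {genre: n / len(items) for genre, n in counts.items()}

def calibrated_rerank(scores, genre_of, target, k, lam=0.5):
    """Greedily build a top-k list maximising
    (1 - lam) * total relevance - lam * KL(target || list genre mix)."""
    chosen, candidates = [], set(scores)
    while candidates and len(chosen) < k:
        def objective(item):
            trial = chosen + [item]
            relevance = sum(scores[i] for i in trial)
            penalty = kl_divergence(target, genre_distribution(trial, genre_of))
            return (1 - lam) * relevance - lam * penalty
        best = max(sorted(candidates), key=objective)  # sorted() for stable ties
        chosen.append(best)
        candidates.remove(best)
    return chosen

# Toy user: 70% action, 30% romance in their history (made-up numbers).
scores = {"a1": 0.9, "a2": 0.85, "a3": 0.8, "a4": 0.75, "r1": 0.6, "r2": 0.55}
genre_of = {i: ("action" if i.startswith("a") else "romance") for i in scores}
target = {"action": 0.7, "romance": 0.3}

plain = sorted(scores, key=scores.get, reverse=True)[:4]
reranked = calibrated_rerank(scores, genre_of, target, k=4)
print("by relevance:", plain)
print("calibrated:  ", reranked)
```

In this toy setup, pure relevance ranking fills the list with action titles, while the calibrated list gives up one of them for a romance title. Setting `lam=0` recovers plain relevance ranking, so `lam` acts as the dial between accuracy and proportional representation.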

The more surprising thread is that convergence is *steerable*, and the recommender's structure decides the outcome. One study found that the type of recommendation network changes whether connected products' ratings converge or diverge: "frequently bought together" and "co-viewed" graphs route different audiences to the same items and produce genuinely different convergence patterns (see "Do different recommender types shape opinion convergence differently?"). And convergence can be deliberately broken: social recommenders that lean on friends with *different* tastes (rather than pulling similar users together) push people toward anomalous, off-distribution choices instead of the popular center (see "Can friends with different tastes improve recommendations?").

So the honest answer the corpus points to: yes, there's a strong shared gravity toward popular items, and it comes from forces — power-law data, cramped embeddings, greedy relevance ranking — that operate the same way across datasets. The interesting part is that none of them are laws of nature. Treat dimensionality as a fairness knob, add calibration, or wire in diversity-bearing signal, and two systems on different data need not collapse onto the same bestseller list.

Sources (5 notes)

Does embedding dimensionality secretly drive popularity bias in recommenders?

Research shows that when user/item embedding dimensions are too small, recommender systems overfit toward popular items to maximize ranking quality. This compounds over time as niche items receive insufficient exposure, and cannot be fixed post-hoc without treating dimensionality as a fairness hyperparameter.

Why do hash collisions hurt recommendation models so much?

Monolith's empirical work shows that ID frequencies in real recommendation systems are power-law distributed, so collisions accumulate precisely on the entities the model most needs to represent accurately. Fixed-size hashed tables worsen this over time as new IDs arrive.

Do accuracy-optimized recommendations preserve user interest diversity?

Steck's research shows that ranking by per-item relevance naturally produces lists dominated by a user's primary interest, even when they have documented secondary interests. Enforcing calibration via post-hoc reranking restores proportional representation without sacrificing overall accuracy.

Do different recommender types shape opinion convergence differently?

Research shows that frequently-bought-together and co-viewed recommendation networks produce different opinion convergence patterns. The mechanism: each recommender type attracts different audience segments with different prior expectations, shaping both who sees products together and how they rate them.

Can friends with different tastes improve recommendations?

Social Poisson Factorization uses friends' diverse tastes to recommend items outside users' usual preferences, outperforming methods that pull friends' representations together. Networks add value through influence on anomalous choices, not taste similarity.
