How can recommendation models handle per-user concept drift instead of global drift?

This explores why recommenders should track how each individual's tastes shift on their own timeline, rather than detecting one population-wide trend — and what techniques the corpus offers for doing that.

The corpus's clearest answer is that global drift detection is simply the wrong unit of analysis: people change their minds at different times and for different reasons, so a population-level 'concept drift' signal averages away the very thing you care about. The fix is per-user temporal modeling that holds onto durable long-term signals while discounting passing noise (see: Why do global concept drift methods fail for recommender systems?). But once you accept that, an interesting question follows: what *kind* of per-user change are you modeling? Not all drift is a one-way street.
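One way to picture "hold onto durable signals, discount passing noise" is a pair of moving averages per user, one slow and one fast. The sketch below is an illustration under assumptions, not a method taken from the corpus: the class name `UserTasteTracker`, the two step sizes, the `short_weight` blend, and the toy two-dimensional item features are all hypothetical.

```python
class UserTasteTracker:
    """Two exponential moving averages over item feature vectors for ONE user."""

    def __init__(self, dim, slow_rate=0.02, fast_rate=0.5):
        self.slow_rate = slow_rate  # small step: durable, long-term taste
        self.fast_rate = fast_rate  # large step: follows the last few events
        self.long_term = [0.0] * dim
        self.short_term = [0.0] * dim

    def update(self, item_vec):
        # Move both profiles toward the consumed item, at different speeds.
        for i, x in enumerate(item_vec):
            self.long_term[i] += self.slow_rate * (x - self.long_term[i])
            self.short_term[i] += self.fast_rate * (x - self.short_term[i])

    def score(self, item_vec, short_weight=0.3):
        # Blend both profiles; short_weight caps how much recent noise can matter.
        return sum(
            ((1 - short_weight) * lt + short_weight * st) * x
            for lt, st, x in zip(self.long_term, self.short_term, item_vec)
        )


JAZZ, POP = [1.0, 0.0], [0.0, 1.0]  # toy item features (assumed for the demo)

trackers = {}  # one tracker per user, so each user drifts on their own clock
tracker = trackers.setdefault("user_a", UserTasteTracker(dim=2))
for _ in range(200):
    tracker.update(JAZZ)  # a long-standing habit
for _ in range(3):
    tracker.update(POP)   # a brief detour

# The short-term profile has swung to pop, but the blended score still favors jazz.
print(tracker.short_term, tracker.score(JAZZ) > tracker.score(POP))
```

Because every user owns a separate tracker, nothing here depends on what the rest of the population is doing, which is the point of moving away from a global drift signal.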

A surprising amount of what looks like 'drift' is actually rhythm. Instead of detecting a change-point and declaring the user different now, you can treat time itself as a context dimension: a hypernetwork conditioned on time-of-period regenerates a user's preference parameters so that matching time slots (this weekday evening, this weekend) retrieve matching tastes rather than being read as fresh evidence of change (see: Why do recommendation systems miss recurring user preference patterns?). This reframes the whole problem: much per-user 'drift' is recurrence the model failed to recognize.
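The "time as a context dimension" idea can be sketched with a toy hypernetwork whose only input is the position within a period. This is not HyperBandit's architecture or training procedure; it is a minimal sketch assuming a weekly period, a sine/cosine encoding of the time slot, and a single untrained linear layer with random weights. The names `time_features` and `TinyHypernetwork` are hypothetical.

```python
import math
import random

PERIOD_HOURS = 24 * 7  # assumption: preferences recur on a weekly cycle


def time_features(hour_index):
    """Encode position within the week; timestamps one period apart get identical features."""
    slot = hour_index % PERIOD_HOURS
    weekly = 2 * math.pi * slot / PERIOD_HOURS
    daily = 2 * math.pi * (slot % 24) / 24
    return [1.0, math.sin(weekly), math.cos(weekly), math.sin(daily), math.cos(daily)]


class TinyHypernetwork:
    """Maps time features to one user's preference vector (weights would normally be learned)."""

    def __init__(self, pref_dim, seed=0):
        rng = random.Random(seed)
        n_feat = len(time_features(0))
        self.weights = [[rng.gauss(0, 1) for _ in range(n_feat)] for _ in range(pref_dim)]

    def preference_params(self, hour_index):
        feats = time_features(hour_index)
        return [sum(w * f for w, f in zip(row, feats)) for row in self.weights]


net = TinyHypernetwork(pref_dim=3)
tuesday_evening = 1 * 24 + 19                # day 1, hour 19 (day 0 taken as Monday)
same_slot_next_week = tuesday_evening + PERIOD_HOURS
saturday_morning = 5 * 24 + 9

# Same slot a week later regenerates the same parameters; a different slot does not.
print(net.preference_params(tuesday_evening) == net.preference_params(same_slot_next_week))
print(net.preference_params(tuesday_evening) == net.preference_params(saturday_morning))
```

A change-point detector would see the Tuesday-to-Saturday difference as evidence of drift; here it is just two lookups into the same periodic function.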

The corpus also questions the premise that a user *has* a single preference that drifts at all. If you represent each person as several latent personas weighted by the candidate item, then what looks like temporal instability may just be different personas activating in different contexts. You also get interpretable recommendations for free, since each suggestion traces to the persona it satisfies (see: Can modeling multiple user personas improve recommendation accuracy?; Can attention mechanisms reveal which user taste explains each recommendation?). Drift across a monolithic vector and switching between stable personas are two very different stories that can produce the same surface behavior.
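The persona idea can be made concrete with a few lines of candidate-conditioned attention. This is a rough sketch and not AMP-CF's exact formulation: it assumes dot-product attention with a softmax, and the persona vectors, item vectors, and the function name `persona_score` are made up for illustration.

```python
import math


def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def persona_score(personas, item_vec):
    """Return (score, attention weights) for one user and one candidate item."""
    # Each persona is weighted by how well it matches THIS candidate.
    weights = softmax([dot(p, item_vec) for p in personas])
    # The user representation is rebuilt per candidate as a weighted mix of personas.
    user_vec = [
        sum(w * p[i] for w, p in zip(weights, personas))
        for i in range(len(item_vec))
    ]
    return dot(user_vec, item_vec), weights


# Hypothetical user with two stable personas: a jazz listener and a pop listener.
personas = [[3.0, 0.0], [0.0, 3.0]]
jazz_item, pop_item = [1.0, 0.0], [0.0, 1.0]

jazz_score, jazz_weights = persona_score(personas, jazz_item)
pop_score, pop_weights = persona_score(personas, pop_item)

# The attention weights double as the explanation: which persona drove each score.
print(jazz_weights.index(max(jazz_weights)), pop_weights.index(max(pop_weights)))
```

Neither persona vector changes between the two calls, yet the user looks like a jazz fan for one candidate and a pop fan for the other, which is how stable personas can be mistaken for drift.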

On the mechanics of *learning* per-user change without wrecking what you already knew, the corpus splits into two camps. One uses parameter isolation: dynamically expandable graph convolution gives each new task its own parameters, preserving old patterns exactly while new ones capture emerging preferences. That gives explicit control over the stability-plasticity trade-off that replay and distillation can't match (see: Can model isolation solve streaming recommendation better than replay?). The other personalizes at inference time instead of retraining: a small number of adaptive questions infers a user's coefficients over shared base reward functions, so you adapt to the individual without touching weights (see: Can user preferences be learned from just ten questions?). A quieter but important constraint sits underneath all of this: per-user fidelity depends on per-user identity surviving the embedding table, and hash collisions concentrate precisely on the high-frequency users you most need to model accurately (see: Why do hash collisions hurt recommendation models so much?).
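The parameter-isolation camp can be illustrated with a deliberately small stand-in. The sketch below is not DEGC and involves no graph convolution; it assumes a linear scorer in which each new time window ("task") gets its own weight block, earlier blocks are frozen, and the new block learns a correction on top of them. `ExpandableLinearModel` and the toy data are hypothetical.

```python
class ExpandableLinearModel:
    """Linear scorer that grows one weight block per task; only the newest block trains."""

    def __init__(self, dim):
        self.dim = dim
        self.blocks = []  # blocks[t] holds the parameters added for task t

    def add_task(self):
        self.blocks.append([0.0] * self.dim)

    def predict(self, x, upto=None):
        # upto=k replays the model as it stood after k tasks (older behavior, exactly).
        blocks = self.blocks if upto is None else self.blocks[:upto]
        return sum(w * xi for block in blocks for w, xi in zip(block, x))

    def fit_newest(self, data, lr=0.1, epochs=50):
        newest = self.blocks[-1]  # earlier blocks are never written to
        for _ in range(epochs):
            for x, y in data:
                err = self.predict(x) - y
                for i, xi in enumerate(x):
                    newest[i] -= lr * err * xi


JAZZ, POP = [1.0, 0.0], [0.0, 1.0]  # toy item features (assumed for the demo)

model = ExpandableLinearModel(dim=2)
model.add_task()
model.fit_newest([(JAZZ, 1.0), (POP, 0.0)])   # window 1: the user likes jazz
old_block = list(model.blocks[0])
old_jazz = model.predict(JAZZ, upto=1)

model.add_task()
model.fit_newest([(JAZZ, 0.2), (POP, 1.0)])   # window 2: taste moves toward pop

# Old parameters and old predictions are untouched; the new block absorbs the change.
print(model.blocks[0] == old_block, model.predict(JAZZ, upto=1) == old_jazz)
print(round(model.predict(JAZZ), 2), round(model.predict(POP), 2))
```

The stability-plasticity control is explicit here: stability comes from never writing to earlier blocks, and plasticity is bounded by how much capacity each new block is given. Replay or distillation would instead update shared weights and only approximately preserve the old behavior.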

The thing worth carrying away: 'per-user concept drift' is really three distinct problems wearing one name — genuine preference shift, recurring periodic taste, and multiple stable personas that take turns. The corpus suggests you can't handle drift well until you decide which of these you're actually seeing, because each demands a different tool.

Sources 7 notes

Why do global concept drift methods fail for recommender systems?

User preferences shift on individual timescales for individual reasons, making population-level drift detection ineffective. Per-user temporal modeling that preserves long-term signals while discounting transient noise is required.

Why do recommendation systems miss recurring user preference patterns?

HyperBandit conditions a hypernetwork on time-of-period to generate user preference parameters, capturing weekly and daily cycles that change-point detection misses. This treats time itself as a context dimension, so matching time periods retrieve matching preference functions rather than treating each period as novel evidence.

Can modeling multiple user personas improve recommendation accuracy?

AMP-CF separates user representation into latent personas weighted by attention to the candidate item. This candidate-conditional approach improves accuracy by adapting the user representation at prediction time and produces inherent explanations for why items were recommended.

Can attention mechanisms reveal which user taste explains each recommendation?

AMP-CF represents each user as multiple latent personas weighted dynamically by candidate item. This makes recommendations both diverse and interpretable—each suggestion traces to the specific persona preference it satisfies—without requiring post-hoc reranking.

Can model isolation solve streaming recommendation better than replay?

DEGC uses per-task parameter isolation to handle streaming recommendation, providing explicit stability-plasticity trade-offs that experience replay and knowledge distillation methods cannot match. This approach preserves older patterns exactly while allowing new parameters to capture emerging preferences.

Can user preferences be learned from just ten questions?

PReF learns base reward functions from preference data, then uses active learning to select maximally informative questions that reduce uncertainty about a user's coefficients. The model can then be personalized to each user through inference-time reward alignment, without modifying weights.

Why do hash collisions hurt recommendation models so much?

Monolith's empirical work shows that ID frequencies in real recommendation systems follow a power law, so collisions accumulate precisely on the entities that models most need to represent accurately. Fixed-size hashed tables worsen this over time as new IDs arrive.
