Recommender Systems

Can neural networks explore efficiently at recommendation scale?

Exploration—discovering unknown user preferences—normally requires expensive posterior uncertainty estimates. Can a neural architecture make Thompson sampling practical for real-world recommenders without prohibitive computational cost?

Note · 2026-05-03 · sourced from Recommenders Architectures

Supervised neural networks form the backbone of most recommenders, but they only exploit recognized user interests. Discovering unknown user preferences requires exploration — and the standard exploration framework (contextual bandits with Thompson sampling) requires posterior uncertainty estimates, which are computationally prohibitive for large neural networks at recommendation scale.
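To make the framework concrete, here is a minimal Thompson sampling loop for a linear contextual bandit (a sketch with made-up dimensions, not the paper's setup). The point is the last two lines: for a linear model the posterior update is a cheap closed form, but for a large neural network there is no tractable equivalent.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_arms, horizon = 8, 5, 2000

# Hypothetical environment: a hidden linear preference vector.
theta_true = rng.normal(size=d)

# Conjugate Gaussian posterior over theta: N(A^-1 b, A^-1),
# assuming unit observation noise for simplicity.
A = np.eye(d)          # prior precision
b = np.zeros(d)

for t in range(horizon):
    contexts = rng.normal(size=(n_arms, d))   # per-round item features
    # Thompson sampling: draw one parameter setting from the posterior,
    cov = np.linalg.inv(A)
    theta = rng.multivariate_normal(cov @ b, cov)
    # act greedily under the sampled parameters,
    arm = int(np.argmax(contexts @ theta))
    reward = contexts[arm] @ theta_true + rng.normal(scale=0.1)
    # and fold the observation back into the posterior.
    A += np.outer(contexts[arm], contexts[arm])
    b += reward * contexts[arm]
```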

Zhu et al. propose Epistemic Neural Recommendation (ENR), an epistemic neural network architecture designed to make Thompson sampling tractable at scale. Epistemic neural networks separate aleatoric uncertainty (irreducible noise in outcomes) from epistemic uncertainty (uncertainty about the model's parameters). The latter is what Thompson sampling needs: sample a parameter setting from the posterior, act greedily under that setting, observe outcomes, update the posterior.
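How an epistemic network supplies that posterior sample without maintaining a distribution over all weights: the sketch below follows the general epinet construction (a base network plus a small network conditioned on a random "epistemic index" z), which is one way to realize an epistemic neural network. ENR's concrete architecture is tailored to recommendation and differs in detail; all names and dimensions here are illustrative.

```python
import torch
import torch.nn as nn

class EpistemicNet(nn.Module):
    """Base network plus a small 'epinet' conditioned on a random index z.

    Sampling z plays the role of sampling parameters from a posterior:
    each z induces one plausible reward function.
    """
    def __init__(self, d_in, d_hidden=64, d_index=8):
        super().__init__()
        self.base = nn.Sequential(nn.Linear(d_in, d_hidden), nn.ReLU())
        self.head = nn.Linear(d_hidden, 1)
        self.d_index = d_index
        # Epinet: maps (features, index) to an additive correction.
        self.epinet = nn.Sequential(
            nn.Linear(d_hidden + d_index, d_hidden), nn.ReLU(),
            nn.Linear(d_hidden, 1),
        )

    def forward(self, x, z):
        h = self.base(x)
        mean = self.head(h)
        # Detach features entering the epinet so training the epistemic
        # part doesn't distort the base representation.
        correction = self.epinet(
            torch.cat([h.detach(), z.expand(len(x), -1)], dim=-1))
        return mean + correction

# One Thompson sampling step: one index draw ~ one posterior sample.
net = EpistemicNet(d_in=16)
x_candidates = torch.randn(50, 16)   # candidate items' features
z = torch.randn(1, net.d_index)      # sample one epistemic index
scores = net(x_candidates, z).squeeze(-1)
chosen = int(scores.argmax())        # act greedily under the sample
```

Drawing a fresh z per recommendation step stands in for drawing fresh weights from a posterior, which keeps the per-step cost close to a single forward pass.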

Empirically, ENR boosts click-through rates and user ratings by at least 9% and 6% respectively over state-of-the-art neural contextual bandit algorithms, and it matches the best-performing baseline with at least 29% fewer user interactions. Computationally, it demands orders of magnitude fewer resources than the other neural contextual bandit approaches, moving Thompson-sampling-based exploration from research-only to production-feasible.

The general principle: when a Bayesian technique seems too expensive at scale, ask whether the expensive part is genuinely necessary or whether a structural approximation captures what's needed. Epistemic networks make a focused commitment to estimating only the parameter uncertainty Thompson sampling actually uses, dropping the rest. The architectural simplification is what unlocks scale.


Source: Recommenders Architectures

Original note: scalable neural contextual bandits enable sample-efficient exploration via epistemic neural networks supporting Thompson sampling at scale