Why do embedding tables need to grow elastically over time?
This explores why production recommendation systems can't just allocate a fixed embedding table up front — why the table has to keep expanding as the system runs.
This is really a question about what happens to a recommendation system *over time*, not at any single snapshot. The core driver is simple but unforgiving: new IDs keep arriving. Every new user, video, ad, or product is a new entity that needs its own embedding, and the stream never stops. A fixed-size table forces all those new arrivals to share slots with existing ones through hashing, and that's where the damage compounds. The note "Why do hash collisions hurt recommendation models so much?" shows, from Monolith's production work, that a fixed hashed table doesn't just have collisions: its collision rate gets *worse* over time as the number of IDs grows past the slots available.
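To make the "gets worse over time" point concrete, here is a minimal simulation sketch. It is not taken from Monolith; the function names, the `item_N` ID scheme, the 10,000-slot table, and the use of MD5 as a stable hash are all assumptions chosen for illustration. It measures what fraction of IDs share a slot with at least one other ID as more IDs arrive at a table whose size never changes.

```python
import hashlib


def slot_for(entity_id: str, num_slots: int) -> int:
    """Map an ID to a slot in a fixed-size table using a stable hash."""
    digest = hashlib.md5(entity_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_slots


def colliding_fraction(num_ids: int, num_slots: int) -> float:
    """Fraction of IDs that share their slot with at least one other ID."""
    counts = {}
    for i in range(num_ids):
        slot = slot_for(f"item_{i}", num_slots)  # hypothetical ID naming
        counts[slot] = counts.get(slot, 0) + 1
    # Every ID sitting in a slot with occupancy > 1 is in a collision.
    shared = sum(c for c in counts.values() if c > 1)
    return shared / num_ids


NUM_SLOTS = 10_000  # assumed table size; it stays fixed while IDs keep arriving
for n in (1_000, 10_000, 50_000):
    print(f"{n:>6} IDs -> {colliding_fraction(n, NUM_SLOTS):.3f} of IDs collide")
```

Once the number of IDs passes the number of slots, sharing is unavoidable by the pigeonhole principle, and the colliding fraction keeps climbing toward 1 as arrivals continue.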
What makes this more than a sizing nuisance is *where* the collisions land. Real recommendation IDs follow a power-law distribution, not a uniform one: a small number of users and items account for most of the traffic. The note "Do hash collisions really harm popular recommendation items?" makes the sharp point that collisions concentrate precisely on the high-frequency entities, so a fixed table degrades quality exactly where the system handles the most traffic and can least afford it. Elastic growth is the escape: give each ID a row of its own so the popular entities the model leans on stay clean.
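One way to picture "room of its own" is a table keyed by the ID itself rather than by a hash bucket. The sketch below is a toy, assuming a plain Python dict and randomly initialised rows; the class name `ElasticEmbeddingTable` and its methods are hypothetical and do not describe how Monolith or any other production system implements this.

```python
import random


class ElasticEmbeddingTable:
    """Hypothetical dict-backed table: one row per ID, allocated on first sight."""

    def __init__(self, dim: int, seed: int = 0):
        self.dim = dim
        self._rows = {}                   # entity ID -> embedding row
        self._rng = random.Random(seed)

    def lookup(self, entity_id: str) -> list:
        """Return the row for this ID, creating it if the ID is new."""
        if entity_id not in self._rows:
            # A new ID gets a fresh row instead of sharing an existing slot.
            self._rows[entity_id] = [
                self._rng.gauss(0.0, 0.01) for _ in range(self.dim)
            ]
        return self._rows[entity_id]

    def __len__(self) -> int:
        return len(self._rows)


table = ElasticEmbeddingTable(dim=8)
for entity_id in ["user_1", "video_7", "user_1", "ad_42"]:
    table.lookup(entity_id)
print(len(table))  # 3 distinct IDs have been seen, so the table holds 3 rows
```

The table's row count tracks the number of distinct IDs seen, so no two entities ever train the same row. The cost is that memory now grows with the ID stream; a real system would likely need some policy for bounding it, which is outside this sketch.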
It's worth seeing this as one of three different 'capacity runs out' stories the corpus tells, because they're often confused. Growing the *number of rows* (elastic tables) is about covering an expanding set of entities. That's separate from the *width* of each embedding: the note "Do embedding dimensions fundamentally limit retrievable document combinations?" proves via communication-complexity theory that embedding dimension caps how many top-k result combinations you can ever return, a ceiling no amount of table growth fixes. And it's separate again from how you *use* a vector: the note "How can user vectors capture diverse interests without exploding in size?" shows Deep Interest Network sidestepping the limits of a fixed-length user vector not by enlarging it but by activating only the relevant past behaviors per candidate.
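The third story, activating only the relevant past behaviors per candidate, can be sketched as candidate-conditioned pooling. This is a simplified illustration, not Deep Interest Network's actual scoring: the relevance score here is a plain dot product normalised with a softmax, which is an assumption made to keep the example short, whereas the original model learns its scoring function. The function names and the toy 2-dimensional embeddings are likewise hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def interest_vector(behaviors, candidate):
    """Pool past-behavior embeddings, weighted by relevance to the candidate."""
    scores = [dot(b, candidate) for b in behaviors]   # assumed scoring: dot product
    peak = max(scores)                                # subtract max for stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]               # softmax over the history
    dim = len(candidate)
    pooled = [
        sum(w * b[i] for w, b in zip(weights, behaviors)) for i in range(dim)
    ]
    return pooled, weights


# Toy history: two behaviors near axis 0, one near axis 1.
history = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
for candidate in ([1.0, 0.0], [0.0, 1.0]):
    pooled, weights = interest_vector(history, candidate)
    print(candidate, [round(w, 3) for w in weights])
```

The pooled vector has the same length no matter how long the history is; what changes per candidate is which behaviors dominate it. That is why this addresses the "one vector, many interests" problem without adding rows or dimensions.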
The thing worth walking away with: 'the table is too small' can mean three unrelated problems — too few rows for the entities (fix by growing elastically), too few dimensions per row (a hard mathematical limit), or a single vector trying to express too many interests at once (fix with dynamic attention, not size). Elastic growth only answers the first. Reach for it when your ID space is open-ended and power-law shaped — and don't expect it to rescue you from the other two.
Sources (4 notes)
Monolith's empirical work shows that ID frequencies in real recommendation systems follow a power law, so collisions accumulate precisely on the entities whose embeddings the model most needs to be accurate. Fixed-size hashed tables worsen this over time as new IDs arrive.
Real recommendation IDs follow power-law distributions, not uniform ones. High-frequency users and items collide more often, degrading model quality exactly where traffic is highest, making fixed-size hash tables inadequate for production systems.
Communication complexity theory proves that for any embedding dimension d, there exists a maximum number of top-k document combinations that can be returned as results. Even embeddings optimized directly on test data hit this polynomial limit, demonstrated on trivially simple retrieval tasks.
Deep Interest Network weights historical behaviors against each candidate ad, activating only relevant interests dynamically. This preserves dimension efficiency while expressing diverse tastes without lossy compression.