Can user embeddings personalize language models more efficiently than prompts?

Does distilling user interaction history into learned embeddings outperform stuffing that history directly into prompts for personalizing large language models? This matters because interaction data is long and expensive to process as tokens.

Synthesis note · 2026-06-03 · sourced from Design Frameworks

The standard way to personalize an LLM is to put a user's history into the prompt as text, but interaction data is long, noisy, and expensive to process as tokens. User-LLM instead distills user embeddings from diverse interactions (search, media consumption, navigation, social activity) via self-supervised pretraining, so that the embeddings capture latent preferences and how they evolve over time. It then integrates those embeddings into the LLM through cross-attention and soft-prompting, with Perceiver layers used to streamline that integration. Across MovieLens, Amazon Review, and Google Local Review, it outperforms text-prompt-based contextualization, especially on long-sequence tasks and tasks requiring deep user understanding, while being more computationally efficient and preserving the LLM's general knowledge.

The keeper is the representational choice: compress a user's behavioral history into learned embeddings the model attends to, rather than serializing it as prompt text. The embedding route both scales to long histories and captures temporal preference drift that flat prompt-stuffing loses.
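
To make the compression idea concrete, here is a minimal sketch of Perceiver-style pooling: a fixed number of latent query vectors cross-attend over a history of event embeddings, so the summary handed to the LLM stays the same size however long the history is. This is an illustration under stated assumptions, not User-LLM's implementation: the random vectors stand in for learned event embeddings and learned latents, there are no learned projection matrices, and the names `cross_attend`, `NUM_LATENTS`, and `DIM` are hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(queries, keys_values):
    """Single-head scaled dot-product cross-attention.

    Assumption: keys and values are the same vectors and no learned
    projections are applied, which keeps the sketch short.
    Returns one output vector per query.
    """
    dim = len(queries[0])
    scale = math.sqrt(dim)
    outputs = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, kv)) / scale
                  for kv in keys_values]
        weights = softmax(scores)
        outputs.append([
            sum(w * kv[d] for w, kv in zip(weights, keys_values))
            for d in range(dim)
        ])
    return outputs


def random_vectors(count, dim, rng):
    """Stand-in for learned embeddings (hypothetical data)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(count)]


rng = random.Random(0)
DIM = 8          # embedding width (illustrative value)
NUM_LATENTS = 4  # fixed summary size (illustrative value)

latents = random_vectors(NUM_LATENTS, DIM, rng)   # would be learned
short_history = random_vectors(10, DIM, rng)      # 10 user events
long_history = random_vectors(500, DIM, rng)      # 500 user events

short_summary = cross_attend(latents, short_history)
long_summary = cross_attend(latents, long_history)

print(len(short_summary), len(long_summary))
```

Both summaries contain `NUM_LATENTS` vectors, whether the history has 10 events or 500. A text prompt, by contrast, grows with every serialized event, which is the scaling difference the note points at.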

This sits in the vault's personalization thread as the embedding-based contextualization route. It contrasts with Do user outputs outperform inputs for LLM personalization? (which keeps personalization in text but selects the right text), whereas User-LLM moves personalization out of text entirely. The note also connects to the broader question of whether user context belongs in tokens or in learned representations.

Inquiring lines that use this note as a source 1

Should user context live in tokens or in learned model representations?

Related concepts in this collection 2

Do user outputs outperform inputs for LLM personalization? Does a user's history of outputs (responses, endorsed content) matter more for personalization than their input queries? This explores what actually drives effective personalization in language models.
text-based personalization (select the right text) vs User-LLM's embedding-based route
Can user preferences be learned from just ten questions? Explores whether adaptive question selection can efficiently infer user-specific reward coefficients without historical data or fine-tuning. This matters for scaling personalization without per-user model updates.
another compact-representation-of-user route, on the reward side
