Can discrete codes transfer better than text embeddings?
Does inserting a discrete quantization layer between text encodings and item representations improve cross-domain transfer in recommenders? This explores whether decoupling text from the final item embeddings reduces the domain gap and the recommender's text bias.
Pre-trained-language-model-based transferable recommenders use the paradigm "text → representation": encode an item's title and description with a PLM and use the encoding directly as the item embedding. This works for cross-domain transfer because language is universal, but it has two failure modes. First, the recommender becomes over-reliant on text similarity rather than interaction sequences, so it tends to recommend items with similar descriptions even when the sequential evidence says otherwise. Second, text encodings from different domains occupy different subspaces, so the domain gap survives the encoding step.
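To make the baseline concrete, here is a minimal sketch of "text → representation" scoring, assuming a sentence-transformers model stands in for the PLM; the catalog, the model name, and the mean-pooled history scoring are illustrative choices, not VQ-Rec specifics. It shows the first failure mode directly: ranking collapses into description similarity.

```python
# Minimal "text -> representation" baseline: the PLM encoding IS the item
# embedding. Model choice and scoring rule are illustrative assumptions.
import numpy as np
from sentence_transformers import SentenceTransformer

plm = SentenceTransformer("all-MiniLM-L6-v2")  # stand-in for the PLM

catalog = [
    "Wireless noise-cancelling headphones, 30h battery",
    "Over-ear studio headphones with detachable cable",
    "USB-C docking station with dual HDMI",
]
item_emb = plm.encode(catalog, normalize_embeddings=True)  # (n_items, d)

# Rank candidates by cosine similarity to the mean of the user's history
# embeddings: relevance is driven entirely by description similarity, even
# when sequential evidence points elsewhere.
user_vec = item_emb[[0]].mean(axis=0)  # history = item 0 only
scores = item_emb @ user_vec
print(np.argsort(-scores))             # headphone-like items dominate
```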
VQ-Rec inserts an intermediate representation: "text → code → representation." Item text is mapped via Optimized Product Quantization to a vector of discrete indices (the item code); the code then indexes into learnable embedding tables, and the looked-up embeddings are aggregated into the item representation. Text influences the representation only through the code, not directly.
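A minimal sketch of the "text → code → representation" pipeline under simplifying assumptions: it uses plain product quantization (independent k-means per subspace), whereas OPQ, which VQ-Rec actually uses, additionally learns a rotation of the text space before splitting. Shapes and hyperparameters here are illustrative.

```python
# "text -> code -> representation" sketch using plain product quantization;
# OPQ (as in VQ-Rec) would also learn a rotation before splitting.
import numpy as np
import torch
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
text_emb = rng.normal(size=(1000, 64)).astype(np.float32)  # PLM encodings
M, K = 8, 256                      # M sub-codes per item, K centroids each
sub = text_emb.shape[1] // M

# Text -> code: one codebook per subspace; an item's code is M indices.
quantizers = [
    KMeans(n_clusters=K, n_init=4, random_state=0)
    .fit(text_emb[:, m * sub:(m + 1) * sub])
    for m in range(M)
]
codes = np.stack(
    [q.predict(text_emb[:, m * sub:(m + 1) * sub])
     for m, q in enumerate(quantizers)],
    axis=1,
)                                  # (n_items, M) discrete item codes

# Code -> representation: M learnable tables; the lookups are aggregated
# (summed here, one common choice) into the final item embedding.
emb_dim = 64
tables = torch.nn.ModuleList([torch.nn.Embedding(K, emb_dim) for _ in range(M)])
code_t = torch.from_numpy(codes).long()
item_repr = sum(tables[m](code_t[:, m]) for m in range(M))  # (n_items, emb_dim)
```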
Two consequences. First, the discrete code distributes items more uniformly across the code space, making them more distinguishable than continuous text encodings tend to be. Second, the code-to-embedding mapping is parameter-efficient and can be tuned per downstream domain, while the text-to-code mapping stays fixed. Adapting to a new domain becomes a small fine-tune of an embedding table rather than retraining an encoder. The general principle: when transfer fails, look for the place where two representations are too tightly coupled, and insert a discrete intermediate that breaks the coupling.
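What per-domain adaptation looks like under these assumptions: the text-to-code mapping is frozen (random codes stand in for the OPQ output so this block runs standalone), and only the code-embedding tables are trained. The next-item objective is a toy stand-in, not VQ-Rec's actual loss.

```python
# Domain adaptation = tuning only the code-embedding tables. Codes are
# frozen; random ones stand in for the OPQ output so this runs standalone.
import torch

M, K, emb_dim, n_items = 8, 256, 64, 1000
codes = torch.randint(0, K, (n_items, M))  # frozen text->code output
tables = torch.nn.ModuleList([torch.nn.Embedding(K, emb_dim) for _ in range(M)])

def lookup_repr(c):                        # (B, M) -> (B, emb_dim)
    return sum(tables[m](c[:, m]) for m in range(M))

# Only ~M*K*emb_dim parameters (8*256*64 = 131k) are trained, versus the
# hundreds of millions in the PLM encoder. Toy next-item loss below.
opt = torch.optim.Adam(tables.parameters(), lr=1e-3)
hist, pos = codes[:32], torch.arange(1, 33)      # (history i, next item i+1)
for _ in range(50):
    logits = lookup_repr(hist) @ lookup_repr(codes).T  # score full catalog
    loss = torch.nn.functional.cross_entropy(logits, pos)
    opt.zero_grad(); loss.backward(); opt.step()
```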
Source: Recommenders Architectures
Related concepts in this collection
- Can discretizing text embeddings improve recommendation transfer?
  Does inserting a quantization step between text encodings and item representations reduce the recommender's over-reliance on text similarity and enable better cross-domain transfer?
  extends: a paired statement of the same VQ-Rec result, framed by the cross-domain unification benefit

- Can item identifiers balance uniqueness and semantic meaning?
  Should LLM-based recommenders prioritize distinctive item references or semantic understanding? This explores whether a hybrid approach can overcome the tradeoffs forced by pure-ID or pure-text indexing.
  complements: both reject pure-text and pure-ID item indexing; the multi-facet approach keeps multiple channels, while VQ-Rec quantizes into a discrete intermediate

- Can LLMs gain collaborative filtering strength without losing text understanding?
  LLM recommenders excel at cold-start through text semantics but struggle with warm interactions, where collaborative patterns matter most. Can external collaborative models be integrated into LLM reasoning to close this gap?
  complements: the same architectural pattern of inserting a representation layer between text and the downstream recommender to break tight coupling

- Can one text encoder unify all recommendation tasks?
  Does framing diverse recommendation problems (from sequential prediction to review generation) as natural language tasks allow a single model to learn shared structure? Can this approach generalize to unseen items and new task phrasings?
  tension with: P5 unifies via text, while VQ-Rec argues that text coupling is the failure mode; these are opposite design philosophies for transfer
Original note title: decoupling text from item representations via discrete codes is more transferable than direct text-encoded embeddings