CoLLM: Integrating Collaborative Embeddings into Large Language Models for Recommendation

Paper · arXiv 2310.19488 · Published October 30, 2023
Tags: Recommenders, LLMs

Leveraging Large Language Models as Recommenders (LLMRec) has gained significant attention and introduced fresh perspectives in user preference modeling. Existing LLMRec approaches prioritize text semantics, usually neglecting the valuable collaborative information embedded in user-item interactions. While these text-emphasizing approaches excel in cold-start scenarios, they may yield sub-optimal performance in warm-start situations. In pursuit of superior recommendations for both cold-start and warm-start scenarios, we introduce CoLLM, an innovative LLMRec methodology that seamlessly incorporates collaborative information into LLMs for recommendation. CoLLM captures collaborative information through an external traditional model and maps it into the input token embedding space of the LLM, forming collaborative embeddings for the LLM to use. Through this external integration, CoLLM models collaborative information effectively without modifying the LLM itself, and retains the flexibility to employ various collaborative information modeling techniques.
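The mapping described above can be sketched as follows. This is a minimal, hypothetical illustration, not the paper's implementation: it assumes a pretrained external CF model supplies a low-dimensional embedding, and a small two-layer MLP (randomly initialized here; trained in CoLLM) projects it into the LLM's token-embedding space. The dimensions `CF_DIM`, `HIDDEN`, and `LLM_DIM` are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions: CF embeddings from an external model are
# typically much smaller than LLM token embeddings.
CF_DIM, HIDDEN, LLM_DIM = 64, 256, 4096

def map_to_token_space(cf_emb, W1, b1, W2, b2):
    """Project a CF embedding into the LLM token-embedding space with a
    small two-layer MLP (stand-in for the trainable mapping module)."""
    h = np.maximum(0.0, cf_emb @ W1 + b1)  # ReLU hidden layer
    return h @ W2 + b2                     # "collaborative embedding"

# Randomly initialized mapping parameters (learned in practice).
W1 = rng.normal(0, 0.02, (CF_DIM, HIDDEN)); b1 = np.zeros(HIDDEN)
W2 = rng.normal(0, 0.02, (HIDDEN, LLM_DIM)); b2 = np.zeros(LLM_DIM)

user_cf = rng.normal(size=CF_DIM)  # embedding from the external CF model
user_token = map_to_token_space(user_cf, W1, b1, W2, b2)
print(user_token.shape)  # (4096,)
```

The resulting vector has the same dimensionality as an ordinary token embedding, so it can be spliced into the prompt's embedded token sequence without touching the LLM's own parameters.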

For instance, the world knowledge and context comprehension abilities of LLMs could enhance item understanding and user modeling, particularly for cold items/users [32]. This anticipation opens up an exciting new direction: leveraging LLMs as recommenders (LLMRec) [39], which exhibits the potential to become a transformative paradigm for recommendation [3, 39, 43].

To leverage LLMs as recommenders, pioneering studies have relied on In-Context Learning [4], which involves directly asking LLMs to make recommendations via natural language prompts [6, 12, 21, 45]. However, most empirical findings indicate that off-the-shelf LLMs struggle to provide accurate recommendations, often because they were never trained on the recommendation task [2, 3, 44]. To address this challenge, increasing efforts have been devoted to further fine-tuning LLMs on recommendation data [2, 3, 21, 44]. Nevertheless, despite being tuned for the recommendation task, these methods can still fall short of well-trained conventional recommender models, particularly for warm users/items, as demonstrated in recent works [27] and Figure 1.
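To make the in-context-learning setup concrete, a prompt of the kind these studies use can be built as below. The template and item names are hypothetical, chosen only to show the general shape of a binary "will the user like this item?" query.

```python
def build_rec_prompt(user_history, candidate):
    """Build a hypothetical natural-language recommendation prompt
    in the in-context-learning style used by early LLMRec studies."""
    items = ", ".join(user_history)
    return (
        f"A user has interacted with the following items: {items}. "
        f'Will the user enjoy "{candidate}"? Answer Yes or No.'
    )

prompt = build_rec_prompt(
    ["The Matrix", "Inception", "Interstellar"], "Blade Runner"
)
print(prompt)
```

Note that the user and items appear only as text: any collaborative signal not reflected in the item names or descriptions is invisible to the model, which is exactly the limitation discussed next.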

We argue that the primary limitation of existing LLMRec methods is their inadequate modeling of the collaborative information implied by co-occurrence patterns in user-item interactions. These methods represent users and items with text tokens and rely predominantly on text semantics for recommendation, an approach that inherently falls short of capturing collaborative information. For example, two items with similar text descriptions may carry distinct collaborative information if consumed by different users, yet this difference often goes unaccounted for because of the textual similarity. Collaborative information nonetheless often proves beneficial for recommendation, especially for users/items with rich interactions [25]. Hence, we introduce a novel research problem: how can we efficiently integrate collaborative information into LLMs to optimize their performance for both warm and cold users/items?

To address this issue, we propose explicitly modeling collaborative information in LLMs. Drawing on classic collaborative filtering with latent factor models (e.g., Matrix Factorization [23]), a straightforward solution is to introduce additional tokens and corresponding embeddings in the LLM to represent users/items, akin to the user/item embeddings in latent factor models. Fitting these embeddings to interaction data then makes it possible to encode collaborative information, much as MF does.
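The MF analogy can be illustrated with a tiny sketch. This is a toy example with made-up interaction data, not the paper's training setup: user/item embedding tables are fit to binary interactions with SGD on a cross-entropy objective, so that the dot product of a user and item embedding comes to reflect co-occurrence patterns, which is the role such embeddings would also play as extra tokens inside an LLM.

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, dim = 4, 5, 8

# Embedding tables: the latent factors of classic matrix factorization.
U = rng.normal(0, 0.1, (n_users, dim))
V = rng.normal(0, 0.1, (n_items, dim))

# Toy observed interactions: (user, item, label), label 1 = interacted.
data = [(0, 1, 1.0), (0, 2, 0.0), (1, 1, 1.0), (2, 3, 1.0), (3, 4, 0.0)]

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

lr = 0.5
for _ in range(200):  # SGD on a binary cross-entropy objective
    for u, i, y in data:
        p = sigmoid(U[u] @ V[i])
        g = p - y  # gradient of BCE w.r.t. the logit U[u] @ V[i]
        U[u], V[i] = U[u] - lr * g * V[i], V[i] - lr * g * U[u]

# After fitting, the dot product encodes the co-occurrence pattern:
# user 0 scores high on the item it consumed, low on the one it skipped.
print(sigmoid(U[0] @ V[1]) > sigmoid(U[0] @ V[2]))
```

In CoLLM the embeddings are not learned from scratch inside the LLM; they come from an external collaborative model of this kind and are mapped into the token space, which keeps the LLM itself unmodified.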