Does LLM input augmentation beat direct LLM recommendation?
Can LLMs enrich item descriptions more effectively than they can make recommendations directly? The question is whether a specialized recommender model works better when the LLM focuses on what it does best: content understanding rather than ranking.
Two paradigms exist for incorporating LLMs into recommender systems. The first uses LLMs as recommenders directly: build a prompt containing the task description, user profile, item attributes, and user-item history, then ask the LLM to predict the interaction probability. The second uses LLMs as input augmenters: have them enrich item descriptions, then feed the enriched descriptions to a conventional recommender model.
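The first paradigm can be sketched as a single prompt-construction step. This is illustrative only: the field names and prompt wording below are assumptions, not the paper's exact format.

```python
# Sketch of the LLM-as-recommender paradigm: one prompt carrying the task
# description, user profile, item attributes, and interaction history.
# Field names and wording are illustrative, not taken from LLM-Rec.
def direct_recommendation_prompt(user_profile: str,
                                 candidate_item: str,
                                 history: list[str]) -> str:
    return (
        "Task: predict whether the user will interact with the candidate item.\n"
        f"User profile: {user_profile}\n"
        f"Interaction history: {'; '.join(history)}\n"
        f"Candidate item: {candidate_item}\n"
        "Answer with a probability between 0 and 1."
    )
```

The LLM's completion is then parsed as the predicted interaction probability; the second paradigm avoids this parsing step entirely by never asking the LLM to rank.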
LLM-Rec investigates the second paradigm with three prompt types. P1 instructs the LLM to paraphrase the original content, preserving information without adding new details. P2 instructs the LLM to summarize content with tags, generating a more concise overview. P3 instructs the LLM to deduce content characteristics and provide categorical responses at a coarser granularity than the original.
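The three prompt types can be expressed as templates over the original description. The wording here paraphrases the paper's intent; the exact prompts in LLM-Rec may differ.

```python
# Hypothetical templates in the spirit of LLM-Rec's P1-P3;
# the paper's exact prompt wording may differ.
AUGMENTATION_PROMPTS = {
    "p1_paraphrase": (
        "The description of an item is: '{description}'. "
        "Paraphrase it without adding any new information."
    ),
    "p2_tag_summary": (
        "The description of an item is: '{description}'. "
        "Summarize it with a concise set of descriptive tags."
    ),
    "p3_infer_category": (
        "The description of an item is: '{description}'. "
        "What broad categories of content does it belong to?"
    ),
}

def augment_item(description: str, llm) -> dict:
    """Run all three content prompts and collect the augmented texts.

    `llm` is any callable mapping a prompt string to a completion string.
    """
    return {
        name: llm(template.format(description=description))
        for name, template in AUGMENTATION_PROMPTS.items()
    }
```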
Combining the original description with the augmented texts from these prompts improves recommendation performance over either the original alone or the LLM-as-recommender approach. The mechanism: each prompt extracts a different aspect of the item that the LLM "knows" from pretraining (paraphrase preserves content but normalizes phrasing; tags compress to discriminative attributes; categories provide hierarchy). The augmented input enriches the recommender's representation without subjecting it to the LLM's recommendation-task biases.
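The combination step itself is simple concatenation of the original description with the augmented variants. Below is a minimal stand-in, assuming a bag-of-words encoder in place of the learned recommender the paper actually trains; it only illustrates how the enriched text widens the content signal.

```python
from collections import Counter
import math

def enriched_text(original: str, augmented: dict) -> str:
    """Concatenate the original description with all augmented variants,
    mirroring the combined input LLM-Rec feeds to its recommender."""
    return " ".join([original, *augmented.values()])

def bow(text: str) -> Counter:
    """Toy bag-of-words encoder standing in for a learned content model."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0
```

With this stand-in, an item whose P2 tags mention "sci-fi" becomes retrievable by a query containing "sci-fi" even though the original description never used the word, which is the enrichment effect in miniature.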
The methodological lesson: ask which subproblems an LLM is actually good at versus what your task needs. LLMs excel at content understanding (paraphrase, summarization, categorization); they are not specialized recommenders. Letting the LLM generate enriched textual features and letting a specialized model handle the recommendation often beats making the LLM do everything.
Source: Recommenders Personalized
Related concepts in this collection
- How should language models integrate into recommender systems?
  When building recommendation systems with LLMs, should you use them as feature encoders, token generators, or direct recommenders? The choice affects efficiency, bias, and compatibility with existing pipelines.
  tension with: LLM-Rec evidence shows direct-LLM-as-recommender is the weakest paradigm; input-augmentation outside the taxonomy beats it
- Do prompt techniques work the same across all LLM tiers?
  Do chain-of-thought and rephrasing prompts help or hurt recommendation tasks equally across cost-efficient and high-performance models? Understanding tier-dependent effects could optimize prompt selection.
  complements: rephrasing-as-input-augmentation is exactly the cheap-model-friendly prompt this benchmark identifies
- Can retrieval enhancement fix explainable recommendations for sparse users?
  When users have few historical interactions, embedded recommendation models struggle to generate personalized explanations. Can augmenting sparse histories with retrieved relevant reviews, selected by aspect, overcome this fundamental data limitation?
  complements: aspect-augmentation and content-augmentation are parallel — both use external generation to enrich sparse signal before recommendation
- Can LLMs gain collaborative filtering strength without losing text understanding?
  LLM recommenders excel at cold-start through text semantics but struggle with warm interactions where collaborative patterns matter most. Can external collaborative models be integrated into LLM reasoning to close this gap?
  complements: CoLLM brings CF-into-LLM; LLM-Rec brings LLM-text-into-traditional-recommender — opposite directions of the same hybrid intent
Original note title: LLM-Rec input augmentation outperforms LLM-as-recommender — content prompting for paraphrase summary and category labels enriches representation