Large Language Models are Zero-Shot Rankers for Recommender Systems

Paper · arXiv 2305.08845 · Published May 15, 2023
Tags: Recommenders, LLMs

“To carry out this study, we first formalize the recommendation process of LLMs as a conditional ranking task. Given prompts that include sequential historical interactions as “conditions”, LLMs are instructed to rank a set of “candidates” (e.g., items retrieved by candidate generation models), according to LLM’s intrinsic knowledge about the relationships between candidate items and historically interacted items. Then we conduct controlled experiments to systematically study the empirical performance of LLMs as rankers by designing specific configurations for “conditions” and “candidates”, respectively. Overall, we attempt to answer the following key questions:

• Can LLMs capture underlying user preferences from prompts with sequential interactions?

• Can LLMs leverage their intrinsic knowledge to rank candidates retrieved by different practical strategies?

Our empirical experiments are conducted on two widely-used public datasets for recommender systems. Our experiments lead to several key findings that potentially shed light on how to develop LLMs as powerful ranking models for recommender systems. We summarize the key findings of this empirical study as follows:

• LLMs can utilize historical behaviors for personalized ranking, but struggle to perceive the order of the given sequential interaction histories.

• By employing specifically designed promptings, such as recency-focused prompting and in-context learning, LLMs can be triggered to perceive the order of sequential historical interactions, leading to improved ranking performance.

• LLMs outperform existing zero-shot recommendation methods, showing promising zero-shot ranking abilities, especially on candidates retrieved by multiple candidate generation models with different practical strategies.

• LLMs suffer from position bias and popularity bias while ranking, which can be alleviated by prompting or bootstrapping strategies.”
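The conditional ranking setup and the bootstrapping remedy for position bias can be sketched roughly as below. This is a minimal illustration, not the paper's code: the prompt template, the function names (`build_ranking_prompt`, `bootstrapped_rank`), and the rank-sum aggregation are all assumptions, and `rank_fn` stands in for an actual LLM call plus answer parsing.

```python
import random
from collections import defaultdict

def build_ranking_prompt(history, candidates):
    """Format sequential interactions as 'conditions' and candidates to rank.

    The template wording is a simplified stand-in for the paper's prompts.
    """
    hist = "\n".join(f"{i + 1}. {title}" for i, title in enumerate(history))
    cand = "\n".join(f"[{chr(65 + i)}] {title}" for i, title in enumerate(candidates))
    return (
        "I've interacted with the following items in chronological order:\n"
        f"{hist}\n\n"
        "Rank the candidate items below by how likely I am to interact with "
        "them next, most likely first. Answer with the bracketed letters only.\n"
        f"{cand}"
    )

def bootstrapped_rank(history, candidates, rank_fn, rounds=3, seed=0):
    """Mitigate position bias by shuffling the candidate order each round
    and summing each item's rank position across rounds (lower is better).

    rank_fn(prompt, candidates) must return the candidates as a ranked list;
    in a real system it would query an LLM and parse the answer.
    """
    rng = random.Random(seed)
    rank_sums = defaultdict(int)
    for _ in range(rounds):
        shuffled = candidates[:]
        rng.shuffle(shuffled)  # present candidates in a fresh order each round
        ranking = rank_fn(build_ranking_prompt(history, shuffled), shuffled)
        for position, item in enumerate(ranking):
            rank_sums[item] += position
    return sorted(candidates, key=lambda item: rank_sums[item])
```

Shuffling before each round means no candidate benefits systematically from appearing first in the prompt, and aggregating across rounds smooths out any single ordering's position effect.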