Review-LLM: Harnessing Large Language Models for Personalized Review Generation
Product review generation is an important task in recommender systems, as it can provide explanations for and add persuasiveness to recommendations. Recently, Large Language Models (LLMs, e.g., ChatGPT) have shown superior text modeling and generation ability that can be applied to review generation. However, directly applying LLMs to generate reviews may be hindered by the "polite" tendency of LLMs, which prevents them from generating personalized reviews (e.g., negative reviews). In this paper, we propose Review-LLM, which customizes LLMs for personalized review generation. First, we construct the prompt input by aggregating the user's historical behaviors, including the corresponding item titles and reviews. This enables the LLM to capture user interest features and review writing style. Second, we incorporate ratings as indicators of satisfaction into the prompt, which further improves the model's understanding of user preferences and its control over the sentiment of generated reviews. Finally, we feed the prompt text into the LLM and use Supervised Fine-Tuning (SFT) to make the model generate personalized reviews for the given user and target item. Experimental results on a real-world dataset show that our fine-tuned model achieves better review generation performance than existing closed-source LLMs.
Compared with other traditional generation tasks (such as poem generation), applying LLMs to review generation on e-commerce platforms is more challenging due to the lack of personalized information. First, most existing large language models are pre-trained at the corpus level and may not capture the review style and habits of individual users. This can cause the generated review to be inconsistent with the user's previous reviews. Second, users are dissatisfied with many items, and the corresponding reviews should be negative. However, text generated by LLMs is usually "polite" (Touvron et al., 2023), which may lead the model to generate positive reviews even when the user is dissatisfied. Hence, in this paper, we design a framework (Review-LLM) that harnesses LLMs to generate personalized reviews. Specifically, we reconstruct the model input by aggregating the user's behavior sequence, including item titles and the corresponding reviews. In this way, the model can learn user interest features and review writing styles from semantically rich textual information. Furthermore, a user's rating of an item indicates the user's satisfaction with it, so we integrate this information into the prompt input accordingly. The large language model can thus better perceive how much the user likes each item, which may prevent it from generating overly "polite" reviews. Finally, we feed the input prompt text into the LLM (Llama-3), which is fine-tuned using Supervised Fine-Tuning (SFT) to output the review for the target item.
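The SFT step described above can be sketched as the standard causal-LM label construction: the prompt tokens are masked out of the loss so the model is trained only to produce the target review. This is a minimal illustration under common conventions (e.g., the `-100` ignore index used by popular training frameworks); the paper does not specify these implementation details, and the token ids below are toy values.

```python
# Sketch of SFT label construction for review generation (illustrative only).
# The ignore index -100 follows the common convention of masking tokens out
# of the cross-entropy loss; the actual training setup is not specified here.
IGNORE_INDEX = -100

def make_sft_labels(prompt_ids, review_ids):
    """Concatenate prompt and target review token ids.

    The returned labels mask the prompt positions, so the loss is computed
    only on the review tokens the model should learn to generate.
    """
    input_ids = list(prompt_ids) + list(review_ids)
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(review_ids)
    return input_ids, labels

# Toy example: three prompt tokens, three review tokens.
inp, lab = make_sft_labels([101, 7, 42], [9, 3, 2])
# inp == [101, 7, 42, 9, 3, 2]; only the last three positions contribute loss.
```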
1) Generation Instruction: Its role is to instruct the LLM to consider both the user's preferences and historical behaviors to complete the generation task, which is structured as outputting a review for the target item; 2) Input: This contains the items the user has interacted with, including the item title, review, and rating; 3) The user purchased a new item: This contains the target item title and the corresponding rating; 4) Response: This is the generated review for the target item.
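The four-part prompt above can be assembled as follows. This is a hypothetical sketch: the field names, template wording, and rating scale are illustrative assumptions, not the authors' exact template.

```python
# Illustrative builder for the four-part Review-LLM prompt.
# All template wording and field names ("title", "review", "rating")
# are assumptions for the sake of example.

def build_prompt(history, target_title, target_rating):
    """Assemble a prompt from a user's historical behaviors and a target item.

    history: list of dicts with keys "title", "review", "rating".
    """
    instruction = (
        "Considering the user's preferences and historical behaviors below, "
        "write a review for the newly purchased item."
    )
    history_lines = [
        f'- Item: {h["title"]} | Rating: {h["rating"]}/5 | Review: {h["review"]}'
        for h in history
    ]
    return (
        f"### Generation Instruction:\n{instruction}\n\n"
        "### Input:\n" + "\n".join(history_lines) + "\n\n"
        "### The user purchased a new item:\n"
        f"Item: {target_title} | Rating: {target_rating}/5\n\n"
        "### Response:\n"
    )

# Example usage with a single (toy) historical interaction.
prompt = build_prompt(
    [{"title": "Wireless Mouse", "rating": 2,
      "review": "Stopped working after a week."}],
    "USB Keyboard", 4,
)
```

During SFT, the ground-truth review for the target item would be appended after the "Response" header as the training target.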