Understanding the Role of User Profile in the Personalization of Large Language Models
However, the precise role of user profiles and the mechanism by which they affect LLMs remain unclear. This study first confirms that the effectiveness of user profiles stems primarily from personalization information rather than semantic information. Furthermore, we investigate how user profiles affect the personalization of LLMs. Within the user profile, we reveal that the historical personalized responses produced or approved by users play a pivotal role in personalizing LLMs. This discovery unlocks the potential of LLMs to incorporate more user profiles within the constraints of limited input length. As for the position of user profiles, we observe that user profiles integrated at different positions of the input context do not contribute equally to personalization. Instead, user profiles placed closer to the beginning of the input context have a greater effect on the personalization of LLMs.
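The position finding above can be probed with a simple prompt-construction ablation. The sketch below is illustrative only (the prompt template and function names are our own, not the paper's code): it places the user profile either at the beginning or at the end of the input context, so that downstream personalization metrics can be compared across the two placements.

```python
# Illustrative sketch (hypothetical template): vary where the user profile
# appears in the input context to test position sensitivity.

def place_profile(task_instruction: str, query: str, profile_text: str,
                  position: str = "begin") -> str:
    """Return a prompt with the profile at the beginning or end of the context."""
    if position == "begin":
        parts = [profile_text, task_instruction, query]
    else:  # position == "end"
        parts = [task_instruction, query, profile_text]
    return "\n\n".join(parts)

# Example: the same profile, two placements; only the position differs.
begin_prompt = place_profile("Classify the review.", "Review: solid build.",
                             "User history: prefers terse, critical reviews.",
                             position="begin")
end_prompt = place_profile("Classify the review.", "Review: solid build.",
                           "User history: prefers terse, critical reviews.",
                           position="end")
```

Holding everything else fixed and comparing task performance between `begin_prompt` and `end_prompt` isolates the positional effect described above.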
It remains unclear how incorporated user profiles specifically affect LLM personalization. This uncertainty arises from potential overlaps with retrieval-augmented generation (RAG) (Lewis et al., 2020) and in-context learning (ICL) (Brown et al., 2020), neither of which directly addresses personalization. RAG relies on retrieving documents relevant to the input, implicitly requiring semantic information to provide answers (Mallen et al., 2023), while ICL uses input-output pairs as demonstrations to guide LLMs in forming relevant mappings (Garg et al., 2022), necessitating complete input-output pairs.
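The two mechanisms make different demands on the context, which a toy contrast makes concrete. The sketch below is a simplification (word-overlap stands in for the dense retrievers real RAG systems use, and the demonstration format is our own): RAG selects context by semantic relevance to the query, while ICL requires each demonstration to be a complete input-output pair.

```python
# Toy contrast of the two mechanisms (not a faithful RAG/ICL implementation).

def retrieve(query_words, documents, k=1):
    """RAG-style: rank documents by semantic relevance (here, word overlap)."""
    scored = sorted(documents,
                    key=lambda d: len(set(query_words) & set(d.split())),
                    reverse=True)
    return scored[:k]

def icl_prompt(demonstrations, query):
    """ICL-style: demonstrations must be complete (input, output) pairs."""
    demo_text = "\n".join(f"{x} -> {y}" for x, y in demonstrations)
    return f"{demo_text}\n{query} ->"
```

The distinction matters for the ablations that follow: a profile entry can serve retrieval with its input alone, but can only serve as an ICL demonstration if both its input and output are present.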
Salemi et al. (2023) offered a comprehensive evaluation benchmark with diverse language tasks, where the user profiles consist of the user's previous queries and the corresponding personalized responses.
Recent works have demonstrated the effectiveness of introducing user profiles in different ways, such as summarization (Richardson et al., 2023), keyword synthesis (Li et al., 2023b), user embeddings (Doddapaneni et al., 2024; Ning et al., 2024), parametric knowledge (Tan et al., 2024) via LoRA (Shi and Lipani, 2023a), or prompt rewriting via reinforcement learning (Li et al., 2023a).
Only using the output part substantially enhances personalization. Our results in Figure 4 reveal that, except for LaMP-3, using only the output part of user profiles achieves comparable (LaMP-4) or even superior (LaMP-2 and LaMP-5) performance compared to using complete user profiles. In contrast, using only the input part leads to noticeable performance degradation across all tasks. This supports our earlier findings, emphasizing the importance of the output part of user profiles for LLM personalization. It further underscores that responses produced or endorsed by users play a pivotal role in effective personalization, more so than preserving the exact input-output mapping or the previous inputs alone.
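The input-only versus output-only ablation above can be sketched as a prompt builder with three modes. This is a minimal illustration (the entry keys and template are assumed, not the paper's actual LaMP prompt format): "full" keeps each profile entry intact, while "input" and "output" keep only one half of each pair.

```python
# Sketch of the profile-part ablation (hypothetical prompt template).

def build_prompt(query, profile, mode="full"):
    """Assemble a personalized prompt from user-profile entries.

    mode: "full"   -> include each entry's input and output,
          "input"  -> include only the historical inputs,
          "output" -> include only the historical outputs.
    """
    lines = []
    for entry in profile:  # each entry: {"input": ..., "output": ...}
        if mode == "full":
            lines.append(f"Input: {entry['input']}\nOutput: {entry['output']}")
        elif mode == "input":
            lines.append(f"Input: {entry['input']}")
        elif mode == "output":
            lines.append(f"Output: {entry['output']}")
    lines.append(f"Input: {query}\nOutput:")
    return "\n\n".join(lines)

profile = [{"input": "Review: great phone, love it", "output": "5 stars"}]
prompt = build_prompt("Review: battery dies fast", profile, mode="output")
```

Note that dropping the input part roughly halves each entry's token cost, which is what allows more profile entries to fit within a fixed input length.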