A Conversation is Worth A Thousand Recommendations: A Survey of Holistic Conversational Recommender Systems

Paper · arXiv 2309.07682 · Published September 14, 2023
Recommenders ConversationalAssistants Personalization

Conversational recommender systems (CRS) generate recommendations through an interactive process. However, not all CRS approaches use human conversations as their source of interaction data; the majority of prior CRS work simulates interactions by exchanging entity-level information. As a result, claims of prior CRS work do not generalise to real-world settings where conversations take unexpected turns, or where conversational and intent understanding is not perfect. To tackle this challenge, the research community has started to examine holistic CRS, which are trained using conversational data collected from real-world scenarios. Despite their emergence, such holistic approaches are under-explored.

Thus, there is a dichotomy in CRS research. Most CRS work does not assume actual human conversations for interaction and only simulates the interaction with entity-level information [7, 3]. However, there is also prior work that relaxes this constraint and tackles conversational recommendation based on actual human conversations [8, 9]. Besides recommendation and decision strategy, these works also tackle the aforementioned conversational challenges in language understanding, generation, topic/goal planning and knowledge engagement. To distinguish these two forms of CRS research, we divide current CRS research into standard CRS (the former, more prevalent form of prior work) and what we term holistic CRS (which assumes a wider scope for the CRS task), based on the input and output formats, as shown in Figure 3.
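The input/output distinction between the two forms can be made concrete with a small illustrative sketch. All class, field and function names below (EntityTurn, UtteranceTurn, classify_crs, and so on) are hypothetical and chosen only for illustration; they are not taken from the survey or from any CRS library.

```python
from dataclasses import dataclass, field
from typing import List, Sequence, Union


@dataclass
class EntityTurn:
    """Hypothetical standard-CRS turn: structured, entity-level signals only."""
    asked_attribute: str  # e.g. "genre"
    user_answer: str      # e.g. "comedy"


@dataclass
class UtteranceTurn:
    """Hypothetical holistic-CRS turn: a free-form natural-language utterance."""
    speaker: str  # "user" or "system"
    text: str


@dataclass
class StandardOutput:
    """Standard CRS output: item identifiers (or the next attribute to ask)."""
    recommended_item_ids: List[str]


@dataclass
class HolisticOutput:
    """Holistic CRS output: a natural-language response that may mention items."""
    response_text: str
    recommended_item_ids: List[str] = field(default_factory=list)


def classify_crs(history: Sequence[Union[EntityTurn, UtteranceTurn]]) -> str:
    """Label an interaction history by its input format (illustrative rule only)."""
    if not history:
        raise ValueError("history must contain at least one turn")
    if all(isinstance(turn, EntityTurn) for turn in history):
        return "standard"
    if all(isinstance(turn, UtteranceTurn) for turn in history):
        return "holistic"
    raise ValueError("mixed turn types are not covered by this sketch")
```

In this sketch, a history of attribute/answer pairs is labelled "standard", while a history of raw utterances is labelled "holistic"; a real system would of course differ in far more than its data types.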

We present a comprehensive survey of holistic CRS methods by summarizing the literature in a structured manner. Our survey recognises holistic CRS approaches as having three components: 1) a backbone language model and, optionally, 2) external knowledge and/or 3) external guidance. We also give a detailed analysis of CRS datasets and evaluation methods in real application scenarios. We offer our insights into the current challenges of holistic CRS and possible future trends.
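As a rough illustration of the three-component view, the sketch below wires a backbone language model together with optional knowledge and guidance modules. It is only a minimal sketch under stated assumptions: the HolisticCRS class, its field names and the bracketed prompt layout are all hypothetical, and the backbone is represented by any callable that maps a prompt string to a response string.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class HolisticCRS:
    """Hypothetical container for the three components named in the survey."""
    backbone_lm: Callable[[str], str]                           # required
    retrieve_knowledge: Optional[Callable[[str], List[str]]] = None  # optional
    plan_guidance: Optional[Callable[[str], str]] = None             # optional

    def respond(self, dialogue: str) -> str:
        """Build a prompt from whichever components are present, then generate."""
        parts: List[str] = []
        if self.plan_guidance is not None:
            # External guidance, e.g. a predicted topic or goal for the next turn.
            parts.append(f"[goal] {self.plan_guidance(dialogue)}")
        if self.retrieve_knowledge is not None:
            # External knowledge, e.g. facts about items mentioned so far.
            parts.extend(f"[fact] {fact}" for fact in self.retrieve_knowledge(dialogue))
        parts.append(f"[dialogue] {dialogue}")
        return self.backbone_lm("\n".join(parts))
```

With an identity function standing in for the backbone, calling respond shows exactly which prompt the model would receive, which makes the optional nature of the knowledge and guidance components easy to see.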

Language generation quality and style. Current holistic CRS methods do not meet the requirements for practical application due to their inferior language quality scores in human evaluation, even when compared to retrieval-based methods [70, 51, 71]. Successful recommendation responses need to supplement explicit prediction results by accounting for implicit features like social strategy and language styles (e.g., encouragement and informativeness [12, 65, 66]). As recommendation outcomes often draw from an external or enriched knowledge structure, future research should focus on 1) elevating language quality to garner positive user feedback [72], and 2) emphasizing preferred language styles to enhance user acceptance [73].

To be more effective, future holistic CRS should prioritize personalised experiences for individual users by harnessing multi-modal data from item categories and user profiles. Moreover, attending to users’ personal feedback and latent preferences is key to building a superior user modelling framework that yields more pertinent recommendations [74]. Additionally, incorporating other LMs or AI-generated content (AIGC) into recommendation feedback could be a promising avenue [75, 76].