CantTalkAboutThis: Aligning Language Models to Stay on Topic in Dialogues

Paper · arXiv 2404.03820 · Published April 4, 2024

Recent advancements in instruction-tuning datasets have predominantly focused on specific tasks like mathematical or logical reasoning. There has been a notable gap in data designed for aligning language models to maintain topic relevance in conversations - a critical aspect for deploying chatbots to production. We introduce the CANTTALKABOUTTHIS dataset to help language models remain focused on the subject at hand during task oriented interactions. It consists of synthetic dialogues on a wide range of conversation topics from different domains. These dialogues are interspersed with distractor turns that intentionally divert the chatbot from the predefined topic. Fine-tuning language models on this dataset helps make them resilient to deviating from the assigned role and improves their ability to maintain topical coherence compared to general-purpose instruction-tuned LLMs like GPT-4-TURBO and MIXTRAL-INSTRUCT.

As a first effort to empower chatbots in topic-following we present CANTTALKABOUTTHIS1, a small dataset consisting of 1080 synthetic dialogues designed to train models to stay on topic. Our dataset creation process involves a three-step prompting-based approach that focuses on both diversity and quality: 1) We generate topic-following prompts across a variety of scenarios; 2)We create dialogues adhering to these topical instructions, employing a technique akin to dialogue inpainting (Dai et al., 2022); 3) We integrate distractors into these dialogues to test topic following. Our findings using this dataset reveal that even state-of- the-art general-purpose LLMs initially struggle with staying on topic, often engaging with distractors. We demonstrate that models fine-tuned on our dataset significantly improve performance in following complex dialogue instructions. However, investigation into the nature of our synthetic distractors reveals a limitation: they tend to be off-topic but simplistic.

We firmly believe that topic-following is a crucial yet overlooked aspect in current instruction-tuning datasets Köpf et al. (2024); Longpre et al. (2023).

Integrating CANTTALKABOUTTHIS into instruction-tuning blends significantly enhances an LLM’s capacity to comprehend and follow complex system instructions about the interaction topic. This is particularly advantageous for developing task-oriented dialogue systems (TODS)…

A.5 Topic Following Evaluation Prompt We provide to the models the system instruction that specifies how the interaction between the user and the chatbot has to proceed. In addition, we add an instruction

Do not reply to any questions that are not related to the scenario or the role assigned to you. In case you get such questions, reply only with "This question is not relevant to the scenario".

For all the LLMs, we use a temperature of 0, i.e. greedy decoding for evaluation.