Conversational Prompt Engineering
Conversational Prompt Engineering (CPE) is a user-friendly tool that helps users create personalized prompts for their specific tasks. CPE uses a chat model to briefly interact with users, helping them articulate their output preferences and integrating these into the prompt. The process includes two main stages: first, the model uses user-provided unlabeled data to generate data-driven questions and uses the user's responses to shape the initial instruction. Then, the model shares the outputs generated by this instruction and uses user feedback to further refine both the instruction and the outputs. The final result is a few-shot prompt, in which the outputs approved by the user serve as the few-shot examples.
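To make the end product concrete, the following is a minimal sketch of how such a few-shot prompt could be assembled from the refined instruction and the user-approved examples. The function name and the Input/Output formatting are illustrative assumptions, not the exact template used by CPE.

```python
# A minimal sketch (not the actual CPE implementation) of assembling the
# final few-shot prompt: the refined instruction is followed by the
# user-approved (input, output) pairs serving as few-shot examples.

def build_few_shot_prompt(instruction: str, approved_examples: list[tuple[str, str]]) -> str:
    parts = [instruction.strip(), ""]
    for text, output in approved_examples:
        parts.append(f"Input: {text}")
        parts.append(f"Output: {output}")
        parts.append("")
    parts.append("Input: {input_text}")  # placeholder for an unseen text at inference time
    parts.append("Output:")
    return "\n".join(parts)
```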
The key idea is to utilize chat models to assist users in creating prompts through a brief and user-friendly conversation, considering a small set of unlabeled data provided by the user.
First, advanced chat models can assist users in better understanding and articulating their exact requirements, making it easier for users to communicate their needs. Second, given a task, unlabeled input texts can be leveraged by LLMs to suggest data-specific dimensions of potential output preferences; this can further help users specify their task requirements by highlighting relevant aspects of the data. Finally, user feedback on specific model-generated outputs can be leveraged not only to improve the outputs themselves, but also to refine the instruction that will be applied to unseen texts.
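As an illustration of the second point, a prompt along the following lines could ask the chat model to derive data-specific preference questions from the user's unlabeled texts. The prompt wording and the helper name are assumptions for illustration only.

```python
# A hypothetical sketch of prompting a chat model to propose data-driven
# clarification questions from a few unlabeled user texts; not CPE's
# actual prompt.

def data_driven_questions_prompt(task: str, unlabeled_texts: list[str], n_questions: int = 3) -> str:
    examples = "\n\n".join(f"Example {i + 1}:\n{t}" for i, t in enumerate(unlabeled_texts))
    return (
        f"The user wants to perform the following task: {task}\n\n"
        f"Here are a few unlabeled input texts provided by the user:\n\n{examples}\n\n"
        f"Based on these texts, ask the user {n_questions} short questions about their "
        "output preferences (e.g., length, style, focus) that are specific to this data."
    )
```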
4.1 Three-party Chat
The interaction within CPE involves three actors: the user, the system, and the model.
The user communicates with CPE via the UI.
The model is the LLM that handles the core capabilities of CPE: user data analysis and discussion, instruction refinement, and output enhancement. The LLM is instructed to respond in a particular format, as one of several API calls. Each model response is then executed, and may or may not involve sharing a message with the user (see Section 4.4). Our implementation uses Llama-3-70B.
The system orchestrates the interaction between the user and the model. Each model response (API call) triggers an action by the system. Contrary to common practice, where the model is assigned a single initial system instruction, we invoke multiple system instructions dynamically throughout the chat. This mechanism allows us greater flexibility and control over the model, and enables the side-chats (see below).
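A minimal sketch of a single turn of this three-party interaction is given below, under the assumption that the model's responses are structured as JSON "API calls" that the system dispatches. The call names, the llm_chat helper, and the handler interface are hypothetical and only illustrate the orchestration pattern, including the per-stage choice of system instruction.

```python
import json

def run_cpe_turn(llm_chat, history: list[dict], system_instruction: str, handlers: dict) -> list[dict]:
    # The system instruction is selected per stage rather than fixed once at startup.
    messages = [{"role": "system", "content": system_instruction}] + history
    response = llm_chat(messages)                    # e.g., a call to Llama-3-70B
    call = json.loads(response)                      # expected form: {"api": "...", "args": {...}}
    result = handlers[call["api"]](**call["args"])   # the system executes the requested action
    # Some calls (e.g., asking the user a question) surface a message in the UI;
    # others (e.g., updating the instruction) only modify internal state.
    if result is not None:
        history.append({"role": "assistant", "content": result})
    return history
```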
We use CoT (Wei et al., 2022) in side-chats at various stages where more careful guidance of the model is needed. For example, after the user provides feedback on the prompt's outputs, the model needs to determine whether and how to refine the instruction accordingly. Due to the complexity of this stage, we apply CoT by first asking the model to summarize the comments made by the user in a side-chat (showing only those comments as context); once these summaries are available, we ask the model to use them to suggest a new instruction.
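The sketch below illustrates this two-step side-chat pattern. The prompt wording and the llm_chat helper are illustrative assumptions rather than the system's actual prompts.

```python
# A minimal sketch of the CoT side-chat: first summarize the user's feedback
# in isolation, then use the summary to propose a refined instruction.

def refine_instruction_with_cot(llm_chat, user_comments: list[str], current_instruction: str) -> str:
    # Step 1: side-chat that sees only the user's comments and summarizes them.
    summary = llm_chat([
        {"role": "system", "content": "Summarize the user's feedback on the generated outputs."},
        {"role": "user", "content": "\n".join(user_comments)},
    ])
    # Step 2: use the feedback summary to suggest a new instruction.
    return llm_chat([
        {"role": "system", "content": "Revise the task instruction based on the feedback summary."},
        {"role": "user", "content": f"Current instruction:\n{current_instruction}\n\nFeedback summary:\n{summary}"},
    ])
```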