Guiding Large Language Models via Directional Stimulus Prompting

Paper · arXiv 2302.11520 · Published February 22, 2023

“Since directly optimizing LLMs for specific tasks is either inefficient or infeasible for most users and developers, researchers resort to optimizing prompts instead. Prompt engineering approaches, which involve manually or automatically designing optimal task-specific natural language instructions and selecting appropriate training samples for demonstration in the prompt, have been the focus of many researchers [6, 55, 79, 39]. Despite these efforts, effectively steering LLMs to generate desired results and effectively exploiting labeled data remain significant challenges.

“To address this challenge, we propose a novel framework called Directional Stimulus Prompting (DSP). This framework introduces a new component called the “directional stimulus” into the prompt to provide nuanced, instance-specific guidance and control over LLMs. Specifically, the directional stimulus prompt acts as “hints” and “clues” for the input query to guide LLMs toward the desired output. Notably, this differs from methods that augment LLMs with additional knowledge retrieved from external sources [25, 60], as the directional stimulus prompt is generated solely from the input query in our framework. Figure 1 compares our proposed prompting approach, DSP, with standard prompting for the summarization task. Our approach incorporates keywords into the prompt as the directional stimulus to hint at key points the desired summary should cover. By providing this instance-specific guidance through the directional stimulus prompt, LLMs can generate outputs that more closely align with the desired reference summary.”
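As a rough illustration of the difference the paper describes for summarization, the sketch below contrasts a standard prompt with a DSP-style prompt that embeds keywords as the directional stimulus. The prompt wording and function names here are illustrative assumptions, not the paper's exact templates.

```python
def standard_prompt(article: str) -> str:
    """Standard prompting: the input query alone, with a task instruction."""
    return f"Article: {article}\nSummarize the article briefly."

def dsp_prompt(article: str, keywords: list[str]) -> str:
    """DSP-style prompting: keywords serve as the directional stimulus,
    hinting at key points the desired summary should cover."""
    hint = "; ".join(keywords)
    return (
        f"Article: {article}\n"
        f"Keywords: {hint}\n"
        "Summarize the article briefly, covering the keywords above."
    )
```

The only change relative to standard prompting is the extra `Keywords:` line, which carries the instance-specific guidance.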

“To this end, our Directional Stimulus Prompting (DSP) approach introduces a small sequence of discrete tokens z, named the “directional stimulus,” into the prompt, which acts as hints and clues that provide LLMs with fine-grained guidance toward the desired direction. For example, for the summarization task, the directional stimulus z might consist of keywords that should be included in the desired summary. To generate this stimulus for each input query, we use a small tunable policy language model, p_POL(z|x). We then use the generated stimulus z, along with the original input x, to construct the prompt that steers the LLM toward generating its output, p_LLM(y|x, z), through black-box API calls. It is important to note that the parameters of the LLM, p_LLM, are not accessible or tunable. Overall, when using the LLM with DSP to perform a downstream task, the output is obtained via y ∼ p_LLM(·|x, z), z ∼ p_POL(·|x).”
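The two-stage sampling described above (draw z from the policy LM, then draw y from the frozen LLM conditioned on x and z) can be sketched as follows. Both models are stand-ins here: in the paper the policy is a small tunable language model trained with supervised fine-tuning and RL, and the LLM is a black-box API; the trivial functions below only mimic the interfaces.

```python
from typing import Callable, List

def policy_lm(x: str) -> List[str]:
    # z ~ p_POL(·|x): generate a directional stimulus for this input.
    # Placeholder heuristic: pick the three longest distinct words as
    # "keywords"; the real policy is a small tunable LM.
    words = sorted(set(x.split()), key=len, reverse=True)
    return words[:3]

def call_llm(prompt: str) -> str:
    # y ~ p_LLM(·|x, z): black-box LLM whose parameters are frozen.
    # Placeholder that echoes the prompt's last line instead of
    # making a real API call.
    return prompt.splitlines()[-1]

def dsp_generate(x: str,
                 policy: Callable[[str], List[str]],
                 llm: Callable[[str], str]) -> str:
    z = policy(x)  # stimulus from the policy model
    prompt = f"{x}\nHint keywords: {', '.join(z)}\nSummary:"
    return llm(prompt)  # output conditioned on both x and z
```

Because only `policy_lm` is tunable, any optimization (supervised or RL-based) targets the stimulus generator while the LLM is queried purely through its API.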