Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing

Paper · arXiv 2107.13586 · Published July 28, 2021
Tags: Prompts · Prompting

This paper surveys and organizes research works in a new paradigm in natural language processing, which we dub “prompt-based learning”. Unlike traditional supervised learning, which trains a model to take in an input x and predict an output y as P(y|x), prompt-based learning is based on language models that model the probability of text directly. To use these models to perform prediction tasks, the original input x is modified using a template into a textual string prompt x′ that has some unfilled slots, and then the language model is used to probabilistically fill the unfilled information to obtain a final string x̂, from which the final output y can be derived. This framework is powerful and attractive for a number of reasons: it allows the language model to be pre-trained on massive amounts of raw text, and by defining a new prompting function the model is able to perform few-shot or even zero-shot learning, adapting to new scenarios with few or no labeled data. In this paper we introduce the basics of this promising paradigm, describe a unified set of mathematical notations that can cover a wide variety of existing work, and organize existing work along several dimensions, e.g., the choice of pre-trained models, prompts, and tuning strategies.
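The three steps the abstract describes (prompt addition, answer search, answer mapping) can be made concrete with a short sketch. The example below is illustrative only: it assumes a sentiment-classification task, a hand-written cloze template, a hand-written verbalizer, and a masked language model served through the Hugging Face `transformers` fill-mask pipeline; the template, label words, and model choice are assumptions, not the paper's reference implementation.

```python
# A minimal sketch of prompt-based prediction, assuming a sentiment task
# and bert-base-uncased as the pre-trained LM (illustrative choices only).
from transformers import pipeline

# Step 1: Prompt addition -- wrap the input x in a template with an
# unfilled slot ([MASK]) to form the prompt x'.
def make_prompt(x: str) -> str:
    return f"{x} Overall, it was a [MASK] movie."

# Step 2: Answer search -- let the LM score candidate fillers for the slot,
# yielding the filled string x-hat.
fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# Step 3: Answer mapping -- map the highest-scoring answer z back to an
# output label y via a hand-written verbalizer (hypothetical label words).
verbalizer = {"great": "positive", "good": "positive",
              "terrible": "negative", "bad": "negative"}

def predict(x: str) -> str:
    prompt = make_prompt(x)
    # Restrict the search to the verbalizer's answer space.
    candidates = fill_mask(prompt, targets=list(verbalizer))
    best = max(candidates, key=lambda c: c["score"])
    return verbalizer[best["token_str"]]

print(predict("I love this film."))  # expected: "positive"
```

Note that this corresponds to the “tuning-free prompting” setting from Section 7: no parameters are updated, and all task-specific knowledge lives in the template and the verbalizer.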


Contents

1  Two Sea Changes in NLP
2  A Formal Description of Prompting
   2.1  Supervised Learning in NLP
   2.2  Prompting Basics
        2.2.1  Prompt Addition
        2.2.2  Answer Search
        2.2.3  Answer Mapping
   2.3  Design Considerations for Prompting
3  Pre-trained Language Models
   3.1  Training Objectives
   3.2  Noising Functions
   3.3  Directionality of Representations
   3.4  Typical Pre-training Methods
        3.4.1  Left-to-Right Language Model
        3.4.2  Masked Language Models
        3.4.3  Prefix and Encoder-Decoder
4  Prompt Engineering
   4.1  Prompt Shape
   4.2  Manual Template Engineering
   4.3  Automated Template Learning
        4.3.1  Discrete Prompts
        4.3.2  Continuous Prompts
5  Answer Engineering
   5.1  Answer Shape
   5.2  Answer Space Design Methods
        5.2.1  Manual Design
        5.2.2  Discrete Answer Search
        5.2.3  Continuous Answer Search
6  Multi-Prompt Learning
   6.1  Prompt Ensembling
   6.2  Prompt Augmentation
   6.3  Prompt Composition
   6.4  Prompt Decomposition
7  Training Strategies for Prompting Methods
   7.1  Training Settings
   7.2  Parameter Update Methods
        7.2.1  Promptless Fine-tuning
        7.2.2  Tuning-free Prompting
        7.2.3  Fixed-LM Prompt Tuning
        7.2.4  Fixed-prompt LM Tuning
        7.2.5  Prompt+LM Tuning
8  Applications
   8.1  Knowledge Probing
   8.2  Classification-based Tasks
   8.3  Information Extraction
   8.4  “Reasoning” in NLP
   8.5  Question Answering
   8.6  Text Generation
   8.7  Automatic Evaluation of Text Generation
   8.8  Multi-modal Learning
   8.9  Meta-Applications
   8.10 Resources
9  Prompt-relevant Topics
10 Challenges
   10.1  Prompt Design
   10.2  Answer Engineering
   10.3  Selection of Tuning Strategy
   10.4  Multiple Prompt Learning
   10.5  Selection of Pre-trained Models
   10.6  Theoretical and Empirical Analysis of Prompting
   10.7  Transferability of Prompts
   10.8  Combination of Different Paradigms
   10.9  Calibration of Prompting Methods
11 Meta Analysis
   11.1  Timeline
   11.2  Trend Analysis
12 Conclusion
A  Appendix on Pre-trained LMs
   A.1  Evolution of Pre-trained LM Parameters
   A.2  Auxiliary Objective
   A.3  Pre-trained Language Model Families