ProSA: Assessing and Understanding the Prompt Sensitivity of LLMs

Paper · arXiv 2410.12405 · Published October 16, 2024
Tags: Prompts · Prompting · Evaluations

Our extensive study, spanning multiple tasks, finds that prompt sensitivity varies across datasets and models, with larger models exhibiting greater robustness. Few-shot examples can alleviate this sensitivity, and subjective evaluations are also susceptible to prompt variations, particularly in complex, reasoning-oriented tasks.

Our findings suggest that prompt sensitivity is essentially a reflection of the model's confidence: the higher the model's confidence in its output, the more robust it is to semantic variations in the prompt.
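The two quantities discussed above can be made concrete with a minimal toy sketch. This is not the paper's actual metric; it simply assumes confidence is proxied by mean token log-probability and sensitivity by how often a model's answer changes across paraphrased prompts (all data and helper names here are hypothetical):

```python
from statistics import mean

def confidence(token_logprobs):
    """Toy confidence proxy: mean log-probability of the generated tokens."""
    return mean(token_logprobs)

def prompt_sensitivity(answers_per_variant):
    """Fraction of prompt paraphrases whose answer disagrees with the majority.

    0.0 means the model gives the same answer under every paraphrase
    (robust); values near 1.0 mean answers scatter widely (sensitive).
    """
    counts = {}
    for ans in answers_per_variant:
        counts[ans] = counts.get(ans, 0) + 1
    majority = max(counts.values())
    return 1 - majority / len(answers_per_variant)

# Toy data: one model's answers under four paraphrases of the same question.
robust_run = ["A", "A", "A", "A"]
fragile_run = ["A", "B", "A", "C"]

print(prompt_sensitivity(robust_run))   # 0.0
print(prompt_sensitivity(fragile_run))  # 0.5
print(confidence([-0.1, -0.2, -0.1]))   # high-confidence run (near 0)
print(confidence([-2.5, -3.1, -2.8]))   # low-confidence run (far from 0)
```

Under the paper's framing, one would expect the high-confidence run to pattern with the robust answer set and the low-confidence run with the fragile one.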