ProSA: Assessing and Understanding the Prompt Sensitivity of LLMs
Our extensive study, spanning multiple tasks, finds that prompt sensitivity varies across datasets and models, with larger models exhibiting greater robustness. We observe that few-shot examples can alleviate this sensitivity, and that subjective evaluations are also susceptible to prompt sensitivity, particularly in complex, reasoning-oriented tasks.
Our findings suggest that prompt sensitivity essentially reflects the model's confidence: higher confidence in its outputs correlates with greater robustness to semantic variations in the prompt.
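The confidence-robustness link above can be made concrete with a toy sketch: score each decoded answer by a confidence proxy (here, mean token probability, an assumption for illustration rather than the paper's PromptSensiScore metric), then measure sensitivity as the spread of that score across semantically equivalent prompt variants.

```python
import math

def sequence_confidence(token_logprobs):
    """Confidence proxy for one decoded answer: mean per-token probability.
    (Illustrative assumption; not the metric defined in ProSA.)"""
    return sum(math.exp(lp) for lp in token_logprobs) / len(token_logprobs)

def prompt_sensitivity(variant_confidences):
    """Sensitivity proxy: standard deviation of the confidence scores obtained
    from semantically equivalent prompt variants. A larger spread means the
    model's behavior depends more heavily on surface prompt wording."""
    mean = sum(variant_confidences) / len(variant_confidences)
    var = sum((c - mean) ** 2 for c in variant_confidences) / len(variant_confidences)
    return math.sqrt(var)

# A model that assigns identical confidence to every paraphrase is maximally
# robust under this proxy (sensitivity 0); diverging confidences raise it.
robust = prompt_sensitivity([0.9, 0.9, 0.9])
fragile = prompt_sensitivity([0.9, 0.5, 0.7])
```

Under this sketch, the paper's claim corresponds to the observation that variant sets with uniformly high confidence also show low spread, i.e. low sensitivity.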