"Is ChatGPT a Better Explainer than My Professor?": Evaluating the Explanation Capabilities of LLMs in Conversation Compared to a Human Baseline

Paper · arXiv 2406.18512 · Published June 26, 2024


We found that the GPT-generated explainer responses were preferred over the human baseline, which underscores the challenge of effective science communication between experts and everyday people. Additionally, the annotators preferred the S2: GPT Standard responses over the S2: GPT w/ EA responses, mainly because the former were more concise.

This raises the question: How can explainers tailor their explanations to the explainee’s background and proficiency level to increase the explainee’s understanding of the topic?

We then compare this corpus to existing expert explanation dialogues (Wachsmuth and Alshomary, 2022), and we evaluate the effectiveness of pretrained language models in predicting the quality of explanation dialogues.