Do LLM arguments actually argue better than humans?
LLM counter-arguments score higher on textbook quality markers like logical soundness and respectful tone, while human arguments show more creativity and emotional intensity. What does this gap reveal about how we measure argumentative quality?