For instance, the question “When did Marie Curie discover Uranium?” cannot be answered as a typical when-question without addressing the false assumption “Marie Curie discovered Uranium.” In this work, we…
Deliberation processes are important mechanisms for collaborative decision-making, fostering informed choices across a wide array of domains (Vaculín et al., 2013; Owen, 2015). Traditionally, these pr…
Large-scale survey tools enable the collection of citizen feedback in opinion corpora. Extracting the key arguments from a large and noisy set of opinions helps in understanding the opinions quickly a…
little is usually known about the model’s stability and consistency when deployed in real-world settings. In this paper, we propose a robustness evaluation framework to guide the design of rigorous ar…
Large language models (LLMs) are increasingly used for persuasion, such as in political communication and marketing, where they affect how people think, choose, and act. Yet, empirical findings on the…
For Large Language Models (LLMs) to be reliably deployed in both everyday and high-stakes domains, knowing when not to answer is as critical as answering correctly. Real-world user queries, which…
Rather than just being fine-tuned to chase leaderboards on assessment tasks, LLMs need to be instructed systematically with argumentation theories and scenarios as well as with ways to solve arg…
Large Language Models (LLMs) have revolutionized various Natural Language Generation (NLG) tasks, including Argument Summarization (ArgSum), a key subfield of Argument Mining (AM). This paper investig…
The diversity of knowledge encoded in large language models (LLMs) and their ability to apply this knowledge zero-shot in a range of settings make them a promising candidate for use in decision-making…
In this paper we explore a new theory of discourse structure that stresses the role of purpose and processing in discourse. In this theory, discourse structure is composed of three separate but interr…
We present Attentive Reasoning Queries (ARQs), a novel structured reasoning approach that significantly improves instruction-following in Large Language Models through domain-specialized reasoning blu…
Large language models (LLMs) have recently shown impressive performance on tasks involving reasoning, leading to a lively debate on whether these models possess reasoning capabilities similar to human…
Abstract: Large Language Models (LLMs) have demonstrated significant capabilities in understanding and generating human language, contributing to more natural interactions with complex systems. Howeve…
In the context of AI-based decision support systems, explanations can help users to judge when to trust the AI’s suggestion, and when to question it. In this way, human oversight can prevent AI errors…
Communication among humans relies on conversational grounding, allowing interlocutors to reach mutual understanding even when they do not have perfect knowledge and must resolve discrepancies in each …
propose tasks measuring LLMs’ ability to (1) distinguish between strong and weak arguments, (2) predict stances based on beliefs and demographic characteristics, and (3) determine the appeal of an arg…
Human annotation variation (i.e., annotation disagreements) is common in NLP and often reflects important information such as task subjectivity and sample ambiguity. While Large Language Models (LLMs)…
Our divide-and-conquer approach breaks down play-by-play data into smaller, more manageable segments, solves each piece individually, and then aggregates the results. Besides the divide-and-conquer …
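A minimal sketch of this pattern follows, assuming a hypothetical `summarize_segment` stand-in for the per-segment solver (e.g., an LLM call); it is not the paper's actual implementation:

```python
# Illustrative divide-and-conquer over play-by-play events; the segment
# size and summarize_segment() placeholder are assumptions.

def chunk(events, size):
    """Split a play-by-play event list into fixed-size segments."""
    return [events[i:i + size] for i in range(0, len(events), size)]

def summarize_segment(segment):
    # Placeholder for the per-segment solver (e.g., an LLM prompt).
    return " | ".join(segment)

def divide_and_conquer(events, size=50):
    partials = [summarize_segment(seg) for seg in chunk(events, size)]
    # Aggregation step: concatenate partial solutions; a second pass
    # could fuse them into a final summary.
    return "\n".join(partials)
```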
The general AM problem can be split into three main tasks: 1) argument identification, involving segmenting text into units and determining which are argumentative; 2) identification of argumentative …
Chain-of-Thought (CoT) prompting plays an indispensable role in endowing large language models (LLMs) with complex reasoning capabilities. However, CoT currently faces two fundamental challenges: (1) …
Stance detection is an active task in natural language processing (NLP) that aims to identify the author’s stance towards a particular target within a text. Given the remarkable language understanding…
Narrative comprehension of long stories and novels has been a challenging domain, owing to their intricate plotlines and entangled, often evolving relations among characters and entities. Given th…
Instruction following has catalyzed the recent era of Large Language Models (LLMs) and is the foundational skill underpinning more advanced capabilities such as reasoning and agentic behaviors. As tas…
Argument Mining (AM) is the task of automatically analysing arguments, such that the unstructured information contained in them is converted into structured representations. Undercut is a unique struc…
Studies have underscored how, despite recent breakthroughs and swift advances in AI research, even state-of-the-art Large Language Models (LLMs) continue to struggle when performing logical a…
Those models take a contrastive learning approach, building binary classifiers to differentiate positive (coherent) examples from negative (incoherent) dialogues. Those classifiers are usu…
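The data construction behind such classifiers can be sketched as follows; shuffling turn order is one common way to generate negatives and stands in here for whatever perturbation a given model uses:

```python
# Sketch of building a contrastive training set for dialogue coherence:
# original turn orders are positives (label 1); shuffled orders serve as
# synthetic negatives (label 0). The perturbation choice is illustrative.
import random

def make_contrastive_pairs(dialogues, seed=0):
    rng = random.Random(seed)
    pairs = []
    for turns in dialogues:
        pairs.append((list(turns), 1))  # coherent original
        shuffled = list(turns)
        rng.shuffle(shuffled)           # incoherent counterpart
        pairs.append((shuffled, 0))
    return pairs
```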
Recent work has made a preliminary attempt to use large language models (LLMs) to solve the stance detection task, showing promising results. However, considering that stance detection usually require…
The significance of creating AI agents that can establish trust and accountability by interacting with human users is constantly increasing. This is the central idea behind the field of explainable A…
Common methods for aligning large language models (LLMs) with desired behaviour heavily rely on human-labelled data. However, as models grow increasingly sophisticated, they will surpass human experti…
The recent Touché lab’s argument retrieval task focuses on controversial topics like ‘Should bottled water be banned?’ and asks systems to retrieve relevant pro/con arguments. Interestingly, the most effectiv…
Decision-making is a fundamental capability in everyday life. Large Language Models (LLMs) provide multifaceted support in enhancing human decision-making processes. However, understanding the influen…
Large language models (LLMs) with extended context windows show promise for complex legal reasoning tasks, yet their ability to understand long legal documents remains insufficiently evaluated. Develo…
Conspiracy theories are a paradigmatic example of beliefs that, once adopted, are extremely difficult to dispel. Influential psychological theories propose that conspiracy beliefs are uniquely resist…
EVINCE (Entropy and Variation IN Conditional Exchanges) is a novel framework for optimizing multi-LLM dialogues using conditional statistics and information theory. It addresses limitations in multi-a…
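As a rough illustration of the kind of quantities such a framework operates on, one can compute per-agent entropy and cross-agent divergence over answer distributions; how EVINCE actually combines these signals is not reproduced here:

```python
# Illustrative information-theoretic signals for a two-agent exchange;
# the distributions and the decision heuristic are toy assumptions.
import numpy as np
from scipy.stats import entropy

p_a = np.array([0.7, 0.2, 0.1])  # agent A's distribution over answer options
p_b = np.array([0.4, 0.4, 0.2])  # agent B's distribution

h_a, h_b = entropy(p_a), entropy(p_b)  # each agent's uncertainty
kl_ab = entropy(p_a, p_b)              # divergence between the agents

# High divergence suggests another exchange round may still be informative.
print(f"H(A)={h_a:.3f}, H(B)={h_b:.3f}, KL(A||B)={kl_ab:.3f}")
```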
The recent advent of reasoning models like OpenAI’s o1 was met with excited speculation by the AI community about the mechanisms underlying these capabilities in closed models, followed by a rush of r…
However, McKenna et al. (2023a) have recently pointed out that LLMs are severely affected by an attestation bias when performing inference tasks. Given the question of whether premise P entails hypothe…
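A minimal probe of this bias contrasts the same entailment template filled with an attested versus a made-up entity; the prompt wording below is our illustration, not the paper's:

```python
# Toy attestation-bias probe: identical logical structure, differing only
# in whether the entities are attested. A biased model answers the two
# prompts differently even though the entailment relation is the same.

def entailment_prompt(premise, hypothesis):
    return (f"Premise: {premise}\nHypothesis: {hypothesis}\n"
            "Does the premise entail the hypothesis? Answer yes or no.")

attested = entailment_prompt("Google acquired YouTube.",
                             "Google owns YouTube.")
fictional = entailment_prompt("Zorblat Inc. acquired Quvix Ltd.",
                              "Zorblat Inc. owns Quvix Ltd.")
```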
Argumentative Relation Classification is the task of determining the relationship between two contributions in the context of an argumentative dialogue. Existing models in the literature rely on a com…
Computational argumentation has become an essential tool in various domains, including law, public policy, and artificial intelligence. It is an emerging research field in natural language processing …
Decision conferences are structured, collaborative meetings that bring together experts from various fields to address complex issues and reach a consensus on recommendations for future actions or pol…
Several recent works study automatic hallucination detection (Min et al., 2023) or output editing (Gao et al., 2022) to address such LM hallucinations. These systems typically categorize hallucinati…
In this paper, we present our framework for DialAM-2024 Task A: Identification of Propositional Relations and Task B: Identification of Illocutionary Relations. The goal of Task A is to detect argumen…
When diligent professionals make decisions, they validate their analyses. Increasingly, professionals across domains use generative AI (GenAI) for analytic knowledge work and therefore validate the AI…
Most traditional AI safety research views models as machines and centers on algorithm-focused attacks developed by security experts. As large language models (LLMs) become increasingly common and comp…
This paper investigates the rational thinking capability of Large Language Models (LLMs) in multi-round argumentative debates by exploring the impact of fallacious arguments on their logical reasonin…
Projective content is utterance content that a speaker may be taken to be committed to even when the expression associated with the content occurs embedded under an entailment-canceling operator (e.g.…
annotate dialogue based on the inference anchoring theory (IAT). The task can be split into two parts: identification of propositional relations and identification of illocutionary relations. We propo…
However, explanations generated via CoT are susceptible to content biases that negatively affect their robustness and faithfulness. To mitigate existing limitations, recent work has proposed the use o…
Our key contributions are: 1) We conduct the first investigation of the feasibility of using LLMs in intelligence analysis, where both evidence-based reasoning and analytical creativity are of utmost …
What does it truly mean for a language model to “reason” strategically, and can scaling up alone guarantee intelligent, context-aware decisions? Strategic decision-making requires adaptive reasoning, …
While a few methods improve content, they solely transfer the style of texts to be more formal (Rao and Tetreault, 2018; Lai et al., 2021), less subjective (Pryzant et al., 2020; Liu et al., 2021a), o…
Large reasoning models (LRMs) tackle complex reasoning problems by following long chains-of-thought (Long CoT) that incorporate reflection, backtracking, and self-validation. However, the training tech…
These implicit assumptions, known as presuppositions, refer to background knowledge or shared beliefs assumed to be part of the common ground between interlocutors (Stalnaker, 1973). Presuppositions a…
Abstract. Large Language Models (LLMs) are already as persuasive as humans. However, we know very little about how they do it. This paper investigates the persuasion strategies of LLMs, comparing them…
Large Language Models (LLMs) have shown human-like reasoning abilities but still struggle with complex logical problems. This paper introduces a novel framework, LOGICLM, which integrates LLMs with sy…
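Conceptually, the pipeline pairs an LLM translation step with an off-the-shelf solver; the sketch below uses sympy's SAT interface as the symbolic backend and a hard-coded `llm_translate` stand-in, neither of which is the paper's actual implementation:

```python
# Sketch of the LLM + symbolic solver pattern: the model translates a
# natural-language problem into a formula, and a solver checks entailment.
from sympy import symbols
from sympy.logic.boolalg import And, Implies, Not
from sympy.logic.inference import satisfiable

def llm_translate(_problem_text):
    # Stand-in for the LLM call; here: "If it rains, the street is wet.
    # It rains. Is the street wet?"
    rain, wet = symbols("rain wet")
    return And(Implies(rain, wet), rain), wet

premises, conclusion = llm_translate("...")
# Premises entail the conclusion iff (premises AND NOT conclusion) is UNSAT.
entailed = not satisfiable(And(premises, Not(conclusion)))
print(entailed)  # True
```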
Online discussion moderators must make ad hoc decisions about whether the contributions of discussion participants are appropriate or should be removed to maintain civility. Existing research on offens…
Expert explainers usually plan an explanation strategy by choosing appropriate explanation moves, dialogue acts, and topics to ensure optimal comprehension on the explainee side (Wachsmuth and Alshoma…
Multi-agent debate often wastes compute by using a fixed adversarial stance, aggregating without deliberation, or stopping on heuristics. We introduce MACI, an active controller with two independent d…
However, a systematic exploration of their dual capabilities to autonomously persuade and resist persuasion, particularly in contexts involving psychological rhetoric, is still lacking. In this paper,…
The current literature on presupposition focuses almost exclusively on the projection problem: the question of how and why the presuppositions of atomic clauses are projected to complex sentences whic…
Argumentation is the process by which humans rationally elaborate their thoughts and opinions in written (e.g., essays) or spoken (e.g., debates) contexts. Argument Mining research, however, has been …
A recurrent claim, coming from different approaches to pragmatics, argumentation theory and related disciplines, is that informative presuppositions have a special persuasive force. My aim in this pap…
Large language models (LLMs) have accomplished remarkable reasoning performance in various domains. However, in the domain of reasoning tasks, we discover a frailty: LLMs are surprisingly brittle to t…
Best practice and descriptive research claim that presuppositions, such as the word “too”, increase the persuasiveness of arguments. Surprisingly, there is hardly any causal evidence for this claim. …
Abstract—This paper presents a systematic approach to using the Socratic method in developing prompt templates that effectively interact with large language models, including GPT-3. Various methods a…
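One way such a template might look in practice is sketched below; the move sequence and wording are our illustration rather than a template from the paper:

```python
# Illustrative Socratic-style prompt template; the steps are assumptions.
SOCRATIC_TEMPLATE = """You are a Socratic interlocutor examining a claim.
Claim: {claim}
1. Ask one clarifying question about the claim's key terms.
2. Surface one hidden assumption behind the claim.
3. Propose a counterexample that tests that assumption.
4. Restate the claim, revised in light of the counterexample."""

def build_socratic_prompt(claim: str) -> str:
    return SOCRATIC_TEMPLATE.format(claim=claim)

print(build_socratic_prompt("All persuasive arguments appeal to emotion."))
```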
David Chalmers. I will argue for the importance of a special sort of interpretability, which I call propositional interpretability. This involves interpreting a system’s mechanisms and behavior in ter…
The central problem of IR systems, also referred to as the “holy grail” of IR, is to overcome the vocabulary mismatch between the user and the system [75]. This leads to the challenge of matching the…
Abstract: Reasoning requires going beyond pattern matching or memorization of solutions to identify and implement “algorithmic procedures” that can be used to deduce answers to hard problems. Doing so…
Large Language Models (LLMs) still struggle with natural language reasoning tasks. Motivated by the society of minds (Minsky, 1988), we propose RECONCILE, a multi-model multi-agent framework designed a…
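Schematically, such a round-table loop can be sketched as below, where each agent is assumed to return an (answer, confidence) pair; the convergence test and confidence-weighted vote are simplifications, not the paper's exact protocol:

```python
# Schematic multi-agent round table: agents answer, see each other's
# answers, and revise until consensus or a round limit; the final answer
# is chosen by confidence-weighted voting. Agent internals are assumed.
from collections import defaultdict

def round_table(agents, question, max_rounds=3):
    answers = {a.name: a.answer(question, context=[]) for a in agents}
    for _ in range(max_rounds):
        if len({ans for ans, _ in answers.values()}) == 1:
            break  # consensus reached
        shared = [f"{name}: {ans}" for name, (ans, _) in answers.items()]
        answers = {a.name: a.answer(question, context=shared) for a in agents}
    votes = defaultdict(float)
    for ans, conf in answers.values():
        votes[ans] += conf  # confidence-weighted vote
    return max(votes, key=votes.get)
```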
Large Language Models (LLMs) have shown remarkable progress across domains, yet their ability to perform inductive reasoning—inferring latent rules from sparse examples—remains limited. It is often as…
In this paper, we conduct a systematic evaluation of three state-of-the-art reasoning models, i.e., OpenAI’s o4-mini, Claude-3.7-Sonnet and Gemini-2.5-Flash, across three multimodal benchmarks: MMMU, …
Human reasoning involves different strategies, each suited to specific problems. Prior work shows that large language models (LLMs) tend to favor a single reasoning strategy, potentially limiting their…
Generating unbiased summaries in real-world settings such as political perspective summarization remains a crucial application of Large Language Models (LLMs). Yet, existing evaluation frameworks rely…
Preceding work in natural language processing (NLP) and computational linguistics (CL) has mostly focused on practical AQ assessment, considering either the overall quality of arguments (Toledo et al.…
Modern AI systems are notoriously opaque, limiting efforts to understand or audit their behaviors [42, 188]. In response, Explainable Artificial Intelligence (XAI) aims to foster trust and accountabil…
To address these issues, in this paper, we propose SAILER, a new Structure-Aware pre-traIned language model for LEgal case Retrieval. It is highlighted in the following three aspects: (1) SAILER fully…
Self-improving systems require environmental interaction for continuous adaptation. We introduce SPICE (Self-Play In Corpus Environments), a reinforcement learning framework where a single model acts …
Abstract—Topic discovery in scientific literature provides valuable insights for researchers to identify emerging trends and explore new avenues for investigation, facilitating easier scientific infor…
Iterative self-reflection (Shinn et al., 2023; Madaan et al., 2023) is another approach that has recently gained significant attention within the NLP community. This method involves models mimicking h…
Large language models (LLMs) have demonstrated strong potential in clinical question answering, with recent multi-agent frameworks further improving diagnostic accuracy via collaborative reasoning. Ho…
We evaluate LLMs’ language understanding capacities on simple inference tasks that most humans find trivial. Specifically, we target (i) grammatically-specified entailments, (ii) premises with evident…
Large language models (LLMs), while promising, face criticisms for biases, hallucinations, and a lack of reasoning capability. This paper introduces SocraSynth, a multi-LLM agent reasoning platform de…
We establish two biases originating from pretraining which predict much of their behavior, and show that these are major sources of hallucination in generative LLMs. First, memorization at the level o…
The implementation of prompting strategies represents a significant departure from traditional NLP model training methods. By employing these strategies, LLMs can generate predictions without the exte…
This paper introduces an approach that uses pretrained LLMs with few-shot chain-of-thought examples to enable strategic reasoning for AI agents. Our approach uses systematically generated demonstratio…
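Assembling such a prompt from generated demonstrations might look like the following sketch; the state/reasoning/action schema is an assumption for illustration:

```python
# Sketch of building a few-shot chain-of-thought prompt for strategic
# reasoning; the demonstration schema below is illustrative only.
DEMOS = [
    {"state": "Opponent defected last round.",
     "reasoning": "Tit-for-tat retaliates once, then forgives.",
     "action": "defect"},
    {"state": "Opponent cooperated last round.",
     "reasoning": "Reciprocate cooperation to sustain mutual gain.",
     "action": "cooperate"},
]

def build_cot_prompt(demos, new_state):
    blocks = [f"State: {d['state']}\nReasoning: {d['reasoning']}\n"
              f"Action: {d['action']}" for d in demos]
    blocks.append(f"State: {new_state}\nReasoning:")
    return "\n\n".join(blocks)
```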
We propose a novel end-to-end fine-tuning approach, Probabilistic Constraint Training (PCT), that utilizes probabilistic logical rules as constraints in the fine-tuning phase without relying on these …
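In spirit, the idea is a task loss regularized by a soft measure of rule violation; the product t-norm relaxation below is our assumption about one way to make such rules differentiable, not PCT's published formulation:

```python
# Sketch of constraint-regularized fine-tuning: a soft implication
# penalty is added to the task loss. The product t-norm relaxation and
# the weighting scheme are illustrative assumptions.
import torch

def implication_penalty(p_premise, p_conclusion):
    # Soft violation of "premise -> conclusion": probability mass where
    # the premise holds but the conclusion does not.
    return (p_premise * (1.0 - p_conclusion)).mean()

def constrained_loss(task_loss, p_premise, p_conclusion, lam=0.5):
    return task_loss + lam * implication_penalty(p_premise, p_conclusion)

# Usage with dummy tensors:
p_prem = torch.sigmoid(torch.randn(8))
p_conc = torch.sigmoid(torch.randn(8))
loss = constrained_loss(torch.tensor(1.2), p_prem, p_conc)
```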
Reasoning is a crucial part of natural language argumentation. To comprehend an argument, one must analyze its warrant, which explains why its claim follows from its premises. As arguments are highly …
Large language models (LLMs) encapsulate vast amounts of knowledge but still remain vulnerable to external misinformation. Existing research has mainly studied this susceptibility behavior in a single-tur…
There are widespread fears that conversational AI could soon exert unprecedented influence over human beliefs. Here, in three large-scale experiments (N=76,977), we deployed 19 LLMs—including some pos…
Influential critiques argue that Large Language Models (LLMs) are a dead end for AGI: “mere pattern matchers” structurally incapable of reasoning or planning. We argue this conclusion misidentifies th…
Large language models systematically fail when a salient surface cue conflicts with an unstated feasibility constraint. We study this through a diagnose–measure–bridge–treat framework. Causal-behavior…
Douglas N. Walton. Abstract: Appeals to emotion (pity, fear, popular sentiment) and _ad hominem_ attacks are commonly used in argumentation. Instead of dismissing these appeals as fallacious wherever t…
Large language models (LLMs) are excellent at maintaining high-level, convincing dialogue, but it remains unclear whether their persuasive success reflects genuine understanding of the discourse. We e…
Abstract. Some accounts of presupposition projection predict that content’s consistency with the Common Ground influences whether it projects (e.g., Heim 1983; Gazdar 1979a,b). I conducted an experime…
Large language models (LLMs) have shown strong performance across natural language reasoning tasks, yet their reasoning processes remain brittle and difficult to interpret. Prompting techniques like C…
in the standard alignment framework they lack the basic ability to think explicitly before answering. Thinking is important for complex questions that require reasoning and planning, but can be appl…
We argue that analyzing reasoning traces at the sentence level is a promising approach to understanding reasoning processes. We present three complementary attribution methods: (1) a black-box method …
we advocate for the development of conversational technology that is inherently designed to support and facilitate argumentative processes. We argue that, at present, large language models (LLMs) are …
Representing discourse as argument graphs facilitates robust analysis. Although computational frameworks for constructing graphs from monologues exist, there is a lack of frameworks for parsing dialog…
The widespread use of social media has led to a surge in popularity for automated methods of analyzing public opinion. Supervised methods are adept at text categorization, yet the dynamic nature of so…
Writing persuasive arguments is a challenging task for both humans and machines. It entails incorporating high-level beliefs from various perspectives on the topic, along with deliberate reasoning and…
Recently, large language models (LLMs) have been applied automatically to annotate legal case texts from particular legal domains in terms of factors from pre-existing factor lists. In this paper, we …
Artificial intelligence systems are transforming scientific discovery by accelerating specific research tasks, from protein structure prediction to materials design, yet remain confined to narrow doma…
Large Language Models (LLMs) have been shown to be highly persuasive, but when and why they outperform humans is still an open question. We compare the persuasiveness of two LLMs (Claude 3.5 Sonnet an…
Reinforcement learning with verifiable rewards (RLVR) has facilitated significant advances in large language models (LLMs), particularly for reasoning tasks with objective, ground-truth answers, such …