NLP and Linguistics
Related topics:
- (QA)2: Question Answering with Questionable Assumptions. For instance, the question “When did Marie Curie discover Uranium?” cannot be answered as a typical when question without addressing the false assumption “Marie Curie discovered Uranium”. In this work, we…
- A Non-Factoid Question-Answering Taxonomy. Categories: INSTRUCTION, REASON, EVIDENCE-BASED, COMPARISON, EXPERIENCE, DEBATE. INSTRUCTION: You want to understand the procedure/method of doing/achieving something. Instructions/guidelines provided in a step-…
- A comprehensive taxonomy of hallucinations in Large Language Models. This report provides a comprehensive taxonomy of LLM hallucinations, beginning with a formal definition and a theoretical framework that posits its inherent inevitability in computable LLMs, irrespect…
- A meta-analysis of the persuasive power of large language models. Large language models (LLMs) are increasingly used for persuasion, such as in political communication and marketing, where they affect how people think, choose, and act. Yet, empirical findings on the…
- A recipe for annotating grounded clarifications. In order to interpret the communicative intents of an utterance, it needs to be grounded in something that is outside of language; that is, grounded in world modalities. In this paper we argue that di…
- ACE: Abstractions for Communicating Efficiently. A central but unresolved aspect of problem-solving in AI is the capability to introduce and use abstractions, something humans excel at. Work in cognitive science has demonstrated that humans tend tow…
- ANAPHORA RESOLUTION: THE STATE OF THE ART. The "pointing back" (reference) is called an anaphor and the entity to which it refers is its antecedent. The process of determining the antecedent of an anaphor is called anaphora resolution. Usually…
- Adam's Law: Textual Frequency Law on Large Language Models. While textual frequency has been validated as relevant to human cognition in reading speed, its relatedness to Large Language Models (LLMs) is seldom studied. We propose a novel research direction in …
- Argument Quality Assessment in the Age of Instruction-Following Large Language Models. Rather than just fine-tuning LLMs towards leaderboard chasing on assessment tasks, they need to be instructed systematically with argumentation theories and scenarios as well as with ways to solve arg…
- Assessment of Personality Dimensions Across Situations Using Conversational Speech. Prior research indicates that users prefer assistive technologies whose personalities align with their own. This has sparked interest in automatic personality perception (APP), which aims to …
- Attention, Intentions, And The Structure Of Discourse. In this paper we explore a new theory of discourse structure that stresses the role of purpose and processing in discourse. In this theory, discourse structure is composed of three separate but interr…
- Automatic Extraction of Metaphoric Analogies from Literary Texts: Task Formulation, Dataset Construction, and Evaluation. Extracting metaphors and analogies from free text requires high-level reasoning abilities such as abstraction and language understanding. Our study focuses on the extraction of the concepts that form …
- Between Circuits and Chomsky: Pre-pretraining on Formal Languages Imparts Linguistic Biases. Pretraining language models on formal language can improve their acquisition of natural language. Which features of the formal language impart an inductive bias that leads to effective transfer? Drawi…
- Beyond Passive Critical Thinking: Fostering Proactive Questioning to Enhance Human-AI Collaboration. Critical thinking is essential for building robust AI systems, preventing them from blindly accepting flawed data or biased reasoning. However, prior work has primarily focused on passive critical thi…
- Bigger is not always better: The importance of human-scale language modeling for psycholinguistics. …scaling has several downsides for both computational psycholinguistics and natural language processing research. We discuss the scientific challenges presented by the scaling paradigm, as well as the …
- CDW-CoT: Clustered Distance-Weighted Chain-of-Thoughts Reasoning. Large Language Models (LLMs) have recently achieved impressive results in complex reasoning tasks through Chain of Thought (CoT) prompting. However, most existing CoT methods rely on using the same pr…
- Can AI Explanations Make You Change Your Mind? In the context of AI-based decision support systems, explanations can help users to judge when to trust the AI’s suggestion, and when to question it. In this way, human oversight can prevent AI errors…
- Can LLMs Generate Novel Research Ideas? A Large-Scale Human Study with 100+ NLP Researchers. Recent advancements in large language models (LLMs) have sparked optimism about their potential to accelerate scientific discovery, with a growing number of works proposing research agents that autono…
- Can LLMs Ground when they (Don't) Know: A Study on Direct and Loaded Political Questions. Communication among humans relies on conversational grounding, allowing interlocutors to reach mutual understanding even when they do not have perfect knowledge and must resolve discrepancies in each …
- Can LLMs assist with Ambiguity? A Quantitative Evaluation of various Large Language Models on Word Sense Disambiguation. Ambiguous words are often found in modern digital communications. Lexical ambiguity challenges traditional Word Sense Disambiguation (WSD) methods, due to limited data. Consequently, the efficiency of…
- Can Large Language Models Understand Context? Understanding context is key to understanding human language, an ability which Large Language Models (LLMs) have been increasingly seen to demonstrate to an impressive extent. However, though the eval…
- Can Large Language Models perform Relation-based Argument Mining? The general AM problem can be split into three main tasks: 1) argument identification, involving segmenting text into units and determining which are argumentative; 2) identification of argumentative …
- Causal Sufficiency and Necessity Improves Chain-of-Thought Reasoning. Chain-of-Thought (CoT) prompting plays an indispensable role in endowing large language models (LLMs) with complex reasoning capabilities. However, CoT currently faces two fundamental challenges: (1) …
- Chain of Stance: Stance Detection with Large Language Models. Stance detection is an active task in natural language processing (NLP) that aims to identify the author’s stance towards a particular target within a text. Given the remarkable language understanding…
- ChatGPT Reads Your Tone and Responds Accordingly -- Until It Does Not -- Emotional Framing Induces Bias in LLM Outputs. Background: Large Language Models (LLMs) like GPT-4 tailor their responses not just to the content but also to the tone of user prompts. Prior work has hinted that emotional phrasing – whether optimis…
- ChatGPT: deconstructing the debate and moving it forward. In particular, we argue that the discussion about LLMs like ChatGPT reveals and assumes (1) an externalist and instrumentalist view of technology that presents technology as just a tool and, paradoxic…
- Climbing towards NLU: On Meaning, Form, and Understanding in the Age of Data. We argue that the language modeling task, because it only uses form as training data, cannot in principle lead to learning of meaning. We take the term language model to refer to any system trained on…
- Clustering-based Sampling for Few-Shot Cross-Domain Keyphrase Extraction. Keyphrase extraction is the task of identifying a set of keyphrases present in a document that captures its most salient topics. Scientific domain-specific pre-training has led to achieving state-of-t…
- Collaborative Rational Speech Act: Pragmatic Reasoning for Multi-Turn Dialog. In this paper, we introduce Collaborative Rational Speech Act (CRSA), an information-theoretic (IT) extension of RSA that models multi-turn dialog by optimizing a gain function adapted from rate-disto…
- Comparing Apples to Apples: Generating Aspect-Aware Comparative Sentences from User Reviews. Deciding on a product to purchase can be a time-consuming process. Every user has specific quality preferences, budget restrictions, or enjoys different item features. To distill important informatio…
- Complex Logical Instruction Generation. Instruction following has catalyzed the recent era of Large Language Models (LLMs) and is the foundational skill underpinning more advanced capabilities such as reasoning and agentic behaviors. As tas…
- Computational Modelling of Undercuts in Real-world Arguments. Argument Mining (AM) is the task of automatically analysing arguments, such that the unstructured information contained in them is converted into structured representations. Undercut is a unique struc…
- Computational structuralism: Toward a formal theory of meaning in the age of digital intelligence. The discovery that “next-token predictor” language models can fluently produce text has important but underappreciated theoretical implications. Most notably, their success demonstrates that fully rel…
- Conversation Chronicles: Towards Diverse Temporal and Relational Dynamics in Multi-Session Conversations. In the field of natural language processing, open-domain chatbots have emerged as an important research topic. However, a major limitation of existing open-domain chatbot research is its singular focu…
- Conversational DNA: A New Visual Language for Understanding Dialogue Structure in Human and AI. What if the patterns hidden within dialogue reveal more about communication than the words themselves? We introduce Conversational DNA, a novel visual language that treats any dialogue – whether betwe…
- Conversational Semantic Parsing for Dialog State Tracking. We consider a new perspective on dialog state tracking (DST), the task of estimating a user’s goal through the course of a dialog. By formulating DST as a semantic parsing task over hierarchical repre…
- DEAM: Dialogue Coherence Evaluation using AMR-based Semantic Manipulations. Those models take a contrastive learning approach, where they build binary classifiers to differentiate positive, or coherent examples from negative, or incoherent dialogues. Those classifiers are usu…
- Deal, or no deal (or who knows)? Forecasting Uncertainty in Conversations using Large Language Models. Effective interlocutors account for the uncertain goals, beliefs, and emotions of others. But even the best human conversationalist cannot perfectly anticipate the trajectory of a dialogue. How well c…
- Detecting Cognitive Distortions from Patient-Therapist Interactions. An important part of Cognitive Behavioral Therapy (CBT) is to recognize and restructure certain negative thinking patterns that are also known as cognitive distortions. This project aims to detect the…
- Detecting hallucinations in large language models using semantic entropy. Here we develop new methods grounded in statistics, proposing entropy-based uncertainty estimators for LLMs to detect a subset of hallucinations—confabulations—which are arbitrary and incorrect genera…
- Development and validation of large language model rating scales for automatically transcribed psychological therapy sessions. Rating scales have shaped psychological research, but are resource-intensive and can burden participants. Large Language Models (LLMs) offer a tool to assess latent constructs in text. This study intr…
- Diplomat: A Dialogue Dataset for Situated PragMATic Reasoning. We introduce a new benchmark, Diplomat, aiming at a unified paradigm for pragmatic reasoning and situated conversational understanding. Compared with previous works that treat different figurative ex…
- Discourse Structure and Dialogue Acts in Multiparty Dialogue: the STAC Corpus. Absolute frequency of the relations in the corpus (Total / Training / Testing): Comment 1851 / 1684 / 167; Clarification_question 260 / 240 / 20; Elaboration 869 / 771 / 98; Acknowledgment 1010 / 89…
- Discovering Latent Concepts Learned in BERT. A large number of studies that analyze deep neural network models and their ability to encode various linguistic and non-linguistic concepts provide an interpretation of the inner mechanics of these m…
- Discursive Socratic Questioning: Evaluating the Faithfulness of Language Models’ Understanding of Discourse Relations. While large language models have significantly enhanced the effectiveness of discourse relation classifications, it remains unclear whether their comprehension is faithful and reliable. We provide DIS…
- Do LLMs produce texts with "human-like" lexical diversity? The degree to which LLMs produce writing that is truly human-like remains unclear despite the extensive empirical attention that this question has received. The present study addresses this question f…
- Do Large Language Models Understand Conversational Implicature -- A Case Study with a Chinese Sitcom. Understanding the non-literal meaning of an utterance is critical for large language models (LLMs) to become human-like social communicators. In this work, we introduce SwordsmanImp, the first Chinese…
- Do large language models resemble humans in language use? …regularities in language range from phonology to pragmatics. For example, people associate different sounds with different referents (e.g., Köhler, 1929), automatically reinterpret implausible sentenc…
- Eliciting Reasoning in Language Models with Cognitive Tools. The recent advent of reasoning models like OpenAI’s o1 was met with excited speculation by the AI community about the mechanisms underlying these capabilities in closed models, followed by a rush of r…
- Empirical Study of Symmetrical Reasoning in Conversational Chatbots. This work explores the capability of conversational chatbots powered by large language models (LLMs) to understand and characterize predicate symmetry, a cognitive linguistic function traditionally b…
- Evaluating Emotional Nuances In Dialogue Summarization. Affective content has been the target of a few summarization tasks such as opinion summarization [Wang and Ling, 2016]. However, opinion is only a subset of affective expressions and such task mainly…
- Event-Aware Sentiment Factors from LLM-Augmented Financial Tweets: A Transparent Framework for Interpretable Quant Trading. In this study, we wish to showcase the unique utility of large language models (LLMs) in financial semantic annotation and alpha signal discovery. Leveraging a corpus of company-related tweets, we use…
- Explicit Inductive Inference using Large Language Models. However, McKenna et al. (2023a) have recently pointed out that LLMs are severely affected by an attestation bias when performing inference tasks. Given the question of whether premise P entails hypothe…
- Exploiting Dialogue Acts and Context to Identify Argumentative Relations in Online Debates. Argumentative Relation Classification is the task of determining the relationship between two contributions in the context of an argumentative dialogue. Existing models in the literature rely on a com…
- Exploring the Potential of ChatGPT on Sentence Level Relations: A Focus on Temporal, Causal, and Discourse Relations. This paper aims to quantitatively evaluate the performance of ChatGPT, an interactive large language model, on inter-sentential relations such as temporal relations, causal relations, and discourse re…
- Finding Common Ground: Using Large Language Models to Detect Agreement in Multi-Agent Decision Conferences. Decision conferences are structured, collaborative meetings that bring together experts from various fields to address complex issues and reach a consensus on recommendations for future actions or pol…
- Fine-tuning Pre-trained Language Models for Dialogical Argument Mining with Inference Anchoring Theory. In this paper, we present our framework for DialAM-2024 Task A: Identification of Propositional Relations and Task B: Identification of Illocutionary Relations. The goal of Task A is to detect argumen…
- From Five Dimensions to Many: Large Language Models as Precise and Interpretable Psychological Profilers. We prompted various LLMs with Big Five Personality Scale responses from 816 human individuals to role-play their responses on nine other psychological scales. LLMs demonstrated remarkable accuracy in …
- From Tokens to Thoughts: How LLMs and Humans Trade Compression for Meaning. Humans organize knowledge into compact categories through semantic compression by mapping diverse instances to abstract representations while preserving meaning (e.g., robin and blue jay are both bird…
- Grounding Gaps in Language Model Generations. However, it is unclear whether large language models (LLMs) generate text that reflects human grounding. To this end, we curate a set of grounding acts and propose corresponding metrics that quantify …
- Grounding ‘Grounding’ in NLP. In contrast, Cognitive Science more formally defines “grounding” as the process of establishing what mutual information is required for successful communication between two interlocutors – a definitio…
- How Projective is Projective Content? Gradience in Projectivity and At-issueness. Projective content is utterance content that a speaker may be taken to be committed to even when the expression associated with the content occurs embedded under an entailment-canceling operator (e.g.…
- Identification of Propositional and Illocutionary Relations. …annotate dialogue based on the inference anchoring theory (IAT). The task can be split into two parts, identification of propositional relations and identification of illocutionary relations. We propo…
- Inspecting and Editing Knowledge Representations in Language Models. [[Natural Language Inference]] Neural language models (LMs) represent facts about the world described by text. Sometimes these facts derive from training data (in most LMs, a representation of the …
- Interpretation modeling: Social grounding of sentences by reasoning over their implicit moral judgments. The social and implicit nature of human communication ramifies readers’ understandings of written sentences. Single gold-standard interpretations rarely exist, challenging conventional assumptions in …
- Is Sarcasm Detection A Step-by-Step Reasoning Process in Large Language Models? However, human sarcasm understanding is often considered an intuitive and holistic cognitive process, in which various linguistic, contextual, and emotional cues are integrated to form a comprehensive…
- LLMs Struggle to Reject False Presuppositions when Misinformation Stakes are High. These implicit assumptions, known as presuppositions, refer to background knowledge or shared beliefs assumed to be part of the common ground between interlocutors (Stalnaker, 1973). Presuppositions a…
- LLMs are Frequency Pattern Learners in Natural Language Inference. While fine-tuning LLMs on NLI corpora improves their inferential performance, the underlying mechanisms driving this improvement remain largely opaque. In this work, we conduct a series of experiments…
- Language Models’ Hall of Mirrors Problem: Why AI Alignment Requires Peircean Semiosis. This paper examines some limitations of large language models (LLMs) through the framework of Peircean semiotics. We argue that basic LLMs exist within a "hall of mirrors," manipulating symbols withou…
- Language models show human-like content effects on reasoning tasks. Abstract reasoning is a key ability for an intelligent system. Large language models (LMs) achieve above-chance performance on abstract reasoning tasks, but exhibit many imperfections. However, human …
- Large Linguistic Models: Investigating LLMs' metalinguistic abilities. The performance of large language models (LLMs) has recently improved to the point where models can perform well on many language tasks. We show here that—for the first time—the models can al…
- Large Models of What? Mistaking Engineering Achievements for Human Linguistic Agency. Languaging is not the kind of thing that can admit of a complete or comprehensive modelling. From an enactive perspective we identify three key characteristics of enacted language: embodiment, partici…
- Learning to Map Context-Dependent Sentences to Executable Formal Queries. We propose a context-dependent model to map utterances within an interaction to executable formal queries. To incorporate interaction history, the model maintains an interaction-level encoder that upd…
- Lil-Bevo: Explorations of Strategies for Training Language Models in More Humanlike Ways. Large Language Models (LLMs) generate complex and largely grammatical strings and display impressive performance with structures traditionally thought to require abstract and hierarchical syntax (Linz…
- Linguistic Alignment in Conversational AI: A Systematic Review of Cognitive-Linguistic Dimensions, Measurements, and User Outcomes (2020–2025). Conversational Artificial Intelligence systems frequently adapt to or mirror the user’s linguistic style, an emergent dynamic that shapes whether the AI is perceived as a tool, a partner, or a hybrid …
- Linguistic Blind Spots of Large Language Models. …questions linger regarding their ability to perform fine-grained linguistic annotation tasks, such as detecting nouns or verbs, or identifying more complex syntactic structures like clauses in input t…
- Linguistic markers of inherently false AI communication and intentionally false human communication: Evidence from hotel reviews. To the human eye, AI-generated outputs of large language models have increasingly become indistinguishable from human-generated outputs. Therefore, to determine the linguistic properties that separate…
- Lost in Inference: Rediscovering the Role of Natural Language Inference for Large Language Models. …we investigate if NLI tasks, which are rarely used for LLM evaluation, can still be informative for evaluating LLMs. Focusing on five different NLI benchmarks across six models of different scales, we …
- Man vs machine – Detecting deception in online reviews. This study focused on three main research objectives: analyzing the methods used to identify deceptive online consumer reviews, evaluating insights provided by multi-method automated approaches based …
- Meanings are like Onions: a Layered Approach to Metaphor Processing. Metaphorical meaning is not a flat mapping between concepts, but a complex cognitive phenomenon that integrates multiple levels of interpretation. In this paper, we propose a stratified mode…
- Metadiscursive nouns in academic argument: ChatGPT vs student practices. The ability of ChatGPT to create grammatically accurate and coherent texts has generated considerable anxiety among those concerned that students might use such large language models (LLMs) to write t…
- Minds versus Machines: Rethinking Entailment Verification with Language Models. Leveraging a comprehensively curated entailment verification benchmark, we evaluate both human and LLM performance across various reasoning categories. Our benchmark includes datasets from three categ…
- Modeling Interpersonal Linguistic Coordination in Conversations using Word Mover's Distance. Linguistic coordination is a well-established phenomenon in spoken conversations and often associated with positive social behaviors and outcomes. While there have been many attempts to measure lexica…
- Modeling the Quality of Dialogical Explanations. Expert explainers usually plan an explanation strategy by choosing appropriate explanation moves, dialogue acts, and topics to ensure optimal comprehension on the explainee side (Wachsmuth and Alshoma…
- Neural Conversation Models and How to Rein Them in: A Survey of Failures and Fixes. In this paper, we attempt to systematise the literature about the attested problems of neural conversation models (conditional language models realised with neural networks) used as chat-partner simu…
- Neutralizing Bias in LLM Reasoning using Entailment Graphs. However, recent works show that LLMs still suffer from hallucinations in NLI due to attestation bias, where LLMs overly rely on propositional memory to build shortcuts. To solve the issue, we design a…
- No that's not what I meant: Handling Third Position Repair in Conversational Question Answering. The ability to handle miscommunication is crucial to robust and faithful conversational AI. People usually deal with miscommunication immediately as they detect it, using highly systematic interaction…
- On the Binding Problem in Artificial Neural Networks. In this work, we argue that this underlying cause is the binding problem: The inability of existing neural networks to dynamically and flexibly bind information that is distributed throughout the netw…
- On the Conversational Basis of Some Presuppositions. The current literature on presupposition focuses almost exclusively on the projection problem: the question of how and why the presuppositions of atomic clauses are projected to complex sentences whic…
- On the Relationship between Sentence Analogy Identification and Sentence Structure Encoding in Large Language Models. The ability of Large Language Models (LLMs) to encode syntactic and semantic structures of language is well examined in NLP. Additionally, analogy identification, in the form of word analogies are ext…
- Opportunities for large language models and discourse in engineering design. In this paper, we argue that foundation models such as LLMs can be used for creative reasoning tasks in the engineering design process, complementing and integrating existing computational methods suc…
- Overview of DialAM-2024: Argument Mining in Natural Language Dialogues. Argumentation is the process by which humans rationally elaborate their thoughts and opinions in written (e.g., essays) or spoken (e.g., debates) contexts. Argument Mining research, however, has been …
- Position: LLMs Can't Plan, But Can Help Planning in LLM-Modulo Frameworks. Large Language Models (LLMs), essentially n-gram models on steroids which have been pre-trained on web-scale language corpora (or, effectively, our collective consciousness), have caught the imaginati…
- Post-training for Efficient Communication via Convention Formation. Humans communicate with increasing efficiency in multi-turn interactions, by adapting their language and forming ad-hoc conventions. In contrast, prior work shows that LLMs do not naturally show this …
- Pragmatic Implicature Processing in ChatGPT. Recent large language models (LLMs) and LLM-driven chatbots, such as ChatGPT, have sparked debate regarding whether these artificial systems can develop human-like linguistic capacities. We examined t…
- Presuppositions are more persuasive than assertions if addressees accommodate them: Experimental evidence for philosophical reasoning. Best practice and descriptive research claim that presuppositions, such as those triggered by “too”, increase the persuasiveness of arguments. Surprisingly, there is hardly any causal evidence for this claim. …
- Pretrained Language Models as Containers of the Discursive Knowledge. Discourses can be treated as instances of knowledge. The dynamic space in which the trajectories of these discourses are described can be regarded as a model of knowledge. Such a space is ca…
- Probing Structured Semantics Understanding and Generation of Language Models via Question Answering. As John McCarthy (McCarthy, 1990, 1959) points out, in order to better understand natural language, it is necessary for an intelligence system to understand the “deep structure” (Chomsky, 2011…
- Real-time News Story Identification. To improve the reading experience, many news sites organize news into topical collections, called stories. In this work, we present an approach for implementing real-time story identification for a ne…
- Rhetorical XAI: Explaining AI’s Benefits as well as its Use via Rhetorical Design. Modern AI systems are notoriously opaque, limiting efforts to understand or audit their behaviors [42, 188]. In response, Explainable Artificial Intelligence (XAI) aims to foster trust and accountabil…
- SciTopic: Enhancing Topic Discovery in Scientific Literature through Advanced LLM. Topic discovery in scientific literature provides valuable insights for researchers to identify emerging trends and explore new avenues for investigation, facilitating easier scientific infor…
- Semantic Change Characterization with LLMs using Rhetorics. Languages continually evolve in response to societal events, resulting in new terms and shifts in meanings. These changes have significant implications for computer applications, including automatic t…
- Semantic Parsing for Task Oriented Dialog using Hierarchical Representations. Previous work on task oriented intent and slot-filling has been restricted to one intent per query and one slot label per token, and thus cannot model complex compositional requests. Alternative …
- Semantic Structure in Large Language Model Embeddings. Psychological research consistently finds that human ratings of words across diverse semantic scales can be reduced to a low-dimensional form with relatively little information loss. We find that the …
- Sequence Organization in Interaction: A Primer in Conversation Analysis. By “generic orders of organization,” I mean the various organizations of practice that deal with the various generic organizational contingencies of talk-in-interaction without which it cannot proceed…
- Simple Linguistic Inferences of Large Language Models (LLMs): Blind Spots and Blinds. We evaluate LLMs’ language understanding capacities on simple inference tasks that most humans find trivial. Specifically, we target (i) grammatically-specified entailments, (ii) premises with evident…
- Simulacra as conscious exotica. The advent of conversational agents with increasingly human-like behaviour throws old philosophical questions into new light. Does it, or could it, ever make sense to speak of AI agents built out of g…
- Sources of Hallucination by Large Language Models on Inference Tasks. We establish two biases originating from pretraining which predict much of their behavior, and show that these are major sources of hallucination in generative LLMs. First, memorization at the level o…
- Talk Less, Interact Better: Evaluating In-context Conversational Adaptation in Multimodal LLMs. Humans spontaneously use increasingly efficient language as interactions progress, by adapting and forming ad-hoc conventions. This phenomenon has been studied extensively using reference games, showi…
- Task-Oriented Dialogue with In-Context Learning. We describe a system for building task oriented dialogue systems combining the in-context learning abilities of large language models (LLMs) with the deterministic execution of business logic. LLMs ar…
- TaskLAMA: Probing the Complex Task Understanding of Language Models. Structured Complex Task Decomposition (SCTD) is the problem of breaking down a complex real-world task (such as planning a wedding) into a directed acyclic graph over individual steps that contribute…
- Teaching Probabilistic Logical Reasoning to Transformers. We propose a novel end-to-end fine-tuning approach, Probabilistic Constraint Training (PCT), that utilizes probabilistic logical rules as constraints in the fine-tuning phase without relying on these …
- The Argument Reasoning Comprehension Task: Identification and Reconstruction of Implicit Warrants. Reasoning is a crucial part of natural language argumentation. To comprehend an argument, one must analyze its warrant, which explains why its claim follows from its premises. As arguments are highly …
- The Demon is in Ambiguity: Revisiting Situation Recognition with Single Positive Multi-Label Learning. Situation recognition (SR) is a fundamental task in computer vision that aims to extract structured semantic summaries from images by identifying key events and their associated entities. Specifi…
- The Goldilocks of Pragmatic Understanding: Fine-Tuning Strategy Matters for Implicature Resolution by LLMs. Despite widespread use of LLMs as conversational agents, evaluations of performance fail to capture a crucial aspect of communication: interpreting language in context—incorporating its pragmatics. Hu…
- The Hermeneutics of Artificial Text. The paper justifies the necessity of using the research background of hermeneutics to study artificial texts and also proposes the first conclusions about these texts in the context of this background…
- The Levers of Political Persuasion with Conversational AI. There are widespread fears that conversational AI could soon exert unprecedented influence over human beliefs. Here, in three large-scale experiments (N=76,977), we deployed 19 LLMs—including some pos…
- The Model Says Walk: How Surface Heuristics Override Implicit Constraints in LLM Reasoning. Large language models systematically fail when a salient surface cue conflicts with an unstated feasibility constraint. We study this through a diagnose–measure–bridge–treat framework. Causal-behavior…
- The Vector Grounding ProblemConfusingly, the notion of grounding is also used in relation to another aspect of language, which has to do with communication (Clark & Brennan 1991, Traum 1994). In this context, the 'grounding prob…
- The social component of the projection behavior of clausal complement contentsSome accounts of presupposition projection predict that a content's consistency with the Common Ground influences whether it projects (e.g., Heim 1983; Gazdar 1979a,b). I conducted an experime…
- Theory of Knowledge Based on the Idea of the Discursive SpaceThis paper discusses the theory of knowledge based on the idea of dynamical space. The goal of this effort is to comprehend the knowledge that remains beyond the human domain, e.g., of the artificial …
- Toward Conversational Agents with Context and Time Sensitive Long-term MemoryThere has recently been growing interest in conversational agents with long-term memory which has led to the rapid development of language models that use retrieval-augmented generation (RAG). Until r…
- Transformer-based cynical expression detection in a corpus of Spanish YouTube reviewsConsumers of services and products actively engage through social networks when they are dissatisfied, exhibiting a wide range of behaviors (Encinas and Cavazos 2021). Encinas presents a classificati…
- Truth or lie: Exploring the language of deceptionWe investigated whether linguistic features that differentiate true and false utterances in English—namely utterance length, concreteness, and particular parts-of-speech—are also present in the Polish…
- Turning large language models into cognitive modelsWe ask whether large language models can be turned into cognitive models. We find that – after finetuning them on data from psychological experiments – these models offer accurate representations of huma…
- Uncovering Latent Arguments in Social Media Messaging by Employing LLMs-in-the-Loop StrategyThe widespread use of social media has led to a surge in popularity for automated methods of analyzing public opinion. Supervised methods are adept at text categorization, yet the dynamic nature of so…
- Using Natural Language for Reward Shaping in Reinforcement LearningUsing arbitrary natural language statements within reinforcement learning presents several challenges. First, a mapping between language and objects/actions must implicitly or explicitly be learned, a…
- We’re Afraid Language Models Aren’t Modeling AmbiguityAmbiguity is an intrinsic feature of natural language. Managing ambiguity is a key part of human language understanding, allowing us to anticipate misunderstanding as communicators and revise our inte…
- What are the Goals of Distributional Semantics?As Harnad (1990) discusses, if the meanings of words are defined only in terms of other words, these definitions are circular. One goal for a semantic model is to capture how language relates to the w…
- What does it mean to understand language?Language understanding entails not just extracting the surface-level meaning of the linguistic input, but constructing rich mental models of the situation it describes. Here we propose that because pr…
- What we talk to when we talk to language modelsDavid Chalmers [[Linguistics, NLP, NLU]] [[Role Play]] [[Philosophy Subjectivity]] Quasi-interpretivism does not say anything about whether LLMs have beliefs and desires. But it does make it plausib…
- Word Meanings in Transformer Language ModelsWe investigate how word meanings are represented in transformer language models. Specifically, we focus on whether transformer models employ something analogous to a lexical store - where each wor…
- “Understanding AI”: Semantic Grounding in Large Language ModelsThis motivates another method: looking under the hood of systems and exploring their internal mechanisms and functions. But in the case of deep learning neural networks, the notorious black box proble…