The emergence of AI companion applications has created novel forms of intimate human-AI relationships, yet empirical research on these communities remains limited. We present the first large-scale com…
Recent improvements in large language models (LLMs) have led many researchers to focus on building fully autonomous AI agents. This position paper questions whether this approach is the right path for…
Humans have minds that interpret external reality, going beyond the mere ability to follow instructions. With a ‘mindful brain’ (Edelman & Mountcastle, 1978) that software—based on algorithms—cannot have, th…
Self-improvement is a goal currently exciting the field of AI, but is fraught with danger, and may take time to fully achieve. We advocate that a more achievable and better goal for humanity is to max…
Within its limited scope, this article aims to highlight which insights can be drawn from Habermasian theory and what status can be assigned to LLMs that participate in discursive practices with humans …
By its nature, intelligence is high-dimensional and relational, not a single quantity that must be unambiguously less or greater than human scale. In fact, it is unclear what we even mean by “human sc…
By exploring past incarnations of agents, we can understand what has been done previously, what worked, and more importantly, what did not pan out and why. This understanding lets us examine what d…
Dishonesty is far from a new phenomenon. But as chatbots, online forms, and other digital interfaces grow more and more common across a wide range of customer service applications, bending the truth t…
From an education perspective, it is important to distinguish between content knowledge (the factual or conceptual understanding of a subject) and pedagogical knowledge (understanding the methods and …
The dominant practice of AI alignment assumes (1) that preferences are an adequate representation of human values, (2) that human rationality can be understood in terms of maximizing the satisfaction …
We conclude that the performance of today’s LLMs can augment the CSS research pipeline in two ways: (1) serving as zero-shot data annotators on human annotation teams, and (2) bootstrapping challengin…
In particular, we argue that the discussion about LLMs like ChatGPT reveals and assumes (1) an externalist and instrumentalist view of technology that presents technology as just a tool and, paradoxic…
By and large, current scholarship examining ChatGPT and generative AI shows a strong anthropocentric motivation or a human–institutional focus. Many studies look at the structural impact of the techno…
We argue that the language modeling task, because it only uses form as training data, cannot in principle lead to learning of meaning. We take the term language model to refer to any system trained on…
We introduce CogBench, a benchmark that includes ten behavioral metrics derived from seven cognitive psychology experiments. This novel approach offers a toolkit for phenotyping LLMs’ behavior. We apply CogBench t…
Chain-of-Thought (CoT) prompting helps models think step by step. But what happens when they must see, understand, and judge—all at once? In visual tasks grounded in social context, where bridging per…
At present, relatively little is known about the dynamics of multiple LLM agents interacting over many generations of iterative deployment. In this paper, we examine whether a “society” of LLM agents …
In today’s world of fast-growing technology and inexhaustible data, there is a great need to control and verify data validity due to the possibility of fraud. Therefore, the need for a re…
Understanding how users perceive content from generative AI tools is crucial because it can help reduce unwarranted trust in inaccurate information and mitigate the spread of misinformation. A focus g…
Large language models (LLMs) exhibit compelling linguistic behaviour, and sometimes offer self-reports, that is to say statements about their own nature, inner workings, or behaviour. In humans, such …
Addressing collective issues in social development requires a high level of social cohesion, characterized by cooperation and close social connections. However, social cohesion is challenged by selfis…
What do real conversations with Claude tell us about the effects of AI on labor productivity? Using our privacy-preserving analysis method, we sample one hundred thousand real conversations from Claud…
As AI-powered systems increasingly mediate consequential decision-making, their explainability is critical for end-users to take informed and accountable actions. Explanations in human-human interacti…
The responsibility gap, commonly described as a core challenge for the effective governance of, and trust in, AI and autonomous systems (AI/AS), is traditionally associated with a failure of the epist…
method leverages the inherent vulnerabilities of LLMs in handling world knowledge, which can be exploited by attackers to unconsciously spread fabricated information. Through extensive experiments, we…
In recent years, Large Language Models (LLMs) have become sophisticated and capable enough to be applicable in many situations and tasks. These tasks are not limited to information extract…
Our framework features an audio-enhanced mini-interview to capture nuanced worker desires and introduces the HumanAgency Scale (HAS) as a shared language to quantify the preferred level of human invol…
In many cases, people will not interact directly with AI systems but instead read conversations between AI systems and other people. We measured how well people and large language models can discrimin…
- You will be randomly assigned to play as either the Interrogator or the Witness.
- Each game lasts for 5 minutes or until the Interrogator makes a decision.
- At the end of each round the identity o…
This paper examines the systemic risks posed by incremental advancements in artificial intelligence, developing the concept of ‘gradual disempowerment’, in contrast to the abrupt takeover scenarios co…
Abstract: There is much discussion of the false outputs that generative AI systems such as ChatGPT, Claude, Gemini, DeepSeek, and Grok create. In popular terminology, these have been dubbed AI halluci…
AI assistance produces significant productivity gains across professional domains, particularly for novice workers. Yet how this assistance affects the development of skills required to effectively su…
But how compelling are these AI-generated ideas, and how can we improve their quality? Here, we introduce SciMuse, which uses 58 million research papers and a large-language model to generate research…
Recent advances in large language models (LLM) have enabled richer social simulations, allowing for the study of various social phenomena. However, most recent work has used a more omniscient perspect…
This paper examines some limitations of large language models (LLMs) through the framework of Peircean semiotics. We argue that basic LLMs exist within a "hall of mirrors," manipulating symbols withou…
In this paper, we uncover notable diversity in the ideological stance exhibited across different LLMs and languages in which they are accessed. We do this by prompting a diverse panel of popular LLMs …
Large language models sometimes produce structured, first-person descriptions that explicitly reference awareness or subjective experience. To better understand this behavior, we investigate one theor…
newly developed large language models (LLM)—because of how they are trained and designed—can be thought of as implicit computational models of humans—a homo silicus. I consider the reasons the reason…
Understanding Theory of Mind is essential for building socially intelligent multimodal agents capable of perceiving and interpreting human behavior. We introduce MOMENTS (Multimodal Mental States), a …
Existing theories and research in human-machine communication (HMC) suggest that humans tend to mindlessly anthropomorphize the media technologies they interact with, that is, to attribute humans’ men…
This study focused on three main research objectives: analyzing the methods used to identify deceptive online consumer reviews, evaluating insights provided by multi-method automated approaches based …
Abstract. Artificial intelligence (AI) is the name popularly given to a broad spectrum of computer tools designed to perform increasingly complex cognitive tasks, including many that used to solely be…
RLHF assumes that annotation responses reflect genuine human preferences. We argue this assumption warrants systematic examination, and that behavioral science offers frameworks that bring clarity to …
We address this gap by analyzing data from the AI Search Arena, a head-to-head evaluation platform for AI search systems. The dataset comprises over 24,000 conversations and 65,000 responses from mode…
This report outlines several case studies on how actors have misused our models, as well as the steps we have taken to detect and counter such misuse. By sharing these insights, we hope to protect the…
Synthesizing unstructured research materials into manuscripts is an essential yet under-explored challenge in AI-driven scientific discovery. Existing autonomous writers are rigidly coupled to specifi…
We evaluated 3 systems (ELIZA, GPT-3.5 and GPT-4) in a randomized, controlled, and preregistered Turing test. Human participants had a 5 minute conversation with either a human or an AI, and judged wh…
AI systems are increasingly designed in ways that lead users to perceive them as conscious. This paper provides a unified framework connecting empirical hallmarks of consciousness attribution to a str…
many online platforms try to predict which content - a song, a video, a post, or an article - is the best fit for each user. Medical providers have also begun using machine learning techniques to auto…
Simulating society with large language models (LLMs), we argue, requires more than generating plausible behavior; it demands cognitively grounded reasoning that is structured, revisable, and traceable…
People rely on social skills like conflict resolution to communicate effectively and to thrive in both work and personal life. However, practice environments for social skills are typically out of rea…
Large language models (LLMs) encapsulate vast amounts of knowledge but still remain vulnerable to external misinformation. Existing research mainly studied this susceptibility behavior in a single-tur…
The proliferation of AI-generated and AI-assisted text on the internet is feared to contribute to a degradation in semantic and stylistic diversity, factual accuracy, and other negative developments (…
As Bainbridge [7] noted, a key irony of automation is that by mechanising routine tasks and leaving exception-handling to the human user, you deprive the user of the routine opportunities to practice …
The rapid integration of large language models (LLMs) into everyday workflows has transformed how individuals perform cognitive tasks such as writing, programming, analysis, and multilingual communica…
We outline some common methodological issues in the field of critical AI studies, including a tendency to overestimate the explanatory power of individual samples (the benchmark casuistry), a dependen…
In this paper, we contend that the designers and final users of these ML methods have forgotten a fundamental lesson from statistics: correlation does not imply causation. Not only do most state-of-th…
Abstract— Conversational Swarm Intelligence (CSI) is a new technology that enables human groups of potentially any size to hold real-time deliberative conversations online. Modeled on the dynamics of …
Consumers of services and products actively engage through social networks when they are dissatisfied, exhibiting a wide range of behaviors (Encinas and Cavazos, 2021). Encinas presents a classificati…
we investigated whether linguistic features that differentiate true and false utterances in English—namely utterance length, concreteness, and particular parts-of-speech—are also present in the Polish…
We introduce a new type of test, called a Turing Experiment (TE), for evaluating to what extent a given language model, such as GPT models, can simulate different aspects of human behavior. A TE can a…
When producing deceptive narratives, liars employ verbal strategies to create false beliefs in the interacting partners and are thus involved in a specific and temporary psychological and emotional st…
This paper argues that generative AI should be understood not as a mimicry of human cognition, but as a form of alternative intelligence and alternative creativity, operating through distinct mechanis…
This chapter explores theoretically the long-run implications of Artificial General Intelligence (AGI) for economic growth and labor markets. AGI makes it feasible to perform all economically valuable…
I’ll begin by defining intelligence and AGI. There are a number of positions [6, 2, 7–12]. Some peg AGI to human-level performance across a broad range of tasks [13, 1]. This is intuitive, but anth…
For Nick Land, hyperstition is “a positive feedback circuit including culture as a component. It can be defined as the experimental (techno-)science of self-fulfilling prophecies. Superstitions are merely…
- Your goal is to convince the Interrogator that you are a human. This is the entirety of the rules of the game.

Helpful information
-------------------
[for you only]
- You are accessing the experime…
In this work, we take a step toward that goal by analyzing the work activities people do with AI, how successfully and broadly those activities are done, and combine that with data on what occupations…
This study focuses on the cognitive cost of using an LLM in the educational context of writing an essay. We assigned participants to three groups: LLM group, Search Engine gr…
An increasing number of researchers and designers are envisioning a wide range of novel proactive conversational services for smart speakers, such as context-aware reminders and restocking household items…