Through the Lens of Human-Human Collaboration: A Configurable Research Platform for Exploring Human-Agent Collaboration

Paper · arXiv 2509.18008 · Published September 22, 2025

Research on LLM agents [68, 69, 82], LLM-based systems capable of exhibiting complex, human-like behaviors to solve tasks, shows that these agents can produce distinct, believable cognitive and social behaviors, particularly when role-playing provided persona profiles. These capabilities promise new paradigms of human-agent collaboration, in which humans treat LLM agents as mutual partners rather than mere tools.

To realize this promise, we need interfaces [5, 59] and interaction paradigms [41, 95] that help LLM agents communicate intent [23, 58], negotiate decisions [10, 87], manage shared context [25, 38, 51], and signal responsibility [30, 44, 59] in ways humans can readily perceive and act upon.

To address this gap, we present an open, configurable research platform for conducting reproducible, controlled experiments on human-LLM-agent collaboration. Our platform design draws inspiration from the Shape Factory experiment [11], a classic CSCW experiment for analyzing group dynamics between collocated and remote collaborators. The Shape Factory experiment supports rigorous manipulation of communication modality, awareness, social framing, and resource interdependence. Our platform enables HCI researchers to adapt classic experimental paradigms for human-agent collaboration research, such as strategic collaboration tasks (e.g., DayTrader [10]), collaborative decision-making tasks (e.g., Essay Ranking [110]), and collaborative task-solving (e.g., Passcode [31]). The platform architecture features four key components, as shown in Fig. 1:

(1) a researcher interface for manipulating parameters and interaction controls; (2) a modularized, customizable participant interface that reflects those manipulations; (3) a standardized agent context protocol ensuring consistent agent integration; and (4) an experiment controller serving as the execution engine and logging system.
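To make the division of labor among these components concrete, the sketch below shows how a researcher-defined experiment configuration might flow into a controller that logs participant and agent actions. This is a minimal illustration; all class, field, and method names (`ExperimentConfig`, `EventLog`, `record`, and the factor values) are hypothetical and do not reflect the platform's actual API.

```python
from dataclasses import dataclass, field
from enum import Enum

# Hypothetical condition parameters a researcher interface might expose.
class Modality(Enum):
    TEXT_CHAT = "text_chat"
    STRUCTURED_FORMS = "structured_forms"

@dataclass
class ExperimentConfig:
    task: str                    # e.g., "shape_factory"
    team_size: int               # humans + agents combined
    modality: Modality           # communication modality condition
    awareness_dashboard: bool    # whether teammates' state is visible
    social_framing: str          # e.g., "partner" vs. "tool"

@dataclass
class EventLog:
    """Stand-in for the experiment controller's logging system."""
    events: list = field(default_factory=list)

    def record(self, actor: str, action: str, payload: dict) -> None:
        # Append one record per participant or agent action.
        self.events.append({"actor": actor, "action": action, **payload})

# A single condition: one human plus five LLM agents on the Shape Factory task.
config = ExperimentConfig(
    task="shape_factory",
    team_size=6,
    modality=Modality.TEXT_CHAT,
    awareness_dashboard=True,
    social_framing="partner",
)
log = EventLog()
log.record("agent_3", "propose_trade", {"offer": "triangle", "want": "square"})
```

The point of the separation is that the participant interface only renders what the config dictates, while the controller owns execution order and the behavioral log used in later analysis.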

Critically, the platform allows systematic manipulation of theory-grounded interaction controls, such as communication modality [10, 86], awareness dashboards [29, 38], and social framing [11, 42], to investigate their impact on collaborative dynamics. Researchers can manipulate these controls independently and further customize their functionality to meet more nuanced needs using the platform's open-source code.
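Independent manipulation of these controls amounts to a factorial crossing of condition levels. The sketch below illustrates the idea with hypothetical factor names and values (the levels shown are not taken from the paper):

```python
from itertools import product

# Hypothetical levels for three theory-grounded interaction controls.
factors = {
    "modality": ["text_chat", "structured_forms"],
    "awareness_dashboard": [True, False],
    "social_framing": ["partner", "tool"],
}

# Full factorial crossing: every combination is a distinct experimental
# condition, so each control varies independently of the others.
conditions = [
    dict(zip(factors, levels)) for levels in product(*factors.values())
]
# 2 levels x 2 levels x 2 levels -> 8 conditions
```

Each resulting condition dictionary can then drive one configuration of the participant interface, which is what makes controlled comparisons across conditions possible.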

We demonstrate our platform's utility through a two-part evaluation. First, we validate the usability and effectiveness of the platform for controlled experiments through two case studies that re-implemented the Shape Factory human-human collaboration experiment (a six-person team) [11] as a human-agent collaboration experiment (a team of one human and five LLM agents). A total of 16 participants took part in a crossed, between-subjects design with counterbalancing across both case studies, each of which manipulated one key interaction control. The goal was to confirm that our platform could faithfully re-implement a classic experimental paradigm and produce measurable differences in participant behaviors and outcomes. Analysis of behavioral logs and post-study surveys confirmed the platform's ability to capture significant differences across conditions; for instance, varying communication modality led to distinct patterns in negotiation frequency and team performance, as well as substantial differences in perceived trust and workspace awareness, aligning with established findings in collaborative work. Second, we conducted a participatory cognitive walkthrough with five HCI researchers studying human-AI collaboration to evaluate the researcher interface for experiment setup and data analysis. Their feedback informed iterative improvements to the workflow and interface design. Collectively, these evaluations establish our platform as a methodological foundation for the HCI community to build a systematic, evidence-based understanding of human-agent collaboration.