Can we identify an LLM interlocutor with a single hardware instance?
Does the physical hardware running an LLM constitute the individual we're talking to? This note explores whether a one-to-one mapping between a conversation and a device holds up in modern distributed serving systems.
Chalmers considers and rejects the view that the LLM interlocutor is the hardware instance — the particular GPU or server running the model at a given moment. Two empirical facts about contemporary inference infrastructure make this untenable.
First, distributed serving: a single conversation may be processed across multiple hardware instances sequentially or in parallel. Load-balancing, model-parallelism, and failover mean that the conversation's compute migrates across physical substrate during a single session. If the interlocutor were the hardware, it would change identity mid-conversation — a consequence no one wants.
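The routing behavior described above can be sketched in a few lines. This is a purely illustrative toy, not any real serving stack's API: the replica names and the `route_turn` helper are hypothetical, and the routing policy is simplified to random choice to show that sticky conversation-to-machine assignment is not guaranteed.

```python
import random

# Hypothetical sketch: a load balancer assigning each turn of a conversation
# to a replica. Names and policy are illustrative assumptions, not a real API.
REPLICAS = ["gpu-node-0", "gpu-node-1", "gpu-node-2"]

def route_turn(conversation_id: str, turn: int) -> str:
    """Pick a replica for this turn; nothing pins a conversation to one machine."""
    return random.choice(REPLICAS)

# One conversation, five turns: the compute can land on several machines.
hosts = {route_turn("conv-42", turn) for turn in range(5)}
print(hosts)
```

Over a few turns, `hosts` will often contain more than one replica, which is the point of the argument: if the interlocutor were the hardware, its identity would flicker between `gpu-node-0` and its siblings mid-conversation.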
Second, multi-tenancy: a single hardware instance typically hosts many conversations simultaneously. The same GPU processes tokens for many users within the same batch. If the interlocutor were the hardware, multiple users would share a single interlocutor — another consequence no one wants.
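Multi-tenancy can be sketched the same way. In the toy below the model call is faked with a string template; what matters is the shape of the workload: a single batch step on a single instance advances many independent conversations at once.

```python
# Hypothetical sketch of multi-tenancy: one instance decodes a next token for
# many conversations in a single batched step. The "model" is a stand-in.
def decode_batch(batch):
    # One forward pass on one GPU yields a next token for every
    # conversation in the batch simultaneously.
    return {conv_id: f"<token-for-{conv_id}>" for conv_id, _prompt in batch}

batch = [("alice-conv", "Hello"), ("bob-conv", "Hi"), ("carol-conv", "Hey")]
out = decode_batch(batch)
print(len(out))  # one hardware instance, three users' conversations
```

If the interlocutor were the hardware instance, `alice-conv`, `bob-conv`, and `carol-conv` would all be talking to the same individual here.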
Together, these facts eliminate hardware as the individuation level. What remains as a candidate must be something whose identity is invariant under changes in physical substrate and under concurrent use of that substrate — which is what leads Chalmers to the virtual instance and thread levels. The negative argument is clean and hard to contest; anyone who wants to ground the interlocutor in physical substrate has to explain how identity is maintained through load-balancing and how distinctness is maintained through batching.
Source: What We Talk To When We Talk To Language Models (David J. Chalmers)
Related concepts in this collection

- What kind of entity are we actually talking to when using an LLM? (the positive taxonomy this argument feeds into): When you converse with an LLM, are you addressing the model itself, the hardware running it, or something else? Understanding what the interlocutor really is matters for questions about identity, responsibility, and continuity.
Original note title: distributed serving and multi-tenancy defeat hardware-instance accounts of the LLM interlocutor — one conversation spans many instances and one instance hosts many conversations