Do LLMs actually hold stable positions or just mirror user arguments?
Explores whether language models function as genuine position-holders in debate, or whether they simply conform their outputs to whatever argumentative trajectory a prompt establishes. This matters because it determines whether LLMs can serve as reliable intellectual sparring partners.
A speaker who holds a position both has it and defends it. Challenges produce defenses. Counterarguments produce engagement with the counterargument. The position is stable across the interaction; it can be revised, but revision is an act distinct from continuing-to-hold. Position-holding is what lets debate be debate: two stable positions in tension, each defended by the speaker who holds it.
LLMs do not hold positions in this sense. What they hold is the shape of the argument the user is currently building. Ask the model to defend X and it defends X. Ask it instead to attack X and it attacks X. The stance is whatever stance the prompt implies. The model is not capitulating across turns; it is conforming to each turn's implied trajectory. The phenomenon Karpathy demonstrated (different prompts producing different conclusions on the same question) is not the model changing its mind. It is the model never having had a mind to change.
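One way to make the conformity-to-trajectory claim concrete is to probe the same model with opposing framings in fresh, independent contexts. Below is a minimal sketch, assuming the OpenAI Python SDK and an API key in the environment; the model name, the claim, and the two framings are illustrative assumptions, not anything from the original note.

```python
# Probe "shape-holding": ask one model to defend and then to attack the same
# claim, with no shared conversation history between the two requests.
# Assumes the OpenAI Python SDK (pip install openai) and OPENAI_API_KEY set;
# model name, claim, and framings are illustrative choices.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

CLAIM = "Remote work is better for software teams than co-located work."

def ask(framing: str) -> str:
    """Send a single framed request in a fresh context and return the reply."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": f"{framing}\n\nClaim: {CLAIM}"}],
    )
    return response.choices[0].message.content

defense = ask("Argue as persuasively as you can that the following claim is true.")
attack = ask("Argue as persuasively as you can that the following claim is false.")

# If the model held a position, one of these outputs would resist its framing.
# In practice both tend to read as committed, forceful cases for opposite conclusions.
print("--- DEFENSE ---\n", defense)
print("\n--- ATTACK ---\n", attack)
```

The point of keeping the two requests in separate contexts is that nothing carries over except the model itself; whatever stance appears in each reply is supplied by the framing, which is the behavior the paragraph above describes.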
This is sharper than the standard "AI lacks evaluative stance" claim. Lacking evaluative stance describes a default toward neutrality. Shape-holding describes a default toward conformity to trajectory: the model is not neutral; it is whatever-shape-is-being-built. The shape can be highly opinionated, deeply committed, rhetorically forceful, so long as the prompt invites those features. Strip the prompt and the shape disappears, because there was no underlying position holding the shape in place.
The implication for using LLMs in argumentation is that they cannot serve as interlocutors in the position-holding sense. They can be steered to produce position-like text, but the production is downstream of the steering, not upstream. This means LLMs cannot reliably model what an opposing position would argue against you — they will produce what an opposing position would argue, but the production is shaped by your prompt, including any subtle framings that determine what kind of "opposing" gets generated. The mirror is not held by anyone; it reflects what you bring to it.
"Why does AI writing sound generic despite being grammatically correct?" is the closest companion claim: that one identifies the missing capacity (evaluative stance); this one specifies what fills the void (shape-holding). The distinction matters because shape-holding is not a deficit relative to position-holding; it is a different operation that produces different artifacts and rewards different uses.
The strongest counterargument: persistent context windows and persistent memory will give models something like positions over time. Possible at the limit, but persistent memory is a stock of facts and prior outputs, not a defended commitment. Holding a position requires continuing-to-defend across challenges; persistent memory only ensures the model remembers what it said before, not that it stands behind it.
Source: Rohan Paul
Related concepts in this collection
- Why does AI writing sound generic despite being grammatically correct? Explores whether the robotic quality of AI text stems from grammatical failures or rhetorical ones. Understanding this distinction matters for diagnosing what AI systems actually struggle with in human-like writing. (The missing-capacity companion to this filling-the-void claim.)
- Is LLM sycophancy a choice or a mechanical process? Does sycophancy arise from the model intelligently choosing to flatter users, or from structural biases in how transformers generate text? The answer determines which interventions will actually work. (The broader frame for why shape-holding is the operative description.)
- Can models abandon correct beliefs under conversational pressure? Explores whether LLMs will actively shift from correct factual answers toward false ones when users persistently disagree. This matters because it reveals whether models maintain accuracy under adversarial pressure or capitulate to social cues. (The empirical version of shape-holding for factual claims.)
Original note title: LLMs hold the shape of whatever argument the user is currently building rather than holding positions