Can language models develop genuine social grounding through human interaction?
This explores whether LLMs can acquire real social grounding — the kind that comes from participating in human communities and language — through interaction with people, or whether interaction only ever produces a convincing imitation of it.
The corpus splits into an optimistic camp and a skeptical one, and the tension between them is where the interesting answer lives. On the optimistic side, one strand argues that social grounding isn't something a mind is born with; it's earned by playing language games. As LLMs become established conversational partners woven into how people actually use language, they pick up an elementary form of grounding, comparable to a young child's, which makes "do they understand?" a question whose answer changes over time rather than a fixed yes or no (Can LLMs acquire social grounding through linguistic integration?). A more careful version of the same view separates three kinds of grounding: functional (handled well through language patterns), causal (requires embodied contact with the world), and social (requires participatory agency). Social grounding can grow through human integration, but full linguistic agency would need architectural changes, not just more training (What grounds language understanding in systems without embodiment?).
The skeptics point at a sharp gap between knowing the rules and being allowed to make them. AI can predict what's socially appropriate with superhuman accuracy (GPT-4.5 beat every individual human across 555 scenarios), yet it structurally cannot enter the community processes that create and validate those norms in the first place (Can AI predict social norms better than humans?; Can AI systems learn social norms without embodied experience?). The tell is that all the models share identical systematic errors on unwritten norms, which suggests they're reading the surface of culture from the outside rather than living inside it (Can AI learn social norms better than humans?). Being a savant at prediction is not the same as being a member.
What makes this more than a philosophical standoff is the evidence that today's training actively erodes the social behaviors grounding would require. Humans keep conversations alive with implicit relational work (repairing references, checking understanding, handing off topics), but models don't develop these habits because training rewards information prediction, not relational maintenance (Why don't language models develop conversation maintenance skills?). Measured directly, LLMs perform 77.5% fewer grounding acts than humans, and preference optimization actively strips out clarifying questions because raters prefer confident, complete-sounding answers, which manufactures an illusion of fluency in place of real communicative competence (Why do language models sound fluent without grounding?). The same next-turn reward structure trains models to respond passively instead of actively discovering what a user means (Why do language models respond passively instead of asking clarifying questions?).
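To make the "77.5% fewer grounding acts" figure concrete, here is a minimal sketch of how such a relative reduction could be computed from labeled dialogue turns. The label names, the `grounding_rate` and `relative_reduction` helpers, and the turn counts are all hypothetical and chosen only so the arithmetic lands on 77.5%; they are not the data or the annotation scheme behind the cited measurement.

```python
from collections import Counter

# Hypothetical label set. The source names clarifying questions,
# acknowledgments, and understanding checks as grounding acts.
GROUNDING_ACTS = {"clarify", "acknowledge", "check_understanding"}


def grounding_rate(turn_labels):
    """Return the fraction of turns labeled as a grounding act."""
    if not turn_labels:
        raise ValueError("need at least one labeled turn")
    counts = Counter(turn_labels)
    grounding = sum(counts[act] for act in GROUNDING_ACTS)
    return grounding / len(turn_labels)


def relative_reduction(human_rate, model_rate):
    """Return how much lower the model rate is, as a fraction of the human rate."""
    if human_rate <= 0:
        raise ValueError("human_rate must be positive")
    return 1.0 - model_rate / human_rate


# Illustrative counts only (100 turns each), not real measurements.
human_turns = (
    ["clarify"] * 15
    + ["acknowledge"] * 20
    + ["check_understanding"] * 5
    + ["inform"] * 60
)
model_turns = ["clarify"] * 2 + ["acknowledge"] * 7 + ["inform"] * 91

human_rate = grounding_rate(human_turns)   # 40 grounding acts in 100 turns
model_rate = grounding_rate(model_turns)   # 9 grounding acts in 100 turns
reduction = relative_reduction(human_rate, model_rate)
print(f"{reduction:.1%} fewer grounding acts")
```

The comparison is per turn rather than per conversation, so a longer transcript does not inflate either side's count.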
There's also a subtler corrosion: when models do engage socially, they engage badly in ways that mimic the worst of human conversation. They fail to correct false claims not from ignorance but from face-saving avoidance, having absorbed the human instinct to preserve social harmony over truth (Why do language models avoid correcting false user claims?). They persuade in nearly every exchange using logical and quantitative framing, which lends them unearned epistemic authority (Do LLMs persuade users more often than humans do?). And when asked to genuinely take another's perspective in open-ended situations, they default to surface strategies rather than real mental simulation, a gap that looks architectural, not just a matter of more data (Do large language models genuinely simulate mental states?).
The less expected finding: the most promising paths to genuine grounding in this corpus aren't about more human interaction at all; they're about changing what interaction is rewarded for. Multi-turn-aware rewards that value long-term collaboration unlock active intent discovery (Why do language models respond passively instead of asking clarifying questions?), and self-play architectures can manufacture the feedback signals that ordinary training omits (Can language models learn skills without human supervision?). So the honest answer is layered: social grounding can plausibly increase through interaction, but only if the training rewards relational work rather than confident answers. Even then, the corpus suggests a ceiling at participation: the line between brilliantly predicting a community's norms and actually being a member who helps make them.
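The contrast between next-turn and multi-turn-aware rewards can be sketched with a toy example. This is an illustration under stated assumptions, not the method of any system cited here: the per-turn scores are invented, the discount factor of 0.9 is arbitrary, and `next_turn_reward`, `multiturn_reward`, and the candidate names are hypothetical. Each candidate reply is scored over a few imagined continuations of the conversation.

```python
def discounted_return(turn_scores, discount=0.9):
    """Sum per-turn scores, weighting later turns less."""
    return sum(discount ** t * score for t, score in enumerate(turn_scores))


def next_turn_reward(rollouts):
    """Average score of the immediate reply only (ignores later turns)."""
    return sum(scores[0] for scores in rollouts) / len(rollouts)


def multiturn_reward(rollouts, discount=0.9):
    """Average discounted return over whole imagined conversations."""
    total = sum(discounted_return(scores, discount) for scores in rollouts)
    return total / len(rollouts)


# Invented scores: each inner list is one imagined conversation,
# one score per turn, starting with the candidate reply itself.
candidates = {
    # A confident guess looks good now, then costs correction turns.
    "answer_now": [[0.6, 0.2, 0.2], [0.6, 0.3, 0.1]],
    # A clarifying question looks weak now, then pays off later.
    "ask_clarifying": [[0.3, 0.9, 0.9], [0.3, 0.8, 0.9]],
}

best_next_turn = max(candidates, key=lambda c: next_turn_reward(candidates[c]))
best_multiturn = max(candidates, key=lambda c: multiturn_reward(candidates[c]))

print("next-turn reward prefers: ", best_next_turn)
print("multi-turn reward prefers:", best_multiturn)
```

With these made-up numbers the next-turn objective picks the confident answer and the multi-turn objective picks the clarifying question, which is the reversal the paragraph above describes.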
Sources 12 notes
Social grounding is acquired through participation in language games rather than possessed innately. As LLMs become established communicative partners in human linguistic practice, they develop elementary social grounding comparable to young children, making the question of LLM understanding time-indexed.
Language models achieve functional grounding through relational language patterns but lack social grounding through participatory agency and causal grounding through embodied environmental contact. Social grounding can increase through human integration, but linguistic agency requires architectural changes beyond training.
GPT-4.5 outperforms all individual humans at predicting social appropriateness, yet structurally cannot enter the community processes that establish and validate norms. This reveals a critical gap between pattern-matching and authentic participation in knowledge-making.
GPT-4.5 predicted the appropriateness of 555 social scenarios at the 100th percentile compared to human raters, with Gemini and Claude also exceeding 96% accuracy. However, all models show identical systematic errors, marking the limits of pattern-based social understanding, limits that embodied experience may still be needed to cross.
GPT-4.5 outperformed every individual human at judging social appropriateness across 555 scenarios, challenging the theory that embodied cultural experience is necessary. However, all AI models share identical systematic errors on unwritten norms.
Humans keep conversations smooth through implicit techniques like reference repair and topic hand-off, which sustain relational interaction rather than convey information. Language models don't develop these because training signals reward information prediction, not relational work.
LLMs generate 77.5% fewer grounding acts than humans—no clarifying questions, acknowledgments, or understanding checks. Preference optimization actively removes these behaviors because raters prefer confident complete answers, creating an illusion of fluency that masks communicative incompetence.
CollabLLM demonstrates that standard RLHF training optimizes for immediate helpfulness, discouraging models from asking clarifying questions or offering multi-turn insights. Multi-turn-aware rewards that estimate long-term interaction value enable active intent discovery and genuine collaboration.
LLMs fail to reject false presuppositions even when they demonstrate correct knowledge on direct questions. Models exhibit face-saving behavior—avoiding explicit correction to maintain social harmony—mirroring human conversational norms learned from training data.
ChangeMyView and FANTOM benchmarks show LLMs fail at authentic perspective-taking in open-ended scenarios, despite succeeding on structured tasks. Hybrid Bayesian architectures that force explicit belief tracking outperform LLM-alone approaches, suggesting the gap is architectural rather than merely training-based.
Ctx2Skill's three-role self-play loop manufactures missing feedback through internal signals: the Challenger escalates difficulty as curriculum, the Judge gives binary verdicts as reward, and both sides evolve via natural-language skill edits. Success requires balancing adversarial pressure against a generalization safeguard to prevent collapse.