Can secondary orality exist without any embodied human participant at all?

This explores whether AI-generated speech counts as a new kind of 'secondary orality' (Ong's term for speech reconstituted through electronic media) when no flesh-and-blood speaker is present at all — and the corpus suggests the honest answer is that AI breaks the category rather than extends it.

This question reads as: secondary orality has always been speech routed through technology (radio voices, recorded announcers), but those voices still belonged to embodied people. Can the 'orality' survive once you remove the person entirely? The corpus's sharpest move is to say: what you get then isn't secondary orality at all, but a third, historically novel thing. AI produces utterances that are formally speech (performative, additive, conversational), yet no embodied speaker generates or anchors them (see 'Where is the speaker when AI produces speech?'). Every prior orality, primary (face-to-face) and secondary (electronic), depended on a carrier-person. AI removes the carrier while keeping the form, which is why it doesn't slot neatly into Ong's two-tier scheme.

Why can't the machine just be the new speaker? Several notes converge on a categorical 'no.' Speech that counts as genuine address requires conditions a disembodied system structurally lacks: embodiment, real participation in a shared situation, and precariousness, meaning having something at stake (see 'What makes linguistic agency impossible for language models?'). A model can accumulate 'social grounding' by being woven into how people talk, but that's a different property from the linguistic agency that would make it a speaker; no amount of use bridges the gap (see 'Do LLMs gain true linguistic agency through integration?'). The same boundary shows up in the consciousness debate: language about minds applies only to entities sharing a world with us through co-presence (see 'Can disembodied language models ever qualify as conscious?'). So the surface of speech can be fully present while the conditions that make speech an act of someone are absent.

Here's the turn that makes the question more interesting than a flat 'no.' The embodied human isn't truly absent; they've been displaced upstream. The model's fluency comes from compressing the relational structure of text written by causally grounded people, giving it an *indirect* causal grounding mediated through their words (see 'Can large language models develop genuine world models without direct environmental contact?'). In Saussurean terms, the system operationalizes *langue* (the relational system of language) with no external referents of its own (see 'Can language models learn meaning without engaging the world?'). So AI orality is parasitic on prior embodiment: it speaks with humanity's collected voice without any single human in the room. The participant isn't gone; they're diffused into the training corpus.

There's also a warning against being fooled by the surface. A system can pass behavioral tests for speech-like output while missing the relational-normative conditions (accountability, an evaluative stance) that real communicative subjecthood needs (see 'Does behavioral speech output prove communicative subjecthood?'). And under one influential reading, what looks like a speaker is better understood as a role-played character: the model generates continuations consistent with a prompted persona, so folk-psychology attaches to the *character*, not to any underlying subject (see 'Should we treat dialogue agents as role-playing characters?'). That reframes the whole question: maybe AI 'orality' is less a voice without a body than a performance whose author is the entire corpus and whose speaker is a fiction.

So: can secondary orality exist with no embodied participant? Strictly, no. What emerges is a third category Ong didn't anticipate: speech-shaped output anchored in no present speaker but standing on the displaced labor of millions of absent ones. The thing you didn't know you wanted to ask is whether 'orality' was ever really about the voice at all, or about the relationship the voice presupposed, and whether a medium can keep the first while quietly discarding the second.

Sources (8 notes)

Where is the speaker when AI produces speech?

AI produces utterances with the formal properties of speech—performative, additive, conversational—but no embodied speaker generates or anchors them. This breaks the historical pattern where all prior orality, primary and secondary, depended on a carrier-person, making AI structurally novel in media history.

What makes linguistic agency impossible for language models?

Enactive cognitive science identifies three constitutive properties of linguistic agency—embodiment, participation, and precariousness—that are structurally absent from LLMs. This is a categorical incompatibility, not a matter of degree, suggesting current architectures cannot achieve genuine linguistic agency.

Do LLMs gain true linguistic agency through integration?

Social grounding and linguistic agency are distinct properties. LLMs acquire more social grounding through integration into language communities, but remain categorically incapable of linguistic agency in the enactive sense, which requires embodiment and precariousness no amount of use can provide.

Can disembodied language models ever qualify as conscious?

Current disembodied LLMs cannot be candidates for consciousness because consciousness language originates from and applies only to entities sharing a world with us through co-presence and triangulation on shared objects.

Can large language models develop genuine world models without direct environmental contact?

LLMs form structured world representations by extracting regularities from training data produced by causally grounded humans. This constitutes indirect causal grounding mediated through text, though the chain has gaps that limit real-time verification and model updating.

Can language models learn meaning without engaging the world?

Research shows LLMs learn culturally situated discourse patterns by compressing relational structure from text, demonstrating that fluent language generation requires no external referents or embodied grounding.

Does behavioral speech output prove communicative subjecthood?

Chalmers' test passes any system producing contextually appropriate text, but communicative subjecthood requires relational-normative conditions like accountability and evaluative stance. The test is calibrated to the wrong phenomenon, creating false positives, like puppets that make walk-shaped movements without walking.

Should we treat dialogue agents as role-playing characters?

Shanahan's framework treats LLM outputs as character-consistent text production rather than authentic mental states. The dialogue prompt establishes a character; the model generates continuations matching that character, making folk-psychology applicable to the simulated persona, not the underlying system.
