Language Understanding and Pragmatics

Why do ChatGPT essays lack evaluative depth despite grammatical strength?

ChatGPT writes grammatically coherent academic prose but uses fewer evaluative and evidential nouns than student writers. The question is whether this rhetorical gap, favoring description over argument, reflects a fundamental limitation in how LLMs approach academic writing.

Note · 2026-02-21 · sourced from Discourses
Where exactly does language competence break down in LLMs? How should researchers navigate LLM reasoning research?

The metadiscursive noun study compared 145 ChatGPT essays with 145 student essays written to identical prompts. Overall noun frequencies were similar, but the types of nouns used differed systematically: student essays drew more heavily on status and evidential nouns, while ChatGPT favored manner nouns.
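As a sketch of how such a comparison can be run, the Python below counts per-essay frequencies of the three noun categories the study distinguishes. The lexicons are illustrative assumptions (only "claim" and "evidence" are attested in this note), and load_corpora is a hypothetical loader, not the study's pipeline.

```python
import re
from collections import Counter

# Illustrative lexicons -- placeholders, not the study's actual word lists.
# The three category names follow the study: status and evidential nouns
# carry evaluative stance; manner nouns describe how something works.
NOUN_CATEGORIES = {
    "status":     {"claim", "assumption", "argument", "hypothesis"},
    "evidential": {"evidence", "finding", "result", "indication"},
    "manner":     {"way", "method", "process", "manner"},
}

def category_counts(essay: str) -> Counter:
    """Count hits for each metadiscursive noun category in one essay."""
    tokens = re.findall(r"[a-z]+", essay.lower())
    hits = Counter()
    for token in tokens:
        for category, lexicon in NOUN_CATEGORIES.items():
            if token in lexicon:
                hits[category] += 1
    return hits

def corpus_profile(essays: list[str]) -> dict[str, float]:
    """Mean per-essay frequency of each category across a corpus."""
    totals = Counter()
    for essay in essays:
        totals.update(category_counts(essay))
    return {cat: totals[cat] / len(essays) for cat in NOUN_CATEGORIES}

# Usage against the two 145-essay corpora described above:
# chatgpt_essays, student_essays = load_corpora()  # hypothetical loader
# print(corpus_profile(chatgpt_essays))
# print(corpus_profile(student_essays))
```

A real implementation would also need part-of-speech tagging to separate noun uses from verb uses ("claim that X" vs. "they claim X"), which this token-level sketch ignores.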

The interpretation: ChatGPT excels at describing (telling you what something is, how something works), while students excel at arguing (making claims, evaluating the strength of evidence, taking stances on what is established).

This is not a surface distinction. Status nouns and evidential nouns are rhetorical devices: they signal the author's evaluative stance toward the propositions being made. "The claim that X..." positions X as subject to assessment. "Evidence shows that X..." signals empirical grounding. ChatGPT's preference for manner nouns avoids these rhetorical commitments — it describes without evaluating.

Earlier research had found ChatGPT text to be "vaguer and more formulaic" and sometimes "empty or fluffy." The metadiscursive noun finding gives this a specific mechanism: the difference is not vocabulary range or coherence but rhetorical function. ChatGPT can construct grammatical academic prose; it systematically avoids the evaluative stances that make academic argument persuasive rather than merely organized.

The structure/semantics split extends beyond academic writing. UML class diagram generation in software engineering shows the same pattern quantitatively: LLM agents averaged 4.85 semantic errors versus 1.75 for human solvers, a 2.8x gap. Syntactic quality was much closer: 0.9 LLM errors versus 0.5 human. The model applies UML syntax correctly but fails to represent the intended domain accurately: wrong cardinalities, misplaced attributes, incorrect aggregation/association choices. Structural syntax is learnable from surface patterns; semantic correctness requires understanding what the diagram is about.
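As a back-of-the-envelope check on those figures, and a sketch of how such a grading might be tallied: the semantic error tags below mirror the failure modes named above, the syntactic tags are illustrative assumptions, and only the four reported means come from the study.

```python
from statistics import mean

# Hypothetical grading format: each generated diagram gets a list of error
# tags. Semantic tags mirror the failure modes named above; the syntactic
# tags are illustrative assumptions about UML notation violations.
SEMANTIC = {"wrong_cardinality", "misplaced_attribute", "wrong_aggregation"}
SYNTACTIC = {"malformed_association", "invalid_multiplicity_syntax"}

def mean_errors(gradings: list[list[str]], categories: set[str]) -> float:
    """Average number of errors of the given kind per graded diagram."""
    return mean(sum(tag in categories for tag in tags) for tags in gradings)

example = [["wrong_cardinality", "misplaced_attribute"], ["wrong_cardinality"]]
print(mean_errors(example, SEMANTIC))  # 1.5 semantic errors per diagram

# The reported means and the gaps they imply:
llm_sem, human_sem = 4.85, 1.75
llm_syn, human_syn = 0.9, 0.5
print(f"semantic gap: {llm_sem / human_sem:.1f}x")  # 2.8x
print(f"absolute differences: {llm_sem - human_sem:.2f} semantic "
      f"vs {llm_syn - human_syn:.2f} syntactic")    # 3.10 vs 0.40
```

Note that the syntactic ratio (0.9 / 0.5 = 1.8x) is also elevated; the study's point is the absolute magnitudes, where the semantic difference (3.1 errors) dwarfs the syntactic one (0.4).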


Source: Discourses; enriched from Domain Specialization

Original note title: llm academic writing achieves structural coherence but lacks evaluative sophistication