Language Understanding and Pragmatics

Can models learn argument quality from labeled examples alone?

Explores whether fine-tuning on quality-labeled examples teaches models the underlying criteria for evaluating arguments, or merely surface patterns. Matters because high-stakes assessment tasks depend on reliable, transferable quality judgment.

Note · 2026-02-21 · sourced from Argumentation
Where exactly does language competence break down in LLMs? · What kind of thing is an LLM really? · How should researchers navigate LLM reasoning research?

Argument Quality Assessment research trains models to evaluate the quality of arguments — are they logically valid? Well-supported? Relevant? Clear? The standard approach is supervised fine-tuning: label examples as high/low quality, train on them, evaluate transfer.
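
For concreteness, a minimal sketch of that standard setup, assuming a binary high/low quality label per argument. The base model, file names, and column names are illustrative placeholders, not the source's actual pipeline:

```python
# Supervised fine-tuning on quality-labeled arguments: a minimal sketch.
# Base model, file names, and the binary label scheme are assumptions.
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)
from datasets import load_dataset

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2  # 0 = low quality, 1 = high quality
)

# Hypothetical CSV files with "text" (the argument) and "label" (quality) columns.
dataset = load_dataset("csv", data_files={"train": "arg_quality_train.csv",
                                          "test": "arg_quality_test.csv"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=256)

dataset = dataset.map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=3,
                           per_device_train_batch_size=16),
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
)
trainer.train()  # learns the label distribution; whether criteria transfer is the open question
```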

The finding: fine-tuning on quality-labeled examples does not reliably teach the models what makes arguments good. Models learn to pattern-match against the labeled examples but do not acquire the underlying criteria that would generalize to new argument types. When explicit theoretical frameworks (RATIO: Relevance, Acceptability, Sufficiency; QOAM: Quality of Argumentation Model) are provided as structured instruction, performance improves significantly.

Theory injection works where pattern learning fails.
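
A minimal sketch of what structured criteria instruction can look like in practice. The rubric wording below loosely paraphrases the dimensions named above, and the 1-5 scoring scale is an assumption, not the source's protocol:

```python
# "Theory injection": state the evaluation criteria explicitly in the prompt
# rather than leaving them implicit in labeled examples. Rubric wording and
# scoring scale are illustrative assumptions.
RUBRIC = """Evaluate the argument against each criterion (score 1-5):
1. Relevance: do the premises actually bear on the conclusion?
2. Acceptability: would a reasonable audience grant the premises?
3. Sufficiency: taken together, do the premises provide enough
   support for the conclusion?
Report one score per criterion with a one-sentence justification,
then an overall verdict: high or low quality."""

def build_assessment_prompt(argument: str) -> str:
    """Wrap an argument in the explicit rubric for an instruction-following model."""
    return f"{RUBRIC}\n\nArgument:\n{argument}\n\nAssessment:"

print(build_assessment_prompt(
    "We should ban cars because my neighbor's car is loud."
))
```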

This is a specific instance of Can models pass tests while missing the actual grammar?: models that score highly on quality assessments within the training distribution fail to transfer the criteria to out-of-distribution argument types. The learned pattern is "this resembles the high-quality arguments in the training data" rather than "this argument satisfies the following criteria for quality."
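
A sketch of the transfer test this implies: hold out entire argument types rather than random examples, so the score measures whether criteria generalize instead of rewarding distribution matching. The toy data, topics, and the simple classifier are fabricated for illustration:

```python
# Out-of-distribution transfer test: train on one argument type (topic),
# evaluate on a held-out type. All examples here are fabricated toy data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

examples = [
    ("Raising the minimum wage lifts incomes; three state-level studies show it.", "economy", 1),
    ("The minimum wage is bad because it just is.", "economy", 0),
    ("School uniforms reduce bullying; two district trials report fewer incidents.", "education", 1),
    ("Uniforms are dumb and everyone hates them.", "education", 0),
]

# Split by topic, not at random: every "education" argument is unseen at train time.
train = [(text, y) for text, topic, y in examples if topic == "economy"]
test = [(text, y) for text, topic, y in examples if topic == "education"]

clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit([text for text, _ in train], [y for _, y in train])

# In-distribution accuracy is typically high; this held-out-type score is
# the one that reveals whether criteria, not surface patterns, were learned.
print("out-of-distribution accuracy:",
      clf.score([text for text, _ in test], [y for _, y in test]))
```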

The implication extends beyond argumentation. Whenever an evaluation task requires applying principled criteria that are not explicit in the labeled data — quality, fairness, coherence, persuasiveness — fine-tuning on examples risks teaching the distribution rather than the criteria. Why do different people reconstruct the same argument differently? points at the same problem from the other direction: if there's no gold standard, labeled examples cannot straightforwardly encode the right criteria.

The practical consequence: assessment tasks in high-stakes domains (argument quality in legal reasoning, argument validity in policy analysis) should not rely on fine-tuned models trained only on labeled examples. Explicit criteria instruction — prompting with theoretical frameworks, structured evaluation rubrics — is required.


Source: Argumentation

Original note title: argument quality assessment requires explicit theoretical framework instruction because quality criteria cannot be learned from examples alone