Can fairness frameworks extend to general-purpose language models?
Existing fairness frameworks were designed for narrow, structured tasks. This explores whether they scale to LLMs, which serve multiple populations, sensitive attributes, and use cases simultaneously.
Machine-learning fairness frameworks — group fairness, fair representations — were built for systems with well-structured inputs and outputs and a self-evident use (lending, recidivism, coreference). This work argues they break on general-purpose LLMs: each framework either does not logically extend to LLM tasks (unstructured natural-language data) or becomes intractable, because LLMs touch a multitude of populations, sensitive attributes, and use cases at once. The conclusion is sharp: it is not feasible to certify or guarantee that an LLM is generally "fair."
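As a rough illustration of the tractability argument, the sketch below first computes one common group-fairness measure (a demographic parity gap, chosen here as an example and not necessarily the criterion the paper analyses) on a toy structured task. It then counts how many subgroup-by-use-case checks the same style of audit would imply for a general-purpose model. The function names, attribute lists, use cases, and data are all hypothetical.

```python
from itertools import product


def demographic_parity_gap(decisions, groups):
    """Largest difference in positive-decision rate between any two groups."""
    totals, positives = {}, {}
    for decision, group in zip(decisions, groups):
        totals[group] = totals.get(group, 0) + 1
        positives[group] = positives.get(group, 0) + decision
    rates = [positives[g] / totals[g] for g in totals]
    return max(rates) - min(rates)


def audit_cells(attribute_values, use_cases):
    """Count (intersectional subgroup, use case) pairs an audit would cover."""
    subgroups = list(product(*attribute_values.values()))
    return len(subgroups) * len(use_cases)


# Structured task: binary loan decisions and one sensitive attribute (toy data).
decisions = [1, 1, 0, 1, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
gap = demographic_parity_gap(decisions, groups)

# General-purpose setting: hypothetical attributes and use cases (assumptions).
attribute_values = {
    "gender": ["g1", "g2", "g3"],
    "age_band": ["a1", "a2", "a3", "a4"],
    "language": ["l1", "l2", "l3", "l4", "l5"],
}
use_cases = ["resume screening", "tutoring", "medical Q&A", "translation"]
cells = audit_cells(attribute_values, use_cases)

print(f"parity gap on the structured task: {gap}")
print(f"subgroup x use-case checks for the general model: {cells}")
```

Even with these small made-up lists the audit grows multiplicatively (3 x 4 x 5 subgroups times 4 use cases), and the count understates the problem: free-text LLM output has no ready-made binary "decision" column to feed a metric like this, which corresponds to the "does not logically extend" half of the argument.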
The constructive move is to lower the target from universal fairness to use-case-specific fairness, governed by three guidelines: the criticality of context, the responsibility of LLM developers, and stakeholder participation in an iterative design-and-evaluation process. A speculative note adds that AI's general capabilities might eventually help address fairness as a form of scalable AI-assisted alignment.
The keeper is the impossibility-style framing: "fair LLM" is not a global property you can stamp on a model; fairness only has teeth relative to a context and its stakeholders.
This pairs with the vault's preference/alignment-pluralism thread. It mirrors "Can human-centered LLM design ever achieve universal solutions?" and complements "Can a single reward model represent diverse human preferences?" (both reject one-size-fits-all normative targets), and supports "Should AI alignment target preferences or social role norms?" as the contextual alternative.
Related concepts in this collection
- Can human-centered LLM design ever achieve universal solutions?
  If harm and benefit depend on who you ask and how you measure them, can we design LLM systems that satisfy all stakeholders? This explores why broad values like safety and justice resist one-size-fits-all implementation.
  same impossibility of a universal normative target; fairness is one such objective
- Can a single reward model represent diverse human preferences?
  Standard RLHF assumes one shared preference signal. But what happens when human values genuinely conflict? This question explores whether aggregating preferences into one model fundamentally fails at fairness.
  both reject one-size-fits-all; MaxMin is a constructive aggregation response
- Should AI alignment target preferences or social role norms?
  Current AI alignment approaches optimize for individual or aggregate human preferences. But do preferences actually capture what matters morally, or should alignment instead target the normative standards appropriate to an AI system's specific social role?
  the contextual, role-relative alternative to universal fairness
Related papers in this collection
Papers most semantically related to this note, ranked by cosine similarity in the embedding space.
- The Impossibility of Fair LLMs
- Unintended Impacts of LLM Alignment on Global Representation
- The Ultimate Guide to Fine-Tuning LLMs from Basics to Breakthroughs: An Exhaustive Review of Technologies, Research, Best Practices, Applied Research Challenges and Opportunities
- A Survey on Large Language Models for Recommendation
- Can LLM be a Personalized Judge?
- Large Language Models Meet Knowledge Graphs for Question Answering: Synthesis and Opportunities
- The Alternative Annotator Test for LLM-as-a-Judge: How to Statistically Justify Replacing Human Annotators with LLMs
- QuestBench: Can LLMs ask the right question to acquire information in reasoning tasks?
Original note title
group-fairness frameworks do not extend to general-purpose LLMs so fairness must be pursued per use-case