Can human-centered LLM design ever achieve universal solutions?
If harm and benefit depend on who you ask and how you measure them, can we design LLM systems that satisfy all stakeholders? This note explores why broad values like safety and justice resist one-size-fits-all implementation.
Even granting that human-centered objectives belong throughout the LLM pipeline, the framework makes an uncomfortable admission: those objectives tend to resist universal solutions. The optimal path depends both on who you ask and on how you operationalize contested concepts like harm and benefit. Broad themes — transparency, privacy, safety, justice — recur across stakeholders, but there is significant variation in how each ideal should be implemented. Governments and non-profits may codify the dominant perspective into law, yet high-level guidelines fail to capture the nuance of real-world use and lag behind the rapid evolution of the models themselves.
This is the open question that pipeline-wide human-centering cannot dissolve through good engineering alone. If "harm" has no operationalization that satisfies every stakeholder, then "embed human values across the pipeline" underdetermines the actual gradient, that is, the direction in which the system is optimized: the developer still has to choose whose values count and how they are measured. The danger the framework flags is that, faced with this irreducible contestation, stakeholders go passive, and passivity simply endorses the status quo, which means whatever the capability-driven defaults already encode. So the operationalization-dependence of harm is not a reason to abandon human-centering but a reason to make the value choices explicit and revisable rather than implicit and frozen. The unresolved part is procedural: what legitimate process can aggregate or arbitrate between divergent operationalizations without collapsing back into majority preference or developer convenience?
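A toy sketch can make the underdetermination concrete. The stakeholder groups, candidate behaviours, harm scores, and group sizes below are all invented for illustration and do not come from the cited paper. The point is only that each group's own operationalization of harm picks a different behaviour, and that two plausible aggregation rules (head-count majority and minimizing the worst-off group's harm) also disagree, so choosing the rule is itself a value choice.

```python
from collections import Counter

# Hypothetical harm scores (lower is better) that each stakeholder group
# assigns to three candidate system behaviours. Every name and number
# here is an assumption made up for this example.
HARM = {
    "regulators": {"refuse_often": 0.2, "answer_with_caveats": 0.5, "answer_freely": 0.9},
    "clinicians": {"refuse_often": 0.8, "answer_with_caveats": 0.3, "answer_freely": 0.6},
    "patients":   {"refuse_often": 0.9, "answer_with_caveats": 0.4, "answer_freely": 0.2},
}

# Hypothetical relative sizes of the groups, used only by the majority rule.
GROUP_SIZE = {"regulators": 1, "clinicians": 2, "patients": 7}


def preferred(group: str) -> str:
    """Behaviour that minimizes harm under this group's own scores."""
    scores = HARM[group]
    return min(scores, key=scores.get)


def majority_choice() -> str:
    """Each group votes for its preferred behaviour, weighted by size."""
    votes = Counter()
    for group, size in GROUP_SIZE.items():
        votes[preferred(group)] += size
    return votes.most_common(1)[0][0]


def minimax_choice() -> str:
    """Behaviour whose worst harm score across groups is smallest."""
    options = list(next(iter(HARM.values())))
    return min(options, key=lambda o: max(HARM[g][o] for g in HARM))


if __name__ == "__main__":
    for group in HARM:
        print(f"{group} prefer: {preferred(group)}")
    print("majority rule picks:", majority_choice())
    print("minimax rule picks: ", minimax_choice())
```

Neither rule is offered here as the legitimate procedure. The sketch only shows that "minimize harm" does not select a behaviour until someone has fixed both the harm scores and the aggregation rule, which is why keeping those choices explicit and revisable matters.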
— "Reflections and New Directions for Human-Centered Large Language Models", https://arxiv.org/abs/2605.06901
Related concepts in this collection
- Can AI systems preserve moral value conflicts instead of averaging them?
  Current AI systems wash out value tensions through majority aggregation. Can we instead model how values like honesty and friendship genuinely conflict in moral reasoning?
  proposes one procedural answer to the operationalization-dependence this note leaves open
- Should AI alignment target preferences or social role norms?
  Current AI alignment approaches optimize for individual or aggregate human preferences. But do preferences actually capture what matters morally, or should alignment instead target the normative standards appropriate to an AI system's specific social role?
  offers a role-based criterion as an alternative to who-you-ask preference variance
Original note title
human-centered objectives resist universal solutions because harm and benefit depend on who you ask