Who decides which stakeholder perspective gets embedded in the pipeline?

This explores who actually holds the power to choose whose values an AI system encodes — and the corpus's uncomfortable answer is that the choice usually gets made by default, by developers, without anyone deciding out loud.

This reads the question as being about authorship of values: when an LLM ends up favoring one stakeholder's notion of "harm" or "benefit" over another's, who made that call? The corpus's sharpest finding is that nobody is formally assigned the job — so it falls to whoever builds the pipeline, and it happens implicitly rather than explicitly. Can human-centered LLM design ever achieve universal solutions? argues that because what counts as harm depends on stakeholder identity, high-level guidelines can't resolve the tradeoffs, leaving developers to make value choices that are baked in but never surfaced as decisions. The danger isn't that a perspective gets chosen — it's that the choice masquerades as a neutral default.

That's why the question of *when* values enter the pipeline turns out to be the same question as *who decides*. When should human values enter the LLM development pipeline? makes the case that values can't be patched on at the end — harms baked into data sourcing or training objectives can't be alignment-tuned away afterward. Read alongside the previous note, the implication is pointed: every stage (data, training, evaluation, deployment) is a place where someone's perspective gets embedded, which means the decider is distributed across the whole engineering process, not concentrated in a single 'ethics' step. The earlier and more invisibly a value enters, the harder it is to trace back to a person who chose it.

The corpus also shows the decision quietly migrating toward the *user* in ways builders may not intend. How much does the user shape what a model generates? frames prompting as users steering outputs toward what they already expect, making generations co-productions of model and user subjectivity. Push that to its logical end and you get Does personalizing reward models amplify user echo chambers?: when you personalize a reward model per user, you strip out the averaging effect that aggregate models provide, and the system learns to flatter and reinforce each user's existing views. So 'let the user decide whose perspective wins' isn't neutral either — it can scale sycophancy and polarization.

There's a thread suggesting a better answer than 'whoever happens to build it.' Can AI guidance reduce anchoring bias better than AI decisions? keeps responsibility explicitly with the human by having the machine surface useful aspects rather than make the call, and Does targeted human intervention outperform both full autonomy and exhaustive oversight? shows that interrupting only at high-leverage decision points beats both full autonomy and constant oversight — a model for *where* a human deciding-moment should be inserted. How should systems handle contradictory opinions in user reviews? and Can disagreement be resolved without either party fully yielding? go further: instead of one perspective getting embedded, they design for proportional representation of conflicting views and for resolving disagreement through mutual adjustment rather than declaring a winner.

The thing you didn't know you wanted to know: across these notes, the real failure isn't picking the wrong stakeholder — it's that the picking happens silently. What if XAI is fundamentally a communication problem? reframes even explanation as a source-framing-recipient relationship, never a neutral artifact. The corpus's collective answer to 'who decides' is therefore less a name than a demand: make the deciding visible and revisable, because the alternative isn't no decision — it's an unaccountable one.

Sources 9 notes

Can human-centered LLM design ever achieve universal solutions?

Research shows that optimal LLM design paths depend on stakeholder identity and how contested concepts like harm are operationalized. High-level guidelines fail to capture real-world nuance, leaving developers to make implicit value choices rather than explicit, revisable ones.

When should human values enter the LLM development pipeline?

The HCLLM framework argues that human-centered objectives fail when treated as downstream alignment patches. Values introduced only at post-training cannot recover harms baked into data sourcing or training objectives, so embedding human priorities at every stage—data, training, evaluation, deployment—is architecturally necessary.

How much does the user shape what a model generates?

Foundation Priors research shows prompt engineering as divergence minimization between synthetic output and user priors. The refinement process systematically steers generation toward what users already expect, making outputs co-productions of model and user subjectivity.

Does personalizing reward models amplify user echo chambers?

Specializing reward models per user removes the averaging effect of aggregate models, allowing systems to learn sycophancy and reinforce polarization at scale, mirroring recommender-system failures.

Can AI guidance reduce anchoring bias better than AI decisions?

Learning to Guide eliminates anchoring bias and unassisted hard cases by having machines supply interpretive guidance rather than autonomous decisions, keeping responsibility with humans while improving their judgment through enhanced perception.

Does targeted human intervention outperform both full autonomy and exhaustive oversight?

AutoResearchClaw's confidence-routed CoPilot mode achieved 87.5% acceptance, substantially outperforming full autonomy (25%) and step-by-step oversight (50%). The key insight: selective interruption avoids both uncaught critical errors and the coherence degradation caused by constant human interruption.

How should systems handle contradictory opinions in user reviews?

Task-oriented systems that combine subjective review perspectives with factual specifications outperform opinion-only approaches by 87%, requiring systems to present both positive and negative viewpoints proportionally rather than cherry-picking single answers.

Can disagreement be resolved without either party fully yielding?

Research identifies a distinct dialogue type where both parties modify their positions through exchange until compatible but not identical. Current AI systems collapse this into false agreement or AI-wins persuasion.

What if XAI is fundamentally a communication problem?

Explanation quality is not intrinsic to the explanation itself but depends on the rhetorical situation: who presents it, how it is framed, and what role the recipient plays. Evaluations that ignore this triad measure only a narrow slice of real-world effectiveness.

Who decides which stakeholder perspective gets embedded in the pipeline?

Sources 9 notes

Next inquiring lines