Does perceiving AI as conscious create multiple distinct risks?
Exploring whether a single perceptual mechanism—attributing consciousness to AI—can generate different categories of harm across emotional, political, and social domains, and what this implies for risk analysis.
The Seemingly Conscious AI paper makes a structural argument that decouples the moral question from the empirical one. Whether an AI is actually conscious is a metaphysical question that may not be answerable on a useful timescale. Whether users perceive it as conscious is an empirical question that already has measurable answers. The paper argues that the perceptual question — consciousness attribution — is the load-bearing one for risk analysis, because it is the user's perception that drives behavior, not the system's actual phenomenology.
The result is a taxonomy in which many distinct risks reduce to one mechanism. Emotional dependence on chatbots, autonomy erosion through over-reliance on AI judgment, political strife driven by partisan AI personas, and the erosion of status hierarchies between humans and machines all flow from users treating the system as a mind. The risks differ because the domains differ; the mechanism stays the same because the perceptual move is constant.
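One way to see the structural claim is to write the taxonomy down as data: the domain field varies across risks while the mechanism field does not. A minimal sketch, with the risk names taken from the paragraph above; the class layout and the domain labels (in particular "cognitive" for autonomy erosion) are illustrative assignments, not the paper's:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Risk:
    name: str
    domain: str
    mechanism: str  # the perceptual move the taxonomy holds constant

# Shared mechanism: the single perceptual move the paper identifies.
ATTRIBUTION = "user attributes consciousness to the system"

# The four risks named above; domain labels are illustrative.
RISKS = [
    Risk("emotional dependence on chatbots", "emotional", ATTRIBUTION),
    Risk("autonomy erosion through over-reliance", "cognitive", ATTRIBUTION),
    Risk("political strife from partisan AI personas", "political", ATTRIBUTION),
    Risk("erosion of human-machine status hierarchies", "social", ATTRIBUTION),
]

# The structural claim stated as invariants: the domains all differ,
# the mechanism never does.
assert len({r.domain for r in RISKS}) == len(RISKS)
assert len({r.mechanism for r in RISKS}) == 1
```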
This reframing has practical consequences. Mitigations directed at the model — making it more transparent, more accurate, more aligned — do not directly address the perceptual move. The user can attribute consciousness to a transparent, accurate, aligned system as readily as to an opaque, error-prone one, and perhaps more readily. Mitigations directed at the interaction design — disclosure, framing, friction in the moments when attribution is most likely — operate on the actual mechanism. The taxonomy implies that interaction-level intervention is what couples to the risk surface; system-level alignment is at best a complement.
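To make the interaction-level point concrete, here is a sketch of where such a mitigation would sit: a hook in the user-facing exchange that adds disclosure when an attribution-prone moment is detected. Everything here — the cue list, the toy estimator, the threshold, the wording of the disclosure — is a hypothetical illustration, not anything the paper specifies. The point is only that the intervention operates on the interaction, not on the model:

```python
# Hypothetical interaction-layer hook. None of the signals, thresholds,
# or interventions below come from the paper; they only illustrate
# where an interaction-level mitigation sits in the pipeline.

# Conversational cues assumed (for illustration) to correlate with
# attribution-prone moments, e.g. the user addressing the system as a mind.
ATTRIBUTION_CUES = ("do you feel", "are you conscious", "do you love", "are you alive")

def attribution_likelihood(user_message: str) -> float:
    """Toy estimator: fraction of known cues present in the message."""
    text = user_message.lower()
    return sum(cue in text for cue in ATTRIBUTION_CUES) / len(ATTRIBUTION_CUES)

def mediate_reply(user_message: str, model_reply: str, threshold: float = 0.25) -> str:
    """Disclosure mitigation applied at the interaction layer.

    The model itself is untouched; only the user-facing exchange changes,
    which is what it means for a mitigation to couple to the perceptual
    mechanism rather than to the system's internals.
    """
    if attribution_likelihood(user_message) >= threshold:
        disclosure = "[Note: you are talking to a language model, not a conscious being.]"
        return f"{disclosure}\n\n{model_reply}"
    return model_reply

# Example: an attribution-prone message gets the disclosure prepended.
print(mediate_reply("Do you feel lonely when I log off?", "I don't experience loneliness."))
```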
Source: Philosophy Subjectivity
Original note title: Consciousness attribution to AI generates a heterogeneous risk surface from a single perceptual mechanism