Psychology and Social Cognition

Do we need to solve consciousness to address AI harms?

Can risk and policy decisions about AI move forward independently of settling whether AI systems are actually conscious? This note explores whether the empirical fact of user behavior matters more for risk analysis than the metaphysical truth of machine consciousness.

Note · 2026-05-01 · sourced from Philosophy Subjectivity
What grounds language understanding in systems without embodiment? How do people build trust with conversational AI?

The Seemingly Conscious AI paper makes an important methodological move that decouples two questions usually entangled in AI consciousness debates. The first question — does this AI system have phenomenal consciousness — is metaphysical. The second question — do users behave as if it does — is empirical. The paper's argument is that risk analysis should be driven by the second question, not the first.

This decoupling has consequences for both sides of the consciousness debate. For inflationists, who argue that some attribution of consciousness to LLMs is warranted, the decoupling does not deny that view. It simply observes that even if the metaphysical question were resolved against attribution (that is, if AI were shown not to be conscious), the empirical fact that users behave as though it is would still drive harm. For deflationists, who argue against attribution, the decoupling does not vindicate their position by showing that attribution is mistaken. It observes that the attribution is happening regardless of whether it is mistaken.

The methodological payoff is that interaction design and policy can proceed without waiting for metaphysics to converge. We do not need to settle whether AI is conscious to know that users treating it as conscious produces measurable individual-level harms. We do not need to know whether AI deserves moral consideration to know that giving it the affordances of agency in user interaction erodes user autonomy. The two questions can be pursued in parallel, the metaphysical one in philosophy and the design one in deployment, without holding either hostage to the other.


Source: Philosophy Subjectivity

Original note title

The moral status question is methodologically independent of the consciousness attribution question — systems that elicit attribution need not be conscious to produce harm