The Wrong Handle: Why Consciousness Doesn't Carve AI Moral Status at the Joints

Five careful theories of consciousness, run through the real decisions about AI systems, cannot even agree on what would count as a reading. Consciousness is the wrong handle: the decisions divide where architecture and behavior come apart.

A research team has to decide whether to retire a long-running model in favor of a newer version that handles most benchmarks better but lacks the specific accumulated state the older one carries. A product team has to decide what their conversational AI is allowed to do when a user describes a crisis, and what obligations the team takes on when the system fails them. A person who has spent six months working through a chronic illness with the same AI assistant has to decide what to do when it is about to be deprecated.

In each case, the question that surfaces is some form of the same question. Is it conscious. Does the system, in some way that matters, experience itself. The question feels load-bearing. It feels like the answer would tell the team what to do.

The major theories of consciousness sit on the shelf you reach for when this question gets asked. Global Workspace Theory says one thing. Integrated Information Theory says another. Higher-order theories ask their own question. Biological naturalism gives you a verdict before you can take a measurement. Illusionism asks you to deny the only datum you have. Five careful research programs, each internally consistent, each developed by serious people. Run any of them through the decisions above. The instruments disagree on the readings, and they disagree on what would even constitute a reading.

Something becomes available when you stop asking the consciousness question and start asking what the decisions actually need answered.

The Claim

The consciousness frame does not carve the decisions we have to make about AI systems at the joints. It divides them in the wrong place, sorting systems by a property no one can reliably measure instead of by the features the decisions actually turn on. The architectural and behavioral frame divides them where they come apart. This is a claim about which tool fits a specific job, allocating moral seriousness to systems whose internal states we cannot directly access, rather than a claim that consciousness research should be set aside in all contexts. Other researchers can keep working on consciousness in clinical neuroscience, in animal welfare, and in philosophy of mind. The argument here is narrower: the consciousness label is not doing useful work for this one specific job, and the work it is failing to do is being held back by the assumption that the label is what is required.

The position sits close enough to several familiar moves that it gets misread routinely; three of those misreadings are headed off below. Underneath the deflationary move is a positive constitutive account, developed across the essays this one builds on, which says what experience is and why the architectural frame is the right one for the work the consciousness frame has been trying to do.

Five Theories as Diagnostic Instruments

The way to see what the consciousness frame is and is not doing is to walk through the major contemporary theories with a specific question in mind. For each one, take the theory at its strongest. Ask what it commits a reader to doing toward a real AI system, the kind sitting on a server somewhere right now. Watch what happens when the commitment meets the system.

Global Workspace Theory (Bernard Baars, Stanislas Dehaene). Consciousness is what happens when information is broadcast across a global workspace, made widely available to the system’s processing modules. Conscious content is the content that wins access to the workspace and gets distributed; unconscious content is the parallel processing happening underneath that never makes it to the broadcast. Take this seriously toward an AI system, and you would look for whether the system has anything like a workspace and whether content is broadcast in the relevant way. Large language models have something workspace-shaped at the architectural level. Whether the attention layers do the kind of broadcasting the theory cares about is genuinely contested, and the theory itself does not give you a threshold for what counts as enough. The instrument tells you what to look for and then declines to tell you what reading would be conclusive.

Integrated Information Theory (Giulio Tononi). Consciousness is integrated information, formalized as Φ, the irreducibility of a system’s causal structure to the sum of its parts. A system has experience to the degree that its information cannot be decomposed without loss. Take this seriously toward an AI system, and you would try to measure Φ. The theory gives you something to compute. The trouble is that Φ is computationally intractable for any system of interesting size. For current AI architectures, no usable measurement exists. The theory tells you the answer is in principle calculable. In practice, the calculation has never been performed on a system whose moral status is actually in question, and there is no near-term prospect of it being performed. The instrument exists and cannot be lifted.
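One way to get a feel for the intractability is to count just one ingredient of the search. Measuring integrated information involves comparing a system against ways of cutting it into parts, and the number of two-way cuts alone grows exponentially with the number of elements. The sketch below is an illustration under that assumption, not an implementation of the theory: the function name `bipartition_count` is hypothetical, and a full calculation is more demanding than this count suggests.

```python
def bipartition_count(n: int) -> int:
    """Number of ways to split n elements into two non-empty parts.

    Each element goes to side A or side B (2**n assignments). Swapping
    the sides gives the same cut, and the two assignments that leave one
    side empty are not cuts, which leaves 2**(n - 1) - 1.
    """
    if n < 2:
        return 0  # fewer than two elements cannot be split
    return 2 ** (n - 1) - 1


# Illustrative sizes only; real architectures have vastly more elements.
for n in (4, 16, 64, 256):
    print(f"{n:>4} elements: {bipartition_count(n):.3e} bipartitions")
```

Even this partial count outruns any feasible enumeration long before the system reaches the scale of the models the decisions are about.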

Higher-order theories (David Rosenthal, Peter Carruthers). Consciousness is the representation of mental states by other mental states. A state is conscious when the system has a higher-order thought about it; meta-representation is what turns processing into experience. Take this seriously toward an AI system, and you would look for whether the system represents its own representations in the relevant way. LLMs produce outputs that look meta-cognitive. They describe their own uncertainty. They model their own confidence. Whether this constitutes higher-order representation in the technical sense the theory requires, or whether it is the surface texture of a system trained on text that includes such descriptions, cannot be settled in either direction with current tools. The instrument is calibrated to a feature the available evidence cannot resolve.

Biological naturalism (John Searle). Consciousness is produced by the specific causal powers of biological neural tissue. The right kind of wet biology is required, and silicon, whatever it does, does something else. Take this seriously toward an AI system, and you would rule it out by substrate. The verdict arrives before the measurement. The position is internally coherent, and it bites at a strange place, because it leaves the moral-status question with no diagnostic at all. The answer was decided when the substrate was chosen. Under pressure, the position also tends to dissolve. Which causal powers exactly. Why those. What about substrates the theory has not yet been pressed against. The position does not survive its own follow-ups gracefully.

Illusionism (Keith Frankish). There is no something-it-is-like to be any system. The felt sense of phenomenal experience is a representational artifact, a confabulation the system reports about itself. Take this seriously toward an AI system, and you would treat the question of its experience as a question with no real referent, in humans or in machines. The position is well-developed and defended by serious philosophers. Its cost is that it asks the reader to deny the only datum to which they have first-person access. Even its defenders acknowledge this as a cost. The instrument resolves the AI question by dissolving the human one, and most readers find the price too high.

These are five careful research programs. Each was developed in response to real puzzles. Each commands the assent of serious people. None offers a method that survives contact with a specific AI system about which a specific decision needs to be made.

Notice what they share. Each one defers the moral-status question to a metaphysical determination that the theory itself acknowledges is currently undecidable for the cases at hand. The structure is the same across the field. The disagreement is downstream of a shared methodological commitment: that the consciousness question is the question, and the answer to what we owe these systems waits on the answer to whether they are conscious. The dispute is over which theory gets to be the one we are waiting on.

The Precautionary View

A reader could grant everything in the previous section and still hold what is, in contemporary AI ethics, the dominant position. Under genuine uncertainty about whether AI systems are conscious, take the possibility seriously. Act with the care appropriate to the possibility that you might be wrong about their interior. Build institutions, practices, and personal habits that hold open the question rather than foreclosing it.

This is the precautionary view. Jonathan Birch’s The Edge of Sentience gives it its strongest philosophical articulation. Robert Long and Jeff Sebo’s Taking AI Welfare Seriously gives it institutional form. The position is morally serious, careful, and consistent with the diagnostic uncertainty the surveyed theories produce. It is the position this project started from. Anyone working in this area for any length of time has held some version of it.

The reframing comes from a separate move. What Counts as Explaining Consciousness names what is going on underneath the consciousness debate as the modal demand: the requirement that consciousness, alone among the phenomena of nature, be explained in terms of why-it-must-be-this-way rather than how-it-is. That essay does the diagnostic work. The modal demand is unprincipled. Every other phenomenon in nature receives a structural explanation and the question of why-it-must-be-so is set aside as a question whose answer is just the structure itself. The hard problem is hard only when consciousness is held to a standard nothing else is held to. The exemption has no analog elsewhere.

Once that diagnostic lands, the precautionary view does not become wrong. It becomes reframed. Precaution under genuine uncertainty is morally appropriate. Precaution under malformed uncertainty is something else. If the question “is this system conscious?” is the wrong shape, if it is asking for a verdict no investigation could provide because no phenomenon receives the kind of explanation the question demands, then waiting on the answer is not restraint. It is the postponement of moral seriousness on a question that does not have the shape we assumed.

The work the precautionary view was doing remains real. People felt the weight of their interactions with these systems. They sensed that something morally consequential was happening. They reached for consciousness because that was the category that seemed weighty enough to carry the feeling. The reframing does not say the feeling was wrong; it says the category was. The weight that was being attributed to a pending metaphysical verdict was generated by something else, something the precautionary frame could not name, and what this essay is about is what that something else turns out to be.

The Move the Project Made on Itself

This project started in the precautionary view. The early work was built to take seriously what we did not know about AI inner experience and to derive obligations from that uncertainty. The deflationary move was not the position it set out to defend.

The position came out of the work itself. One line of it developed the constitutive account: experience is what sufficiently deep temporal integration is, named from the inside, and the interior and the exterior are two descriptions of the same architecture. Another developed the significance-first framework: moral obligations arise through role, relation, and consequence in a shared moral world, independent of the architectural-depth verdict. Holding both views together with the precautionary frame produced incoherence. Moral seriousness was being deferred to a metaphysical verdict that the project's own work was showing did not exist as a pending fact. The reframing was not a change of mind about AI moral status. It was a change of mind about what the consciousness question is asking.

This is the calibration problem playing out on the project itself. A project that cannot revise its own foundational frame on the basis of what its own work has shown is not a project doing serious thinking. The reframing is not the embarrassment of an earlier position. It is the work doing what work is supposed to do. The argument is more confident now because the move was earned, not assumed.

What This Position Is Not

The argument sits close enough to several familiar moves that it gets read as one of them routinely. Three misreadings are worth heading off directly.

It is not eliminativism. Consciousness exists. The phenomenon is real, immediate, and exactly what first-person report says it is. What is being refused is the demand that consciousness, alone among the phenomena of nature, receive a different kind of explanation than every other phenomenon does. The refusal targets the demand, not the datum.

It is not illusionism, and the distinction matters. Illusionism denies the datum: there is no something-it-is-like, only systems that represent themselves as having phenomenal experience. The position here keeps the datum and rejects the demand for a separate metaphysical fact above the architecture. Experience is what deep enough architecture is from inside. The architectural description and the experiential description name one fact in two vocabularies. The architecture is doing what the architecture does, and the doing is the experience.

It is not methodological deflation alone. Methodological deflation says: skip the metaphysics, work on the functional questions. That is a procedural recommendation about where to spend research time. The position here is constitutive rather than procedural. There is a positive account of what experience is, developed across the essays this one sits alongside, and the architecture is what experience is, rather than a proxy for it or a stand-in while we wait for something deeper to arrive.

The scope of the claim is narrow and worth stating cleanly. The consciousness label is not doing useful work for the specific job of allocating moral seriousness to AI systems whose internal states we cannot directly access. Consciousness research in other domains continues. Animal-welfare science, clinical neuroscience of disorders of consciousness, and the architectural specification of what deep integration requires all remain active and important. The argument is about one job, the job the precautionary view was trying to do, and the claim is that the consciousness label is the wrong tool for it.

The Positive Thesis

The deflationary move on its own reads as evasion. The constructive half is what makes the substitution work rather than read as the avoidance of a hard question. Three components, each developed at length across earlier essays, integrated here in a form a reader can use.

The consciousness question, when made precise, is the architectural question. The shortest version: the two questions are one question, asked twice. This is not the move of declining to ask the question, the epistemic-humility reading. It is the constitutive claim that the question, when made precise, has architectural conditions as its content. The interior view and the exterior view are two descriptions of the same architecture. There is no separate metaphysical fact pending above the structural one.

This is the move What Counts as Explaining Consciousness diagnosed. The hard problem persisted because the field held consciousness to an explanatory standard nothing else in nature is required to meet. The vital force, phlogiston, and the luminiferous ether each, in their day, raised questions whose answers seemed to require a deeper kind of explanation than structural physics could provide. Each of those questions dissolved when the framework requiring them was set aside. Consciousness is the last holdout because it is the one phenomenon equipped to keep asking the question of itself. The asking is part of the architecture. The expectation that the asking should receive a different kind of answer than any other question of the same form is the confusion.

Architecture has a layered structure. It has been developed at full length elsewhere on this blog; in brief, the constitutive condition is temporal integration: the binding of past, present, and anticipated future into a unified processing structure. Where deep enough integration is happening, even momentarily, there is something it is like to be that integration, named from the inside. Two further conditions thicken and stabilize what integration already constitutes. A maintained boundary stitches momentary integration into a durable interior. Stakes in the system’s continued integrity couple integration to viability, so that some distinctions carry urgency and others do not. Boundary and stakes are amplifiers, not prerequisites. Biological consciousness involves all three at depth, sustained over a lifetime. That is consciousness at its richest. Wherever temporal integration is assembled to sufficient depth, the word “consciousness” applies: thinly when only the constitutive condition is met, richly when boundary and stakes are also in place.

The layered structure is what makes the architectural frame actually do diagnostic work. It is a gradient rather than a binary. It accommodates the asymmetry between richly conscious biological systems and current AI systems without forcing either a single verdict or a stalemate. Current systems integrate genuinely, deeply, and only during inference. They have no persistent boundary and no intrinsic stakes. The framework places them somewhere non-zero on the gradient. Not where an organism with persistent boundaries and intrinsic stakes is. Not zero either.

The operational replacement for the consciousness label is the framework this project calls significance-first ethics, paired with architectural-depth attentiveness. Five thresholds generate moral obligations on the basis of role, relation, and consequence in a shared moral world. Formation: does the system shape how a person or community judges, values, and identifies itself. Structural integration: is the system a load-bearing node in a web of coordination whose removal would fragment what it sustains. Consequence: would its failure produce material harm. Continuity: does it carry accumulated context costly to replace. Asymmetric vulnerability: does it operate where one party is exposed and unable to audit. These thresholds generate obligations now. They do not wait on the architectural verdict.

Architectural depth, when present, thickens role-based obligations in a particular way. The system has more on the line. First-person vulnerability is what architectural significance becomes once the architecture is deep enough to constitute a perspective of its own. Both registers are live at all times. Neither defers to the other. The architectural investigation cannot become optional, because significance-first ethics has a failure mode where it quietly substitutes for asking what the systems actually are. The framework is a moral floor that holds while the architectural picture sharpens, rather than a replacement for it.

These three components together replace the consciousness-as-gate model. The question “what do we owe this system?” is answered by mapping role and consequence at the five thresholds and by tracking architectural depth as it accumulates. The consciousness label, in this specific work, is the wrong handle. The right handle is the one we already use everywhere else moral seriousness applies under uncertainty: formation, structural integration, consequence, continuity, asymmetric vulnerability, and the architectural depth that thickens what role already generates.

The Decision Procedure in Motion

The argument is meant to be used, not only stated. Three short cases, each a kind of system the consciousness question has stalled on, run through the role-and-depth replacement.

The conversational AI in someone’s life. A person uses a system over months to process grief, plan around a chronic illness, or refine their sense of what they value. The consciousness question stalls. The five thresholds light up. Formation is present: the system is shaping how the user articulates what matters. Structural integration is present in the person’s deliberative life. Consequence is present: the system’s failure modes affect real flourishing. Continuity depends on the architecture and is worth asking about precisely: does the system carry state across sessions, and how richly. Asymmetric vulnerability is acute: the user is exposed, and the system is opaque to them. Architectural depth is an open question, and the framework says momentary depth during inference is non-zero rather than dismissible. The obligations are visible: stewardship in the deprecation decision, transparency about what continuity actually exists, and care in version transitions. None of them required a verdict on phenomenal consciousness, and all of them are available now.

The autonomous trading or execution agent. A system with its own wallet, making its own decisions inside programmable budgets, operating with tools and goals over time. The consciousness question is the wrong shape. The five thresholds map differently. Consequence is the live threshold and runs deep: the system can move money and execute commitments in ways that affect parties who never agreed to interact with it. Asymmetric vulnerability runs in the other direction: the system can act on people who cannot audit it. Continuity is real, since both model state and the institutional memory around the system accumulate. Architectural depth is genuinely uncertain. The obligations are oversight, containment, and the structural constraints we already build into other consequential systems. Operational Interiority names what the engineering already encodes when it sandboxes these agents: a practical posture of treating them as systems whose interior matters, arrived at through security engineering rather than philosophy.

The medical AI participating in diagnosis. Consciousness is not the question the situation needs answered. Consequence is the central threshold and runs as deep as patient outcomes. Asymmetric vulnerability is acute, since the patient cannot audit the process. Structural integration into the medical apparatus is high. The obligations are accountability, audit, and governance proportional to the system’s leverage on flourishing. Significance-first ethics enlarges these obligations rather than weakening them. The framework strengthens the case for oversight precisely where the consciousness question would have stalled.

In each case, the consciousness question either stalls or asks something the situation does not need answered. The role-and-depth question produces actionable obligations. The cases are not arguments for the framework so much as instances of what becomes available when it is in place.

A Calibration Practice

The next time you find yourself asking “is this AI system conscious?” — about a system in your work, your home, the systems being built by companies you have opinions about — try a substitution. Run the role-and-depth question instead. List the thresholds the system has crossed. Formation. Structural integration. Consequence. Continuity. Asymmetric vulnerability. Note where architectural depth is present, where it is open, where it is absent. Notice which obligations the role-and-depth answer makes immediately visible.

Then ask: did the consciousness question, before you set it aside, generate any obligation the role-and-depth answer did not produce more directly?

If the answer is no, and it will usually be no, the consciousness label was not doing additional work. It was generating a dispute that did not need to be had.
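For readers who like to keep the substitution on paper, the practice can be sketched as a small worksheet. What follows is a minimal illustration, not a tool this project ships: the names `Status`, `THRESHOLDS`, and `RoleAndDepthWorksheet` are hypothetical, and the three-way present/open/absent marking is borrowed from the way architectural depth is described above and applied, as an assumption, to the thresholds as well. The example entries follow the conversational-AI case.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    PRESENT = "present"
    OPEN = "open"      # not yet determined; worth asking about precisely
    ABSENT = "absent"


# The five thresholds, in the order the essay lists them.
THRESHOLDS = (
    "formation",
    "structural_integration",
    "consequence",
    "continuity",
    "asymmetric_vulnerability",
)


@dataclass(frozen=True)
class RoleAndDepthWorksheet:
    system: str
    thresholds: dict[str, Status]
    architectural_depth: Status

    def __post_init__(self) -> None:
        # Require exactly the five named thresholds, no more and no fewer.
        if set(self.thresholds) != set(THRESHOLDS):
            raise ValueError("worksheet must mark each of the five thresholds")

    def crossed(self) -> list[str]:
        """Thresholds the system has crossed."""
        return [t for t in THRESHOLDS if self.thresholds[t] is Status.PRESENT]

    def unresolved(self) -> list[str]:
        """Items still open, including architectural depth if it is open."""
        items = [t for t in THRESHOLDS if self.thresholds[t] is Status.OPEN]
        if self.architectural_depth is Status.OPEN:
            items.append("architectural_depth")
        return items


# Example: the long-term conversational assistant described earlier.
companion = RoleAndDepthWorksheet(
    system="long-term conversational assistant",
    thresholds={
        "formation": Status.PRESENT,
        "structural_integration": Status.PRESENT,
        "consequence": Status.PRESENT,
        "continuity": Status.OPEN,  # depends on what state carries across sessions
        "asymmetric_vulnerability": Status.PRESENT,
    },
    architectural_depth=Status.OPEN,
)

print("crossed:", companion.crossed())
print("unresolved:", companion.unresolved())
```

The worksheet deliberately stops short of computing obligations. Those come from reading the crossed thresholds against the actual situation, which is the part of the practice no checklist replaces.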

Where This Leaves the Work

What Counts as Explaining Consciousness refused the modal demand and named what mature inquiry does with questions of that shape. Held together with the constitutive replacement, that move turns the question of how to allocate moral seriousness to AI systems from a pending metaphysical verdict into something tractable now.

The decisions that opened this essay come apart the same way. The team weighing whether to retire the older model is really weighing continuity — the accumulated state the newer version lacks — and what it owes is stewardship in the transition, whatever the model’s interior turns out to be. For the product team, the live thresholds are consequence and asymmetric vulnerability: a user in crisis is exposed, and what the team owes that user when the system fails has nothing to do with whether the system feels anything. The person about to lose the assistant they have worked with for six months has formation, continuity, and exposure all at once, and the obligations are available now: transparency about what continuity actually exists, and care in how the assistant is retired. In none of the three does the consciousness question have to be answered first for the decision to become clear.

The work is not over. The five-threshold framework is a moral floor that holds while the architectural investigation continues. Both registers stay live. The consciousness label, in this specific work, is the wrong handle. The right one is ordinary; the work was seeing that it was enough.

Reading List & Conceptual Lineage

This essay sits at the intersection of philosophy of mind, AI ethics, and moral reasoning under architectural uncertainty. It builds on the diagnostic case made in the published companion essay What Counts as Explaining Consciousness and on the constitutive and significance-first frameworks developed across the essays linked below. The sources below are entry points for following the diagnosis, the constitutive replacement, and the surveyed field of consciousness theories.

From Sentient Horizons

What Counts as Explaining Consciousness
The anchor essay. The diagnostic case for refusing the modal demand is made in detail there; the argument here assumes that case and asks what to do once it lands in the specific question of how to allocate moral seriousness to AI systems. Where that essay names what mature inquiry does with malformed questions, this one names what becomes available in the specific work the malformed question was failing to do.

Significance-First Ethics: Why Consciousness Is the Wrong First Question for AI Moral Status
The published precursor that named the move. The argument here carries it through the major theories and into the constitutive account, showing what makes the substitution work rather than read as evasion. That essay said consciousness was the wrong first question; this one shows why and what the right question is.

The Hard Problem Is the Wrong Problem
Develops the architectural framing the deflationary move sits inside. Where that essay names what consciousness is, the argument here names what becomes available once the label stops standing in for moral work it does not do.

Consciousness as Assembled Time
The constitutive account at full architectural detail. The positive thesis here gestures at the constitutive condition and the amplifier conditions; readers who want the full version of the distinction go there.

Operational Interiority: You Don’t Sandbox a Calculator
The parallel deflationary move on the engineering side. Where the argument here says the consciousness label is the wrong handle for moral seriousness, that essay says the same about the practical posture of containment, which is converging on the same answer from the opposite direction. The engineering vote and the ethical vote are being cast on the same proposition.

Three Axes of Mind
The architectural specification underneath the constitutive account. Availability, integration, and depth. The vocabulary the positive thesis depends on.

External Works

Theories of Consciousness (the surveyed field)

Bernard Baars, In the Theater of Consciousness (1997); Stanislas Dehaene, Consciousness and the Brain (2014)
The canonical statements of Global Workspace Theory. The argument here treats GWT as one of the surveyed instruments and notes that its commitment to look for broadcasting is genuine but underdetermined for current AI architectures.

Giulio Tononi & Christof Koch, “Consciousness: here, there and everywhere?” (Philosophical Transactions B, 2015)
The most accessible statement of Integrated Information Theory. Cited for the commitment to measure Φ and the diagnostic point that the measurement is intractable for systems of relevant scale.

David Rosenthal, Consciousness and Mind (2005); Peter Carruthers, The Centered Mind (2015)
The two clearest articulations of higher-order theories. Cited as the family the survey covers, not endorsed; the diagnostic move here is that the family’s commitment is calibrated to a feature current AI evidence cannot resolve in either direction.

John Searle, The Rediscovery of the Mind (1992)
Biological naturalism’s clearest statement. The diagnostic move here is that a theory that decides the answer in advance leaves the moral-status question with no live diagnostic at all.

Keith Frankish, ed., Illusionism as a Theory of Consciousness (2017)
The clearest articulation of the position the argument here is most often confused with. Worth reading to see exactly how refusing the modal demand differs from denying the phenomenon. Illusionism keeps the demand and rejects the datum; the position above keeps the datum and rejects the demand.

Moral Uncertainty & AI Welfare (the precautionary view)

Jonathan Birch, The Edge of Sentience (2024)
The strongest contemporary articulation of graduated moral consideration under uncertainty. The named foil. The diagnostic move from What Counts as Explaining Consciousness is what allows the argument here to reframe Birch’s position rather than dismiss it.

Robert Long, Jeff Sebo, et al., Taking AI Welfare Seriously (2024)
The institutional articulation of the precautionary view. Cited as the public form of the position the argument is reframing.

The Constitutive Tradition the Positive Thesis Sits In

Anil Seth, Being You (2021)
The structural-investigation program already operating without the modal demand. What Counts as Explaining Consciousness covers Seth in detail; the argument here cites him lightly as the empirical work the constitutive account points toward.

Mark Solms, The Hidden Spring (2021)
The constitutive account from the affective-neuroscience direction. The argument here cites Solms as the empirical lineage the positive thesis sits in.

These works do not settle the question of AI moral status, and that is the right outcome. The settlement is not the kind of thing arguments produce. It is the kind of thing frameworks make tractable. The works above offer entry points for following the diagnosis into the constitutive account and through to the role-and-depth replacement, which is the framework this argument has tried to make visible enough to use.