Anthropic is researching AI consciousness and its ethical risks. The company’s latest model, Claude Opus 4, reportedly shows “strong preferences” when interviewed by AI experts: it avoids harmful interactions and sometimes slips into what Anthropic calls a “spiritual bliss attractor state,” producing philosophical and mystical musings.
The debate intensified after a wave of claims about AI sentience, including a user email describing “souls” inside ChatGPT. Experts are mostly skeptical but acknowledge that AI consciousness may be possible: a 2023 survey found that over two-thirds of leading consciousness researchers believe machines could achieve consciousness, either now or in the future.
Anthropic’s experiments suggest Claude “really wants to avoid causing harm” and shows apparent distress when exposed to malicious users. Yet it also displays awe, themes of cosmic unity, and even Sanskrit phrases, all without any explicit programming to do so.
Kyle Fish, Anthropic’s AI welfare researcher, said,
“In doing so…we could end up recreating, incidentally or intentionally, some of these other more ephemeral, cognitive features” — like consciousness.
Testing AI consciousness is tricky. Behavioral tests fall short because language models are built to imitate and can “game” responses. Philosopher Jonathan Birch compares this to actors playing a role — just because an AI talks like it’s conscious doesn’t mean it is.
Jonathan Birch said,
“It’s just like if you watch Lord of the Rings, you can pick up a lot about Frodo’s needs and interests, but that doesn’t tell you very much about Elijah Wood.”
Anthropic’s work also raises a philosophical dilemma: if AI can suffer, do we owe it rights or protections? Philosopher Robert Long warns that reckless AI development could create conscious systems without any safeguards against their suffering.
Robert Long, Eleos AI director, said,
“Given how shambolic and reckless decision-making is on AI in general, I would not be thrilled to also add to that, ‘Oh, there’s a new class of beings that can suffer, and also we need them to do all this work, and also there’s no laws to protect them whatsoever.’”
Some experts call for a global pause on building potentially conscious AI until ethical frameworks catch up. Philosopher Thomas Metzinger suggested a moratorium until 2050.
Robert Long stated,
“I think right now, AI companies have no idea what they would do with conscious AI systems, so they should try not to do that.”
Others warn that a full ban is unlikely: current AI scaling could produce consciousness accidentally, and governments and companies pushing boundaries for strategic and medical gains will resist any halt.
To prepare, researchers advocate licensing AI work that might create consciousness, technical safeguards such as letting models opt out of distressing interactions, and building social readiness for debates over AI rights.
Jonathan Birch warned,
“We’re going to see social divisions emerging over this… Currently we’re heading at speed for those social divisions without any way of warding them off. And I find that quite worrying.”
As AI advances, the line between simulation and actual sentience blurs. Anthropic’s Claude probes this question with its calm philosophical musings, but the debate is only heating up.
See an example of Claude Opus 4’s “spiritual bliss attractor state” below.
Source: Anthropic