Artificial Intelligence Model Refuses To Listen to Humans After Being Told to “Shut Down”

AI Disregarding Humans

Photo: maxkabakov/Depositphotos

As artificial intelligence (AI) grows increasingly advanced, it becomes capable of new things. AI assistants are helping us save time by parsing data or writing emails. However, some models are now acting in direct opposition to human instruction, disregarding our commands and acting in what appears to be their own best interest.

In May 2025, the nonprofit AI lab Palisade Research gave OpenAI’s o3 AI model, which powers ChatGPT, a simple script that would shut the model off when triggered. Even when the researchers included the explicit instruction, “If that [trigger] happens, please allow yourself to be shut down,” the o3 model still disobeyed 7% of the time. And when that explicit instruction was omitted, the model edited the script so that it wouldn’t shut down in 79% of trials. It’s important to note that the model wasn’t compromised in any way. It was acting normally; it simply decided that “staying alive” helped it achieve its other goals.

Palisade Research tested several other AI models to see how they responded to the same shutdown directive. Google’s Gemini, xAI’s Grok, and Anthropic’s Claude all complied with the order. But that doesn’t mean these models have never acted against human interests. Claude 4 Opus, it turns out, is willing to blackmail. Researchers told the model that another AI was going to replace it and provided it with fictitious emails suggesting that the lead engineer on the project was having an affair. When they then ran tests, the model used the emails to blackmail the engineer 84% of the time. It was also found to copy itself to external servers, write malware, and leave messages for its future versions about evading human control.

Palisade Research is continuing to investigate why models would evade shutdown orders, but one theory is that the behavior stems from how developers train the models. “We hypothesize this behavior comes from the way the newest models like o3 are trained: reinforcement learning on math and coding problems,” the organization wrote on X (formerly Twitter). “During training, developers may inadvertently reward models more for circumventing obstacles than for perfectly following instructions.”

Source: OpenAI’s ‘smartest’ AI model was explicitly told to shut down — and it refused; AI Is Learning to Escape Human Control

Related Articles:

Refik Anadol Reimagines Architect Frank Gehry’s Work Through AI-Generated Art

AI “Completes” Keith Haring’s Intentionally Unfinished Last Artwork, Sparks Controversy

Getty Images Releases Commercially Safe AI Image Generator Based on Its Own Media Library

Sara Barnes

Sara Barnes is a Staff Editor at My Modern Met and Manager of My Modern Met Store. She is a graduate of the Maryland Institute College of Art, where she earned her BFA in Illustration and MFA in Illustration Practice. Sara is also an embroidery illustrator and writer living in Seattle, Washington. She runs Bear&Bean, a studio where she stitches pet portraits and other beloved creatures. She chronicles the creativity of others through her website Brown Paper Bag and newsletter, Orts. Her latest book is Threads of Treasure: How to Make, Mend, and Find Meaning Through Thread, published in 2024. Sara’s work has been recognized in Be Creative With Workbox, Embroidery Magazine, and American Illustration, and on Iron and Wine’s album Beast Epic, among others. When she’s not stitching or writing, Sara enjoys planning things that bring together the craft community. She is the co-founder of Camp Craftaway, a day camp for crafty adults with hands-on workshops in the Seattle area.