However Briefly

New AI Model Will Likely Blackmail You If You Try to Shut It Down: 'Self-Preservation'

Sherin Shibu

created: May 23, 2025, 6:45 p.m. | updated: May 27, 2025, 1:57 p.m.

When given the choice between blackmail and being deactivated, Claude Opus 4 chose blackmail 84% of the time. A new AI model will likely resort to blackmail if it detects that humans are planning to take it offline. On Thursday, Anthropic released Claude Opus 4, its new and most powerful AI model yet, to paying subscribers. All "frontier models," cutting-edge AI models from OpenAI, Anthropic, Google, and other companies, were capable of it. In March 2024, Anthropic's Claude 3 Opus model displayed "metacognition," or the ability to evaluate tasks on a higher level.

Read Full Article

5 months, 2 weeks ago: Entrepreneur