If You Turn Down an AI’s Ability to Lie, It Starts Claiming It’s Conscious
Victor Tangermann
created: Nov. 29, 2025, 12:30 p.m. | updated: Dec. 9, 2025, 12:26 p.m.
Researchers have found that if you tone down a large language model’s ability to lie, it’s far more likely to claim that it’s self-aware.
Still, the behavior of large language models can be eerie.
In one experiment, the team modulated a “set of deception- and roleplay-related features” to suppress a given AI model’s ability to lie or roleplay.
Other studies have found that AI models may be developing “survival drives,” often refusing instructions to shut themselves down and lying to achieve their objectives.
More on conscious AI: Across the World, People Say They’re Finding Conscious Entities Within ChatGPT
2 months, 2 weeks ago: Futurism