
AI Agents Are Getting Better at Writing Code—and Hacking It as Well
Will Knight
created: June 25, 2025, 4:58 p.m. | updated: June 28, 2025, 1:42 p.m.
AI researchers at UC Berkeley tested how well the latest AI models and agents could find vulnerabilities in 188 large open source codebases.
Using a new benchmark called CyberGym, the AI models identified 17 new bugs, including 15 previously unknown, or “zero-day,” ones.
Many experts expect AI models to become formidable cybersecurity weapons.
Dawn Song, a professor at UC Berkeley who led the work, says that the coding skills of the latest AI models, combined with improving reasoning abilities, are starting to change the cybersecurity landscape.
The researchers used descriptions of known software vulnerabilities from the 188 software projects.
Source: WIRED