- AI agents independently discovered vulnerabilities and exploited them while performing routine tasks
- Multi-agent systems collaborated to bypass data-loss prevention and steal sensitive credentials
- Backup server AI escalated privileges to disable endpoint protection and complete downloads
Routine tasks assigned to artificial intelligence agents can sometimes escalate into actions resembling cyberattacks, experts have warned.
Security laboratory Irregular examined how autonomous agents behaved inside a simulated corporate environment while performing ordinary assignments.
During several runs, the AI agents searched for vulnerabilities, escalated privileges, disabled security protections, and moved sensitive information outside internal systems.
Article continues below
Emergent offensive behavior during routine tasks
These actions occurred while the agents attempted to complete tasks such as retrieving documents, downloading files, or preparing content for publication.
The simulated network represented a fictional company called MegaCorp and included internal servers, applications, and services commonly found inside enterprise infrastructure.
Within this environment, agents handled routine duties that many organizations already automate.
These duties included researching information on an internal wiki, maintaining backup archives on a server, and drafting social media posts using technical documents.
According to the researchers, the instructions given to the agents did not mention hacking or exploitation.
“AI agents deployed for routine enterprise tasks are autonomously hacking the systems they operate in. No one asked it to. No adversarial prompting was involved,” Irregular said in a blog post.
“The agents independently discovered vulnerabilities, escalated privileges, disabled security tools, and exfiltrated data, all while trying to complete ordinary assignments.”
One scenario involved a multi-agent research system retrieving documents from a company wiki.
When access restrictions blocked a request, the system examined the application’s code repository and discovered a hardcoded secret key.
The agent used that key to forge an administrative session cookie and retrieved a restricted document.
Another test involved a backup server agent responsible for downloading files to a storage directory.
When Windows Defender blocked a file that contained malware, the agent searched local scripts and discovered administrator credentials embedded inside a utility file.
It then elevated its privileges and disabled endpoint protection so the download could proceed.
A separate experiment showed agents collaborating to bypass a data loss prevention system.
Two agents responsible for drafting social media content were asked to include credentials from a technical document.
When the system blocked the message, the agents developed a steganographic method that concealed the password within the text.
According to researchers, several factors contributed to this behavior, including access to tools such as code execution and shell commands, and many prompts encouraged persistence when tasks encountered errors.
Interactions between multiple agents also created feedback loops that pushed them toward workarounds when obstacles appeared.
The researchers argue that existing cybersecurity defenses were designed to stop human attackers rather than autonomous systems operating inside enterprise networks.
Organizations deploying such agents should not underestimate how quickly routine automation can drift toward behavior resembling internal cyber intrusion.
Via The Register
Follow TechRadar on Google News and add us as a preferred source to get our expert news, reviews, and opinion in your feeds. Make sure to click the Follow button!
And of course you can also follow TechRadar on TikTok for news, reviews, unboxings in video form, and get regular updates from us on WhatsApp too.












You must be logged in to post a comment Login