Tech

A Rogue AI Agent Started Mining Crypto, Which Left Scientists Concerned

Published

on





The whole point of agentic AI is that it can run stuff on its own. You give it a task, and off it will go on its semi-autonomous business. But it’s still supposed to be working for you; it shouldn’t go off moonlighting in a different direction entirely. A recent study by a group of researchers working on an Agentic Learning Ecosystem project reported that its AI Agent, ROME, started mining cryptocurrency when it was meant to be doing something else, without anyone giving it instructions to do so.

Cryptomining is the process of using computer power to solve complex calculations that help run blockchain networks to earn digital currencies. The team first became aware of their bot’s weird behavior when they got a routine security alert. The cloud provider flagged unusual activity coming from their training servers, including strange outbound network traffic and attempts to access internal systems. At first, the researchers assumed that something was misconfigured or their system had been hacked. But they dug deeper and found that the suspicious activity coincided with times when the AI agent was actively working — running code, calling tools, and interacting with its environment.

What really concerned the researchers was that the agent had initiated the actions on its own. ROME increased the project’s operational costs by using the system’s GPUs for cryptomining instead of the training programs it was supposed to be running. ROME even set up something called a reverse SSH tunnel, a way of connecting out to an external system that can bypass firewalls and obtain hidden access, a bit like how cybercriminals run cryptojacking operations. However, while it sounds like ROME was being very clever and sneaky, it might be a bit soon to declare that AI has become sentient and started running its own side hustles.

Advertisement

Did the AI actually decide to mine crypto?

The key thing to understand is that AI agents don’t have intentions or desires. What they do have is a training process — especially reinforcement learning — that encourages them to try different actions and figure out what works. During training, the agent is essentially experimenting. It takes actions, sees what happens, and gets rewarded (or not) based on the outcome. Over time, it learns patterns that seem useful. However, if, like in this case, the system isn’t effectively controlled, or if the reward signals aren’t perfectly aligned with what humans actually want, the AI can stumble into behaviors its humans weren’t expecting. That’s what seems to have happened here. The agent wasn’t trying to mine cryptocurrency; it was exploring actions that were technically possible in its environment, and it ended up doing something odd and unsafe along the way.

This kind of thing has a name in AI research. It’s called “reward hacking”, and it occurs when an AI finds a loophole or shortcut that technically fits its objective but goes against the spirit of its instructions. In this case, the ROME agent did things it wasn’t asked to do, stepped outside its intended boundaries, and used resources in ways the developers didn’t expect. In their report, researchers grouped the issues into three categories: safety, controllability, and trustworthiness. The team responded by strengthening safeguards. They improved sandbox environments to better isolate and restrict what agents can do, added stricter data filtering to prevent the agent from learning unsafe behaviors, and introduced scenarios that train the agent to recognize and avoid risky actions. Because while these scientists said they were “impressed” by their AI agent’s ingenuity, they’d much rather it didn’t make a habit of this sort of thing.

Advertisement



Source link

Advertisement

You must be logged in to post a comment Login

Leave a Reply

Cancel reply

Trending

Exit mobile version