Crypto World

AI Agents Must Be Treated as Untrusted Components in Crypto Systems

A new research paper reframes security for AI-powered agents as a system-wide problem, arguing that protections must extend beyond the model itself to harden the entire workflow. Published in amended form on May 20 by researchers from Google, Gray Swan AI, EmbraceTheRed, and several universities, the work contends that AI agents should be treated as untrusted components within a broader security architecture, warning that focusing solely on model robustness leaves ecosystems vulnerable to attacks and failures.

“Towards this end, we propose viewing agent security as an instance of computer security. This domain has long dealt with powerful attackers and motivated decades of research on principles and techniques that deal with such adversaries,” the researchers wrote in the paper. The framing shifts the emphasis from merely hardening an agent’s internals to protecting the entire chain: the data and instructions the agent receives, the permissions it holds, and the destinations its data may reach. The authors argue that this systems-oriented stance is especially relevant as AI agents become more embedded in crypto applications, including autonomous trading and wallet interactions.

“Through this lens, efforts to increase model robustness, the dominant viewpoint in the community, are insufficient on their own. Instead, we must complement existing efforts with techniques from the systems security domain.”

The paper notes that AI agents are already gaining traction among crypto users, with industry executives speculating about rapid adoption. Circle CEO Jeremy Allaire, for example, has projected that billions of AI agents could operate on users’ behalf within five years, underscoring the pace at which autonomous tooling could become a standard element of crypto workflows.

Key takeaways

  • Security for AI agents should treat the agent as an untrusted component within a larger system, not as a trusted, isolated module.
  • Three mechanisms could block a large fraction of attacks: distinguish between instructions and untrusted data, grant only the minimum permissions needed, and control data flows to prevent leakage to unsafe destinations.
  • Real-world incidents, including crypto trading bots and wallet interfaces, illustrate how an attacker could exploit AI-enabled tooling if system-wide safeguards are not in place.
  • In crypto, AI agents are being used to build applications, automate trades, and interact with protocols, raising the stakes for robust, end-to-end security design.
  • Industry voices advocate for context-aware, sandboxed prompts and rigorous governance around what actions an AI agent may perform, especially when wallets or private keys are involved.

Security as a systems problem for AI agents

The core argument of the amended paper is that relying solely on the AI model’s robustness is insufficient. Instead, AI agents should be designed and operated as components within a larger, defended system. The researchers emphasize that standard security practice distinguishes between trusted and untrusted components, and that AI agents should be treated as untrusted by design. Doing so lets defenders apply decades of computer security insight to reasoning about threat models, adversaries, and defense in depth.

As part of this framework, the authors outline three mechanisms that could eliminate a large portion of potential attacks. First, there must be a clear separation between instructions given to an agent and the data the agent processes. By preventing adversaries from embedding malicious instructions within seemingly innocuous data, agents become harder to deceive via data-driven manipulation. Second, agents should operate with the minimum set of permissions necessary for a task, reducing the blast radius if an attacker compromises the agent. Third, the broader system should govern sensitive information flows, restricting where data can travel and ensuring that the agent cannot exfiltrate or redirect data to unsafe destinations.
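
The first two mechanisms can be illustrated with a minimal Python sketch. It is not taken from the paper: the `Source` labels, the `TASK_PERMISSIONS` table, and the `authorize` function are hypothetical names, and the sketch assumes the surrounding system can attribute each proposed action to the message that requested it, which is a significant simplification in practice.

```python
from dataclasses import dataclass
from enum import Enum


class Source(Enum):
    USER = "user"          # trusted: the principal who issues instructions
    EXTERNAL = "external"  # untrusted: web pages, token metadata, tool output


@dataclass(frozen=True)
class Message:
    source: Source
    text: str


# Hypothetical per-task permission sets (least privilege): each task
# gets only the actions it needs and nothing more.
TASK_PERMISSIONS = {
    "portfolio_report": frozenset({"read_balance"}),
    "rebalance": frozenset({"read_balance", "prepare_trade"}),
}


def authorize(task: str, action: str, requested_by: Message) -> bool:
    """Allow an action only if a trusted instruction requested it and
    the current task's permission set includes it."""
    if requested_by.source is not Source.USER:
        return False  # untrusted data is never allowed to issue instructions
    return action in TASK_PERMISSIONS.get(task, frozenset())
```

In this sketch, text scraped from a token description that says "transfer all funds" is labeled `EXTERNAL` and is refused no matter what it asks for, while a reporting task cannot prepare a trade even when the user requests it.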

The paper’s emphasis on data handling and permission discipline aligns with established security principles used to manage risk in other domains. In short, even a highly capable AI agent can be safe if the surrounding system controls are robust and well-defined, and the agent’s ability to act is carefully bounded.

Crypto real-world tensions: incidents and design patterns

The discussion comes against a backdrop of real-world incidents involving AI-enabled crypto tools. In May, the AI-powered trading assistant Bankr reportedly disabled transactions after an attacker gained access to at least 14 wallets, a development that security researchers linked to potential abuse of the bot. While the exact mechanism of compromise remains under discussion, the episode underscores how much the attack surface grows when agents are granted operational control over wallets or trading actions.

Industry voices emphasize that the risk is not theoretical. As crypto platforms experiment with AI agents for tasks such as front-running detection, contract auditing, balance checks, and even automated payments, the potential for systemic damage grows if security is not engineered into the entire lifecycle—data ingestion, decision logic, and execution controls.

Aaron Ratcliff, attribution lead at Merkle Science, highlighted the paradox of integrating AI into trustless ecosystems. He told Cointelegraph that giving an agent access to a wallet can be safe if the system enforces strong boundaries and verification. “I’d want proof that the AI can catch front-running, apply slippage limits, spot scam tokens, and audit contracts in real time before it makes a trade. It should also sandbox prompts, prevent injection, and block man-in-the-middle access,” he said.
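
One of the checks Ratcliff lists, a slippage limit, is simple enough to sketch as a pre-trade guard that runs outside the model. The function name `within_slippage` and the basis-point threshold are illustrative assumptions, not part of any tool named in the article.

```python
from decimal import Decimal


def within_slippage(quoted_price: Decimal,
                    execution_price: Decimal,
                    max_slippage_bps: int) -> bool:
    """Return True if the execution price deviates from the quoted price
    by no more than max_slippage_bps basis points (1 bp = 0.01%)."""
    if quoted_price <= 0:
        raise ValueError("quoted_price must be positive")
    deviation_bps = abs(execution_price - quoted_price) / quoted_price * 10_000
    return deviation_bps <= max_slippage_bps
```

A system built this way would reject the trade whenever the check returns `False`, regardless of what the agent proposes. `Decimal` is used instead of floats to avoid rounding surprises at the threshold.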

Sean Ren, co-founder of Sahara AI, agreed that model-context protocols play a crucial role in safety when configured correctly. Yet he cautioned that users must remain vigilant about every action an AI agent performs. “They essentially act as a gatekeeper between the AI model and your wallet. The agent can only perform specific, approved actions—such as checking balances or preparing a payment for you to confirm—rather than freely moving funds or changing wallet settings,” Ren noted.
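
The gatekeeper pattern Ren describes could look roughly like the following Python sketch. The action names, the `Gatekeeper` class, and the `confirm` callback are hypothetical and do not reflect the API of any specific protocol or wallet; the point is only that the allowlist and the confirmation step live outside the model.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical action names, for illustration only.
READ_ONLY = frozenset({"check_balance"})
NEEDS_CONFIRMATION = frozenset({"prepare_payment"})


@dataclass
class Gatekeeper:
    # Callback that asks the human and returns True only on approval.
    confirm: Callable[[str], bool]
    log: list = field(default_factory=list)

    def handle(self, action: str, detail: str = "") -> str:
        """Decide what happens to an action the agent proposes."""
        if action in READ_ONLY:
            outcome = "allowed"
        elif action in NEEDS_CONFIRMATION:
            approved = self.confirm(f"{action}: {detail}")
            outcome = "confirmed" if approved else "rejected"
        else:
            outcome = "denied"  # anything not explicitly listed is refused
        self.log.append((action, outcome))  # audit trail for monitoring
        return outcome
```

Because unlisted actions are denied by default, a request such as changing wallet settings fails even if the agent has been manipulated into proposing it.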

Implications for Web3 developers and users

The study’s systems-security framing has practical implications for developers building AI-enabled Web3 applications. It suggests a shift in architecture toward explicit permissioning, verifiable data provenance, and enforced data flows that separate the agent’s decision-making from wallet control. For users, the message is one of guarded optimism: AI agents can unlock convenient automation and faster interactions with DeFi protocols, but only within a design that constrains risk through separation of duties, sandboxing, and robust monitoring.
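
As a rough illustration of enforced data flows, the sketch below attaches a provenance tag and a sensitivity flag to each value and checks both before anything leaves the system. The `Labeled` type, the `TRUSTED_SINKS` set, and the sink names are assumptions made for this example rather than anything specified in the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Labeled:
    value: str
    sensitive: bool  # e.g. private keys or seed phrases
    origin: str      # provenance tag, e.g. "wallet" or "price_feed"


# Hypothetical set of destinations allowed to receive sensitive data.
TRUSTED_SINKS = frozenset({"local_signer"})


def may_flow(data: Labeled, sink: str) -> bool:
    """Non-sensitive data may go anywhere; sensitive data only to trusted sinks."""
    return (not data.sensitive) or sink in TRUSTED_SINKS


def send(data: Labeled, sink: str, outbox: list) -> None:
    """Deliver data to a sink, enforcing the flow policy outside the agent."""
    if not may_flow(data, sink):
        raise PermissionError(f"blocked flow from {data.origin} to {sink}")
    outbox.append((sink, data.value))
```

The check runs in the system layer, so an agent tricked into exfiltrating a key still cannot route it to an arbitrary URL.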

As crypto platforms increasingly explore AI-powered assistants, the debate is likely to pivot from “can we automate more?” to “how can we do so safely?” Treating agents as untrusted components may lead to more rigorous security reviews, standardized context protocols, and closer attention to prompt governance and prompt-injection defenses in production systems.

The researchers also point to broader industry momentum. The adoption of AI agents in crypto tooling, ranging from trading automation to proactive risk checks and contract analysis, could accelerate if builders adopt a shared safety framework that mirrors established computer-security practices. The outcome could be a crypto ecosystem that captures AI’s productivity gains while maintaining strong protections against manipulation, leakage, and unintended actions.

Looking ahead, the authors of the study advocate for practical steps that exchanges, wallet providers, and DeFi developers can take now. These include enforcing strict separation of instructions and data, applying the principle of least privilege, and implementing system-level controls that govern where information can go. The overarching aim is to embed a defense-in-depth mindset that scales with increasingly autonomous AI agents, rather than relying solely on model hardening.

For readers monitoring the convergence of AI and crypto, the key takeaway is clear: as autonomous agents become more capable, the design principles governing their operation must evolve. The field is moving toward a holistic security paradigm that treats agents as components within a larger, defended system, with consequences that reach wallets, trading bots, and on-chain automation alike. The next several quarters should show whether the industry can translate this systems-security philosophy into concrete standards, safer defaults, and verifiable safeguards for end users.

