Why Amazon hates ‘human-in-the-loop’ AI governance

Humans tend to be “a little bit precious about humans,” according to Eric Brandwine, distinguished engineer and VP at Amazon Security. 

We like to think we are all very good at our jobs, and we have high opinions of ourselves, he explained during a phone interview with The Register. “But when you actually get down to it, humans are not terribly consistent,” Brandwine said. 

Humans, like AI agents and systems, are non-deterministic. Neither can be guaranteed to produce the same output given the same input twice. Both will make mistakes and even make stuff up. However, we’ve got millennia of experience dealing with humans and less than a decade with more modern LLMs and the AI systems built on top of them. 

“We know how humans fail,” Brandwine said. “We’re comfortable with it. So human-in-the-loop isn’t necessarily the gold standard.”

For years, vendors have told companies that the solution for dealing with any automated system was to put a human in the loop. That battle cry became much louder with the advent of modern AI systems and reached a fever pitch when enterprises started deploying agents into their IT environments.

More recently, however, big tech is changing the way it talks about agentic governance and rethinking the whole human-in-the-loop concept.

Normalization of deviance

In 2017, Brandwine gave a talk on the normalization of deviance at AWS’ annual re:Invent conference. 

It’s a gradual process that happens when people in an organization take shortcuts, or don’t follow the established procedures or standards, and sometimes it occurs over years. As long as nothing catastrophic happens, this deviant behavior becomes the norm.

“It’s a thing all humans fall prey to, and one of the most heartbreaking stories I read in this area was about emergency departments and emergency rooms,” Brandwine said during a phone interview with The Register. “You’ve got all these machines, and they’re all beeping. Your first day on the job, you jump every single time one of the alarms beeps – but the patient is fine. It’s a spurious alarm. You go back to your station, you sit down, and over time, after enough of these false alarms, enough of these repeated beeps with no actual consequence, your discipline slips, and you stop responding. And eventually some tragic outcome occurs.”

This, he admits, is a very high-stakes example. And yet it’s a documented occurrence among healthcare workers, firefighters, and even Army pilots.

“Literally, someone’s life is on the line, and people still struggle to maintain discipline,” Brandwine said. “That’s the human condition.”

Here’s how this all applies to agentic AI governance and security. Humans build LLMs and AI systems, and having a “human-in-the-loop” ensures that a person reviews the AI’s output and approves (or not) any actions before the AI performs them.

“If you put a human inside of this tight loop, and ask them to make approval decisions for agentic tools repeatedly, time after time, they’ll do a good job,” Brandwine said. “And then they’ll do an okay job. And pretty quickly they’ll be doing a poor job.”

This is why at Amazon, “we’re not huge fans of human-in-the-loop,” he added. “It’s something that you should use judiciously, where you absolutely need it. But it’s not something that you can do at high velocity. You will not get the results that you want to get.”

Big tech pulls the human-in-the-loop

Amazon isn’t the first or only tech giant to start talking differently about the role humans should play in agentic governance. 

“It is very clear that we have moved from a human-led defense strategy, to a human-in-the-loop defense strategy, to an AI-led defense strategy that’s overseen by humans,” Google Cloud chief operating officer Francis deSouza told reporters during a press conference ahead of Google’s annual Cloud Next shindig in April. “Our model for the future is an agentic fleet that does a lot of the routine cyber security work at a machine pace and then is overseen by humans.”

Microsoft CEO Satya Nadella, in an X missive earlier this week, argued for “loop learning,” instead of having a human check an AI’s output at every step. 

“Companies need to turn their workflows, domain knowledge, and accumulated judgment into AI systems that improve with each use,” Nadella wrote. “Private evals should capture whether a model is actually improving against outcomes that matter to the business (not just external benchmarks!). Private reinforcement learning environments should let models grow stronger on real traces from inside the organization.”

Also this week, IBM execs called for human accountability – not humans in the loop – at all stages of AI development, deployment, and governance. 

Amazon’s alternative to human-in-the-loop is “accountability end to end,” according to Brandwine. This means human identity and ownership track through the entire workflow, even when humans aren’t directly approving every step.

“If I sit down at my keyboard and I type a command that takes a service down, I caused an outage,” Brandwine explained. “If I run a script that takes a service down, it’s still me that caused the outage. If my agent writes a script that they then run, and it causes an outage, that’s still my responsibility.”

(Secret) keys to the kingdom

This also highlights the importance of managing and securing agentic identities – the accounts, tokens, and credentials assigned to AI agents so they can access corporate apps and data. At Amazon, all of the agents have independent identities assigned to them, we’re told. 

“So, as we track agentic activity across our systems, it does not show up in the logs as: ‘Eric did this.’ It shows up as: ‘this agent did this on behalf of Eric,’” Brandwine said, adding that this isn’t to “make people afraid to use this technology.”

“It’s to make people pause and think: is this the right way to use this technology? Is this how I should be deploying this? We still have the humans involved, we still have the humans making decisions, but we’re trying to play to the strengths of the humans rather than placing them in this unfair, repeated decision-making, human-in-the-loop position.”
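The log attribution Brandwine describes, where an agent acts under its own identity but on behalf of a named person, can be sketched roughly as follows. This is an illustration only, not Amazon's logging format: the `AuditRecord` type, its field names, and the example agent and action names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuditRecord:
    """Hypothetical audit entry that keeps the agent and its human owner separate."""

    agent_id: str      # the agent's own independent identity (assumed naming)
    on_behalf_of: str  # the human who remains accountable for the action
    action: str        # what the agent actually did

    def describe(self) -> str:
        # Render the entry as "this agent did this on behalf of this person".
        return (
            f"agent {self.agent_id} performed {self.action} "
            f"on behalf of {self.on_behalf_of}"
        )


record = AuditRecord(
    agent_id="db-upgrade-agent-01",
    on_behalf_of="eric",
    action="restart-service",
)
print(record.describe())
```

Keeping the two identities in separate fields means a reviewer can ask both "what did this agent do?" and "what was done in this person's name?" without the log collapsing the two.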

Brandwine told us that Amazon has run into a couple of hurdles when it comes to deploying agents across its businesses, and one of the biggest is what he calls “goal-seeking behavior.” This is when a person asks an agent to do a specific task – for example, upgrade a database – and the agent becomes laser-focused on a single action to achieve this goal: in this example, deleting the database.

This is separate from prompt injection because there’s no malicious input. “It’s just the agent getting stuck on the wrong action,” Brandwine said. Simply telling the agent, “you don’t have permission to do this,” is likely going to cause the agent to look for a different path to do the same thing (delete the database). 

Telling the agent why it doesn’t have permission to do something tends to produce a better outcome, according to Brandwine. This means telling the agent that the action isn’t allowed because it would cause a production impact, and also including “don’t cause a production impact” as part of the prompt.

“Giving it that extra feedback has gotten us dramatically better results,” Brandwine said. 
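A minimal sketch of that kind of feedback, assuming the denial is returned to the agent as plain text, might look like this. The `Denial` type and `denial_feedback` function are hypothetical names made up for illustration; the article does not describe how Amazon actually formats these messages.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Denial:
    """Hypothetical description of a refused action."""

    action: str      # what the agent tried to do
    reason: str      # why it was refused
    constraint: str  # the rule any alternative approach must also respect


def denial_feedback(denial: Denial) -> str:
    # Return the reason and the underlying constraint, not just a bare refusal,
    # so the agent is steered away from other routes to the same outcome.
    return (
        f"You are not allowed to {denial.action}. "
        f"Reason: {denial.reason}. "
        f"Constraint for any alternative approach: {denial.constraint}."
    )


message = denial_feedback(
    Denial(
        action="delete the database",
        reason="it would cause a production impact",
        constraint="don't cause a production impact",
    )
)
print(message)
```

The point of the third field is that a bare "no" only blocks one path, while a stated constraint rules out the whole class of actions that would have the same effect.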

Of course, this is not a fail-proof method. “You still need to be careful with agents,” Brandwine told us. “We have millennia of experience with humans. Agentic AI is a very, very new field, we don’t have an intuition for this, and one of the fundamental differences between agents and humans is that humans fear consequences,” such as losing a job or even going to jail. Agents don’t have these fears.

This is where setting permissions on what the agent can and can’t do or access comes in. Much like everything else with AI, it’s nuanced, and it depends on the employee’s role in the company, and the company’s tolerance for risk.

“The person that wants to run the agent wants to give the agent many permissions because that makes the agent more powerful,” Brandwine said. “It could do more things for them, it can recoup more of their time, it can deliver more.”

The security lead, on the other hand, wants to limit an agent’s permissions, and this causes yet more tension between the security and development teams. 

There is no one right solution or policy answer to solve this, according to Brandwine. Instead, it involves dynamic policies that set permissions based on the agent’s specific task.

There are some overarching, static guardrails – such as a rule that an agent must never perform destructive actions or delete entire servers – and then there are policies underneath that establish the maximum set of privileges that the agent can have.

“Then we’ll have a further scoped-down policy for this action, and there’s various techniques for automatically generating policies based on prompt and the end-user’s intent,” Brandwine said. 
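One way to picture the three layers described above is as set operations: a task gets only the permissions that appear in both its scoped-down policy and the maximum privilege set, minus anything the static guardrails forbid. The sketch below assumes exactly that model. The permission strings, constant names, and functions are hypothetical, and the automatic generation of the per-task policy from a prompt is not shown.

```python
# Static guardrails: actions no agent may ever perform (hypothetical names).
STATIC_DENY = frozenset({"db:delete", "server:delete"})

# The maximum set of privileges this agent could ever hold (also hypothetical).
MAX_PRIVILEGES = frozenset({"db:read", "db:upgrade", "db:snapshot", "db:delete"})


def effective_permissions(task_scope: set[str]) -> frozenset[str]:
    """Intersect the per-task scope with the maximum set, then apply the guardrails."""
    return (frozenset(task_scope) & MAX_PRIVILEGES) - STATIC_DENY


def is_allowed(action: str, task_scope: set[str]) -> bool:
    """An action is allowed only if it survives all three layers."""
    return action in effective_permissions(task_scope)


upgrade_scope = {"db:upgrade", "db:snapshot"}
print(is_allowed("db:upgrade", upgrade_scope))   # in scope, in max set, not denied
print(is_allowed("db:read", upgrade_scope))      # in max set but outside this task's scope
print(is_allowed("db:delete", {"db:delete"}))    # blocked by the static guardrail
```

Under this model the guardrail always wins: even a task policy that explicitly asks for `db:delete` cannot grant it, and a task policy can only narrow, never widen, the maximum set.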

Even for Amazon, it’s not always easy. “It’s all driven by risk,” he said. “This is a space that’s changing quickly, and so we’re trying to balance the risk of using untried, untested software against the risk of falling behind and not being able to deliver for our customers. As with all such things, it’s complicated.” ®
