This week on the GeekWire Podcast: Anthropic takes its most powerful models offline after a U.S. order, with Amazon CEO Andy Jassy reportedly contributing to the concerns that helped trigger it. We talk about what it was like to use one of those models, Claude Fable, while it was available, and dig into the Amazon-Anthropic dynamic.
Then we explain how agentic AI is upending Amazon’s “working backwards” tradition, as represented by one division inside the company that is using agents to create prototypes in some cases before going through the company’s traditional PRFAQ process.
Then, an AI-powered school is arriving soon in the Seattle area. Alpha School uses AI-driven software rather than chatbots to teach core academics, frees the rest of the day for hands-on projects, and is drawing both interest from Microsoft executives and skepticism from critics.
Hunting and fishing license incident catches 3M residents
The Texas Parks and Wildlife Department (TPWD) says 3 million Texans had their data stolen following a breach at one of its suppliers.
People with state-issued hunting and fishing licenses are among those affected after attackers breached the vendor that handles license sales and copied customer data.
Details of victims’ driving license and passport numbers may be present in the leaked data. Basic personal information, such as email addresses, phone numbers, and residential addresses also leaked.
Social Security numbers (SSNs), financial data, or information relating to minors were not involved, according to the department’s disclosure.
According to a filing with the Office of the Attorney General, the attack on the unnamed vendor affected 3,087,721 Texans. The filing appears to contradict the department’s disclosure, noting that individuals’ names and SSNs were also involved.
Affected Texans were offered the usual one year of free credit monitoring services provided by Kroll, as long as they enroll by September 14.
A Kroll webpage dedicated to the incident reveals that an investigation has not determined when the breach took place. The department notified Texas Cyber Command on May 13, however.
“We recognize the seriousness of this issue and have identified and implemented additional security options to better protect customer information,” said TPWD. “Many of our staff are hunters and anglers and were affected by this incident. We are committed to continuing to work with the license system vendor to implement increased safeguards to prevent future incidents.”
TPWD said it is working with the affected vendor to introduce additional preventive measures, including enhanced monitoring and access controls.
The org went on to say that new license sales currently scheduled for August will go ahead as planned, although the website used to purchase licenses was unreachable at the time of writing. ®
THX Ltd. has spent more than four decades teaching moviegoers to expect the room to move before the film even begins. Founded by George Lucas in 1983 and developed out of Lucasfilm’s push to improve theatrical sound and presentation, THX became inseparable from Tomlinson Holman’s work, James A. Moorer’s thunderous Deep Note, and the kind of pre-movie trailer that made weak subwoofers beg for mercy.
The company’s latest Deep Note trailer, “Spark,” is not just another nostalgia play from one of cinema’s most recognizable audio brands. Now operating under Razer ownership after the 2016 acquisition, THX is using “Spark” to connect its Lucasfilm-era legacy with the next phase of immersive entertainment, including HDR10+ video and Eclipsa Audio. For a logo that once told audiences the theater was properly calibrated, this is THX trying to make the same argument in a very different format war.
Pro Tip: The first THX Deep Note trailer debuted in 1983.
THX Deep Note “Spark” Updates a Familiar Cinema Ritual
“Spark” blends the nostalgia of THX’s Lucasfilm-era origins with a more modern visual and sonic presentation. The trailer reflects the company’s long-standing mission to help audiences experience movies, music, games, and home theater content closer to the way creators intended.
It also acknowledges THX’s role in raising theatrical presentation standards during the Star Wars era, when George Lucas and Tomlinson Holman pushed for better sound and picture quality in cinemas. More than four decades later, “Spark” gives the Deep Note a fresh identity while preserving the familiar slow build and signature crescendo that made the THX trailer part of the moviegoing experience.
“As entertainment evolves, so does the role THX plays in bringing a creator’s full vision to audiences,” said Tuyen Pham, chief executive officer of THX Ltd. and veteran immersive audio innovator. “This trailer honors our legacy while embracing a future for open technology format standards for broader access for creators and deeper enjoyment by the audience. By releasing the Trailer in HDR10+ and Eclipsa Audio, we are empowering more storytellers, artists, and technologists to build extraordinary experiences that reach fans exactly as intended—faithfully, powerfully, and without compromise, with technology accessible to all via open standards of excellence and fidelity.”
The artistic approach for “Spark” is intended to symbolize imagination taking shape as an audiovisual journey, beginning with a “spark” from THX’s early innovations and media playback standards. It celebrates the creative possibilities of today’s entertainment landscape across concert venues, cinemas, home theaters, gaming rooms, and mobile devices enjoyed with headphones.
“THX was built on the idea that technical rigor and artistic ambition go hand in hand,” said Grace Qaqundah, senior vice president, THX Ltd. “Spark is a tribute to our history and a beacon for what lies ahead. We are thrilled to share it with audiences around the world as a spark of what’s possible when imagination meets high fidelity.”
“Spark” is also the first THX trailer released in the open standards HDR10+ video and Eclipsa Audio. It is a strategic move by THX that illustrates its commitment to open-standard technology ecosystems that enable broad creator adoption and high-fidelity experiences across theaters, home entertainment, gaming platforms, and certified devices.
Who Supports HDR10+ and Eclipsa Audio?
Samsung has been one of the first major TV brands to support Eclipsa Audio, bringing the format to its 2026 TV and soundbar lineup. HDR10+ also has a much broader device footprint, with more than 22,000 certified products across categories including TVs, computer monitors, projectors, automotive displays, tablets, mobile phones, streaming devices, AVRs, and Blu-ray players. Supporting brands include Samsung, Panasonic, JVC, Xiaomi, TCL, Hisense, and Skyworth.
“Spark” is also expected to appear in THX Certified Cinemas in the second half of 2026, as well as on displays from THX brand partners and THX Certified devices.
The inclusion of both HDR10+ and Eclipsa Audio follows THX’s recent expansion of its audio/video technology laboratories in Asia. The company’s Shenzhen lab has been named an Authorized Test Center for both HDR10+ and Eclipsa Audio certifications for consumer electronics and home theater devices.
The THX Deep Note has been part of the cinema experience since 1983, when it debuted ahead of Star Wars: Episode VI — Return of the Jedi. That history matters because THX helped establish the idea that going to the movies should come with a higher standard for sound, picture, and presentation, not just a bigger screen and a sticky floor.
Since then, theater chains and studios have pushed premium formats such as IMAX, Dolby Cinema, ScreenX, RPX, and others, but the THX Deep Note still carries a very specific meaning for moviegoers. It is a signal that the room, the sound system, and the presentation are supposed to matter. “Spark” updates that ritual for today’s immersive cinema landscape while keeping the familiar build that tells audiences the outside world can wait for the next two hours.
THX says “Spark” is expected to debut in THX Certified Cinemas in the second half of 2026, along with appearances on displays from THX brand partners and THX Certified devices.
When you first think of music streaming services, Pandora probably doesn’t come to mind before other platforms, even though it was once a staple. But it’s definitely not one to forget about, especially if you’re keen to find a more affordable alternative to Spotify Premium. In case you need a refresher, or this is your first time hearing about it, Pandora is a music, podcast, and comedy streaming platform primarily based around customizable online radio stations.
You can use Pandora for free — or, if you want to unlock more functionality, you can subscribe to a paid tier. The cheapest paid tier, Pandora Plus, is $4.99 per month, making it a much more affordable option than the majority of other music streaming services. This tier gives you access to custom radio stations uninterrupted by ads, alongside unlimited skips and limited offline listening.
There is a small catch, though, and it’s an integral part of how Pandora Plus works. Since it revolves around personal radio stations and custom listening experiences, it doesn’t really prioritize searching for and picking out individual songs on demand — at least not without listening to an ad first. So, if you frequently find yourself reaching for your phone to hear one specific song, you might decide to opt for Pandora Premium for $10.99 instead. But if you don’t mind letting Pandora’s algorithm work its magic and listening to the occasional ad, then Pandora Plus could suit you just fine.
How does Pandora compare to other streaming platforms?
Exactly how Pandora compares to competitors like Apple Music, Spotify, or Tidal depends on which tier of Pandora you’re using. For example, Pandora and Spotify’s respective free tiers aren’t all that different from one another, as they both set restrictions around your ability to select and play a specific song, and they both include ads. Similarly, Pandora Premium is roughly on par with other streaming services’ premium tiers in terms of functionality, offering ad-free access to its entire library, unlimited skips, offline listening, and playlists.
The real differences between Pandora and other streaming platforms arise with the mid-tier Pandora Plus, because of its focus on stations instead of purely listener-directed listening. With this tier, you’ll spend more time listening to algorithmically informed, never-ending playlists, rather than specific albums, artists, or songs. However, it’s not solely Pandora driving the music. You get plenty of say over what you’re listening to, since you can skip as many songs as you want, and there are several different stations to choose from. Plus, you influence the stations based on your tastes, and by giving any given track a thumbs up or thumbs down. You can also download stations to listen to offline.
Pandora Plus effectively creates a kind of bridge between free and premium subscriptions, which differs from how other platforms work. For that reason, it might not serve as a one-to-one replacement if you’re hoping to ditch your Spotify or Amazon Music subscription. That doesn’t mean that it couldn’t work as an alternative, particularly if you regularly find yourself flicking between Spotify mixes or artist radio stations on Tidal.
Pandora’s stations rely on the Music Genome Project
Pandora stations work a little differently from autogenerated mixes or playlists on some other streaming services because Pandora uses something called the Music Genome Project. According to Pandora’s official website, the Music Genome Project is the “most comprehensive analysis of music ever undertaken,” and it’s a bespoke musical database that has been compiled for more than 20 years. It keeps track of a massive number of details about every song logged on the service. That project is what provides the backbone of your listening experience when you tune in via Pandora’s stations.
When working on the Music Genome Project, Pandora’s researchers log information into the database on a song-by-song basis, rating each track based on hundreds of different parameters. This information is then used to create networks and relationships between different songs to find similarities. That’s a much more granular approach than just finding different artists that may be similar to one another, which makes the database much more detailed — and arguably, more accurate.
When you give a song a thumbs up or down on a station, it tells Pandora what you do or don’t like about it, such as its key, rhythm, or instrumentation. That makes it far more likely to find another song that sounds similar to the one you liked than if it were basing its algorithm on a rough idea that two artists generally belong to the same genres, or that their music came out around the same time.
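To make the song-by-song idea concrete, here is a minimal sketch of attribute-based matching. It is not Pandora’s code: the track names, attribute names, and scores are all invented for illustration, and cosine similarity is simply one common way to compare such ratings, not a method this article attributes to the Music Genome Project.

```python
import math

# Hypothetical attribute scores (0.0 to 1.0). The real Music Genome Project
# parameters and scales are not described here; these are placeholders.
TRACKS = {
    "track_a": {"tempo": 0.8, "acoustic_guitar": 0.1, "minor_key": 0.9},
    "track_b": {"tempo": 0.7, "acoustic_guitar": 0.2, "minor_key": 0.8},
    "track_c": {"tempo": 0.2, "acoustic_guitar": 0.9, "minor_key": 0.1},
}


def cosine_similarity(a, b):
    """Compare two attribute dictionaries; 1.0 means identical proportions."""
    keys = set(a) | set(b)
    dot = sum(a.get(k, 0.0) * b.get(k, 0.0) for k in keys)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(seed, tracks):
    """Return the name of the track whose attributes best match the seed."""
    others = [name for name in tracks if name != seed]
    return max(others, key=lambda name: cosine_similarity(tracks[seed], tracks[name]))
```

In this toy data, a thumbs up on `track_a` would point toward `track_b`, because the two share a fast tempo and a minor key, regardless of who recorded them.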
Optical computing uses light instead of electricity to process complex data.
Digital twin eliminates long waits for shared optical hardware.
Virtual optical systems mirrored real hardware with remarkable accuracy.
Optical computing has emerged as a promising alternative to traditional electronic systems struggling with increasingly large-scale AI and deep learning workloads.
By harnessing the physical properties of light, including interference and diffraction, optical computing systems offer faster speeds, better energy efficiency, and stronger parallel processing capabilities.
Chinese researchers have now proposed a digital twin model that fundamentally changes how these complex systems are developed and tested.
Why physical hardware became a bottleneck for researchers
Traditional optical computing systems face a persistent challenge, since task development relies heavily on direct access to physical hardware platforms.
When multiple researchers need to work with the same system, they typically wait in line, then repeatedly tune parameters and perform error calibration before any genuine computation can begin.
Once one user finishes, the next often must readjust the entire system state, making parallel research nearly impossible across competing projects.
That cycle of waiting, tuning, and recalibrating drives up trial-and-error costs while severely limiting overall research efficiency.
To address that bottleneck, researchers developed what they call the Digital Twin Optical Computing System, or DT-OCS, published in Opto-Electronic Advances.
The framework constructs a digital model that reproduces the input-output responses of a physical optical computing system across different configuration parameters entirely within software.
If the physical system resembles an expensive, heavily occupied real machine, researchers describe DT-OCS as functioning like a high-fidelity simulator running alongside it.
Testing image classification inside a virtual twin before touching real hardware
Using a high-speed optical computing system paired with a silicon photonic feature-computing chip, the research team tested DT-OCS on image classification and sequential decision-making tasks.
The results showed that configuration parameters trained and optimized within the digital twin transferred directly to the physical system without requiring further adjustment.
Task performance on the physical hardware matched the digital model’s predictions closely, validating both the fidelity and transferability of the entire approach.
Because training and optimization happen primarily within the digital domain, researchers can now develop multiple distinct tasks simultaneously rather than queuing for shared hardware access.
The team has also made the DT-OCS framework and its associated datasets openly available.
This will allow other researchers to conduct training and validation without ever touching physical equipment themselves.
According to the researchers, they designed DT-OCS as “a reproducible, accessible, and scalable software resource for wider sharing and validation.”
The openness effectively transforms optical computing from a specialized resource constrained by device availability into something closer to a shareable, reproducible research platform.
The researchers argue that future optical computing systems should pair physical hardware with openly available digital models offering equivalent computational behaviour.
Drawing a comparison to how modern transportation depends on both physical roads and continuously updated digital maps, they suggest mature optical computing platforms need a similar dual structure going forward.
Anyone who’s Googled themselves recently knows that it doesn’t quite hit the way it used to. Sure, there’s everything going on with Google search itself, but there’s also an inescapable feeling that web search isn’t the canonical source of information that it used to be, with just as many people learning about you and me from chatbots.
Thomas Dimson and Joey Flynn had a similar feeling, leading them to create In the Weights. The “weights” in question are the numerical parameters that shape an AI model’s training and output, so the website purports to measure how well “a model is able to recall someone without using tools like web search.”
“Being in the weights means your existence was deemed important in the process of creating superhuman artificial intelligence,” the website says.
To achieve this, In the Weights supposedly queries different models (including Grok, Gemini, multiple versions of GPT, Claude, and Llama, plus lesser known models) with a question similar to, “Who is [name]? Give up to 10 results, each with a short description and confidence.” It then “cluster[s] similar descriptions together and assign[s] a strength score.”
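The site’s actual clustering and scoring method is not described here, so the following is only a rough sketch of what grouping similar descriptions could look like. The model names, sample descriptions, word-overlap measure, and threshold are all assumptions made for illustration.

```python
# Hypothetical answers from three unnamed models; not real output.
SAMPLE = [
    ("model_a", "Technology journalist covering startups"),
    ("model_b", "Journalist covering technology startups"),
    ("model_c", "Professional golfer from Ohio"),
]


def tokens(text):
    """Lowercase the words, strip basic punctuation, drop very short words."""
    words = (w.strip(".,").lower() for w in text.split())
    return {w for w in words if len(w) > 3}


def jaccard(a, b):
    """Share of words two descriptions have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster(descriptions, threshold=0.3):
    """Greedily group descriptions that overlap with a cluster's first entry."""
    clusters = []  # each cluster is a list of (model, description) pairs
    for model, text in descriptions:
        for group in clusters:
            if jaccard(tokens(text), tokens(group[0][1])) >= threshold:
                group.append((model, text))
                break
        else:
            clusters.append([(model, text)])
    return clusters
```

With the sample above, the two journalist descriptions land in one group and the golfer in another. The size of the largest group is a crude stand-in for agreement between models, not the site’s strength score.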
For example, this humble tech blogger received a strength score of 641, placing me in the top 6% of names. I was feeling pretty good until I saw that multiple TechCrunch colleagues scored even higher. And the leaderboard has been shifting as I write this post, with “Home Alone” star Macaulay Culkin currently in the top slot with a strength score of 988, followed by opera singer Luciano Pavarotti.
The results also show which models returned answers for a given name, and they highlight potential hallucinations — apparently GPT-5.4 Mini says that Anthony Ha is an “ambiguous name form that could refer to multiple people with the initials A.H.A.”
Asked why he built In the Weights, Dimson told TechCrunch via email that he and Flynn were looking to “get the creative juices flowing again” after leaving OpenAI (which they both joined through the acquisition of their design startup Global Illumination).
Dimson said he was thinking about how “Google vanity searches are the wrong objective in 2026 as more traffic moves to LLMs” and about the fact that “so many lives are encoded somehow in a bunch of floating point numbers inside the AI brain.” He also said the direction of the site was “sealed” by a tongue-in-cheek blog post riffing on AI weights and Terry Bisson’s classic short story “They’re Made Out of Meat.”
“Reception has been insane so far, we thought this would be a mild curiosity but it seems like it has struck a nerve of wanting to see if you live forever in the super intelligence (the comparison factor doesn’t hurt either!)” Dimson added.
While I’m not as convinced that being “remembered” by a chatbot is a guaranteed ticket to immortality, I can’t deny that I find the results both intriguing and jealousy-inducing, especially since they’re codified in an easy-to-compare score. (AI critic Anthony Moser scoffed that this is “literally the same as asking 13 chatbots to tell you about yourself.”) Also helping: The fact that the site features a cute, Nintendo-inspired retro design.
Dimson said he plans to dig in further into why different models in the same series return different results, which models are biased towards different types of people, and which people “should have a Wikipedia article but don’t.”
Learning to drive might be less of a priority than it once was for American teenagers, but the majority still have their license by the time they turn 19. Depending on where they live, some teens might need to wait a few years longer than others to get on the road. As shown by the Insurance Institute for Highway Safety, the lowest age for getting an unrestricted driver’s license varies from state to state, with some states requiring drivers to wait until they’re 18 to drive without curfews and passenger restrictions.
In contrast, the lowest minimum age for an unrestricted driving license is 16. Only a handful of states allow drivers who have just turned 16 to hold a regular license: They are Idaho, Montana, North Dakota, and South Dakota. In Montana, 16-year-old drivers have to have held their license for 12 months or more in order to get nighttime and passenger restrictions lifted. A range of other states lift restrictions at 16 years and six months, including Arizona, Kansas, Mississippi, and New Mexico, among others.
The minimum entry age for learners similarly varies between states, with the lowest age across the country being 14 years old. Drivers in Alaska, Arkansas, Iowa, Kansas, North Dakota, and South Dakota can all get a learner’s permit at the age of 14.
Buying the right car can make it easier and safer to learn to drive
Anyone looking to get their license will need a car to practice in, and if you’re a first-timer looking to purchase your first car, it’s worth choosing your new ride carefully. Picking a car with modern safety features should give you extra reassurance in the case of an accident, even though it might not be the cheapest option on the market. When asked, Jay Leno suggested that cars from 2005 onwards are a good bet, but at a minimum, making sure you have something with airbags and modern seatbelts is advisable.
Plenty of car enthusiasts like the feeling of control and involvement that a manual transmission gives them, but learning to drive stick also comes with its own challenges. There are a few beginner tips worth keeping in mind when you start learning, like memorizing your car’s shift pattern, that should make it a little easier.
After you pass the learner stage, all states have an intermediate stage that imposes restrictions about the time of day you can drive and the passengers you can carry. The restrictions vary considerably between states, so be sure to check restriction rules before you head out on the road. To have those restrictions lifted, you’ll usually need to have held your license for a set period of months, or reach a specific age, but again, the time period and age requirements vary depending on where in the country you live.
The partnership does not affect Blacknight’s leadership, workforce or day-to-day functions, the company said.
Carlow-based web hosting company Blacknight has been acquired by European digital services group Your.Online.
Blacknight has secured long-term investment from the Amsterdam-based company to position itself for the next phase of its growth, it said in a statement.
The company did not disclose details of the transaction, but CEO Michele Neylon told SiliconRepublic.com that Blacknight has “reinvested substantially” into the new entity.
Your.Online partners with entrepreneurs, providing promising digital brands with the funds and expertise needed to grow. Blacknight is its first portfolio company based in Ireland.
Founded in 2003 by Neylon and Paul Kelly, Blacknight began as a small hosting business operating from a house in Carlow town.
In the more than two decades since, the company has grown into one of Ireland’s leading domain registrars with more than 90,000 customers across Ireland and internationally.
The company employs nearly 60 people and operates its own infrastructure from multiple data centre locations across Ireland.
It provides domains, hosting, cloud infrastructure, email, co-location and online solutions to businesses, developers, organisations and entrepreneurs.
The partnership with Your.Online allows Blacknight to access additional resources and expertise to accelerate investment in infrastructure, cloud services, security and customer solutions, the company said.
Blacknight will continue to operate independently under its established brand, with Neylon and Kelly remaining in their leadership roles alongside the existing team. Day-to-day operations also remain unchanged. Neylon said that the company plans to hire and expand both in Ireland and overseas.
“This is not about changing who we are,” said Neylon. “It is about strengthening the foundations for the future.
“Blacknight has always believed that Ireland deserves world-class digital infrastructure delivered by people who understand the local market. Joining Your.Online ensures we can continue building for the long term while remaining true to the values that brought us here.”
Chief technology officer Kelly added: “We have always taken a long-term approach to technology, infrastructure, and customer relationships.
“Becoming part of Your.Online allows us to continue investing in innovation and resilience while staying true to the principles Blacknight was built on.”
Meta is testing face-recognition software built by Rank One, a supplier to the United States military and regional police departments, WIRED found in an investigation this week. Meta has been exploring the possibility of adding face recognition tech into its smart glasses, and WIRED previously reported that the app for the glasses contained code, now deleted, that would have enabled the company to activate face-recognition features on the devices.
Anthropic is still negotiating with the Trump administration, after apparent White House concerns about the safety of new public model Claude Fable 5 resulted in Anthropic pulling the product off the market entirely. But security experts point out that AI models with advanced capabilities for discovering and exploiting software vulnerabilities—in other words, creating potentially dangerous hacking tools—will be ubiquitous soon around the world.
The United Kingdom will soon begin scanning the faces of asylum-seekers as part of age checks in spite of evidence that such age evaluation and verification tools are deeply flawed and can make mistakes with life-altering consequences.
In more uplifting uses of surveillance tech, Knicks fans around the world had a chance to watch Thursday’s ticker tape parade in New York City on traffic surveillance cameras thanks to livestreams from the artist Morry Kolman.
And there’s more. Each week, we round up the security and privacy news we didn’t cover in depth ourselves. Click the headlines to read the full stories. And stay safe out there.
The hacking and extortion group ShinyHunters has been loudly proclaiming a slew of high-profile victims in recent months: including the education tech firm Instructure, causing disruption in thousands of schools in the process; the photography firm Kodak; and a key European human rights organization. This week, it also published data allegedly stolen from Madison Square Garden, according to reporting by 404 Media.
The published data, allegedly comprising millions of records across 45GB of files, includes potential personal information from customers and references players and coaches from the Knicks. The data was published not long after the Knicks won their first NBA championship since 1973. A sample of the data reviewed by 404 Media included one file purporting to include the names of “talent,” including Knicks members.
WIRED has recently reported on Madison Square Garden’s extensive use of surveillance technologies, including face recognition systems. Alleged emails in the stolen data viewed by 404 Media include one man complaining about face recognition technology. MSG did not respond to the publication’s request for comment. After the story broke, a federal class action lawsuit was filed over the alleged data breach.
At least three bars in San Francisco’s Castro district, the well-known LGBTQ neighborhood of the city, have been using face scanners at their entrances to collect detailed information on customers. The bars are using tech from Patronscan, an ID verification company, to collect facial images, names, and genders, according to Gazetteer SF, which went to bars using the technology. Beyond the data collection, if staff at the bars spot customers fighting, stealing, or engaging in other negative behaviors, they can log this in the system. Face recognition can then identify the person the next time they are at the bar. The recorded information can be shared as part of a “safety network” between other firms using the tech, creating a widespread surveillance network.
For months, governments and companies in Europe have been ditching US technology, citing surveillance and security risks. This week France’s domestic spy agency, the Direction générale de la Sécurité intérieure (DGSI), announced it would stop using Palantir’s data and AI tools in the coming years, replacing them with software from French firm ChapsVision. “We must use our own AI models,” French prime minister Sébastien Lecornu said. “We cannot rely on tools developed by foreign powers. France must have its own tools.”
Apple’s ‘Hide My Email’ tool lets you generate a random email address to sign up for new websites and apps privately, so you don’t have to hand your personal address to yet another service. However, the company is set to change the way it creates these addresses. At present, they all use the @icloud.com domain. Going forward, as TechCrunch reported this week, Apple plans to use the domain @private.icloud.com. The not-so-subtle change could make it easier for firms to detect that people are using the privacy-preserving service and demand that they sign up with a different email address.
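To see why a dedicated domain makes detection trivial, consider this minimal sketch of the kind of check a sign-up form could run. The two domains come from the reporting above; the function names are hypothetical, and nothing here reflects how any particular company actually screens addresses.

```python
# Domains as described in the reporting; treating the planned one as a
# reliable marker of a relay address is an assumption for illustration.
CURRENT_DOMAIN = "icloud.com"
PLANNED_DOMAIN = "private.icloud.com"


def email_domain(address):
    """Return the lowercased part after the last '@', or '' if there is none."""
    _local, sep, domain = address.strip().rpartition("@")
    return domain.lower() if sep else ""


def is_identifiable_relay(address):
    """True only when the address sits on the dedicated relay domain."""
    return email_domain(address) == PLANNED_DOMAIN
```

Under the current scheme, a relay address shares `icloud.com` with ordinary iCloud accounts, so a domain check like this cannot tell them apart. Once relay addresses move to their own subdomain, a single string comparison is enough.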
If Mac game developers miss these settings, their brand new game for Apple Silicon will be listed as unplayable on anything after macOS 10.15 Catalina. Here’s how to tell Steam that your game is compatible, and where the flags are.
Steam’s macOS Catalina warning
Valve Software’s gaming storefront, Steam, is the biggest of its kind when it comes to PC and Mac gaming. It is essential for any game developer targeting those platforms to put a build on Steam, due to the sheer size of its audience. However, while Steam is relatively simple for gamers to use, the back-end is considerably more complex. For a developer starting out, there are many things that can be missed when putting their game onto Steam for the first time.
Humans tend to be “a little bit precious about humans,” according to Eric Brandwine, distinguished engineer and VP at Amazon Security.
We like to think we are all very good at our jobs, and we have high opinions of ourselves, he explained during a phone interview with The Register. “But when you actually get down to it, humans are not terribly consistent,” Brandwine said.
Humans, like AI agents and systems, are non-deterministic. Neither can be guaranteed to produce the same output given the same input twice. Both will make mistakes and even make stuff up. However, we’ve got millennia of experience dealing with humans and less than a decade with more modern LLMs and the AI systems built on top of them.
“We know how humans fail,” Brandwine said. “We’re comfortable with it. So human-in-the-loop isn’t necessarily the gold standard.”
For years, vendors have told companies that the solution for dealing with any automated system was to put a human in the loop. That battle cry became much louder with the advent of modern AI systems and reached a fever pitch when enterprises started deploying agents into their IT environments.
More recently, however, big tech is changing the way it talks about agentic governance and rethinking the whole human-in-the-loop concept.
Normalization of deviance
In 2017, Brandwine gave a talk on the normalization of deviance at AWS’ annual re:Invent conference.
It’s a gradual process that happens when people in an organization take shortcuts, or don’t follow the established procedures or standards, and sometimes it occurs over years. As long as nothing catastrophic happens, this deviant behavior becomes the norm.
“It’s a thing all humans fall prey to, and one of the most heartbreaking stories I read in this area was about emergency departments and emergency rooms,” Brandwine said. “You’ve got all these machines, and they’re all beeping. Your first day on the job, you jump every single time one of the alarms beeps – but the patient is fine. It’s a spurious alarm. You go back to your station, you sit down, and over time, after enough of these false alarms, enough of these repeated beeps with no actual consequence, your discipline slips, and you stop responding. And eventually some tragic outcome occurs.”
“Literally, someone’s life is on the line, and people still struggle to maintain discipline,” Brandwine said. “That’s the human condition.”
Here’s how this all applies to agentic AI governance and security. Humans build LLMs and AI systems, and having a “human-in-the-loop” ensures that a person reviews the AI’s output and approves (or not) any actions before the AI performs them.
“If you put a human inside of this tight loop, and ask them to make approval decisions for agentic tools repeatedly, time after time, they’ll do a good job,” Brandwine said. “And then they’ll do an okay job. And pretty quickly they’ll be doing a poor job.”
This is why at Amazon, “we’re not huge fans of human-in-the-loop,” he added. “It’s something that you should use judiciously, where you absolutely need it. But it’s not something that you can do at high velocity. You will not get the results that you want to get.”
Big tech pulls the human-in-the-loop
Amazon isn’t the first or only tech giant to start talking differently about the role humans should play in agentic governance.
“It is very clear that we have moved from a human-led defense strategy, to a human-in-the-loop defense strategy, to an AI-led defense strategy that’s overseen by humans,” Google Cloud chief operating officer Francis deSouza told reporters during a press conference ahead of Google’s annual Cloud Next shindig in April. “Our model for the future is an agentic fleet that does a lot of the routine cyber security work at a machine pace and then is overseen by humans.”
Microsoft CEO Satya Nadella, in an X missive earlier this week, argued for “loop learning,” instead of having a human check an AI’s output at every step.
“Companies need to turn their workflows, domain knowledge, and accumulated judgment into AI systems that improve with each use,” Nadella wrote. “Private evals should capture whether a model is actually improving against outcomes that matter to the business (not just external benchmarks!). Private reinforcement learning environments should let models grow stronger on real traces from inside the organization.”
Also this week, IBM execs called for human accountability – not humans in the loop – at all stages of AI development, deployment, and governance.
Amazon’s alternative to human-in-the-loop is “accountability end to end,” according to Brandwine. This means human identity and ownership track through the entire workflow, even when humans aren’t directly approving every step.
“If I sit down at my keyboard and I type a command that takes a service down, I caused an outage,” Brandwine explained. “If I run a script that takes a service down, it’s still me that caused the outage. If my agent writes a script that they then run, and it causes an outage, that’s still my responsibility.”
(Secret) keys to the kingdom
This also highlights the importance of managing and securing agentic identities – the accounts, tokens, and credentials assigned to AI agents so they can access corporate apps and data. At Amazon, all of the agents have independent identities assigned to them, we’re told.
“So, as we track agentic activity across our systems, it does not show up in the logs as: ‘Eric did this.’ It shows up as: ‘this agent did this on behalf of Eric,’” Brandwine said, adding that this isn’t to “make people afraid to use this technology.”
“It’s to make people pause and think: is this the right way to use this technology? Is this how I should be deploying this? We still have the humans involved, we still have the humans making decisions, but we’re trying to play to the strengths of the humans rather than placing them in this unfair, repeated decision making, human-in-the-loop position.”
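As a rough illustration of that logging pattern, here is a minimal Python sketch in which every audit record carries both the agent's own identity and the human it acts for. The class, field names, and identifiers (`AuditEvent`, `agent_id`, `on_behalf_of`, `agent-7f3a`) are hypothetical and do not reflect Amazon's actual log schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEvent:
    """One log entry. All field names here are illustrative assumptions."""
    agent_id: str      # the agent's own, independent identity
    on_behalf_of: str  # the human who remains accountable for the action
    action: str
    timestamp: str

    def describe(self) -> str:
        # Attribute the action to the agent, but keep the human owner visible.
        return f"{self.agent_id} did '{self.action}' on behalf of {self.on_behalf_of}"


def record(agent_id: str, on_behalf_of: str, action: str) -> AuditEvent:
    """Build an audit event stamped with the current UTC time."""
    now = datetime.now(timezone.utc).isoformat()
    return AuditEvent(agent_id, on_behalf_of, action, now)


event = record("agent-7f3a", "eric", "restart-service")
print(event.describe())
```

Keeping the two identities as separate fields means the log never collapses into "Eric did this", while ownership can still be traced back to a person.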
Brandwine told us that Amazon has run into a couple of hurdles when it comes to deploying agents across its businesses, and one of the biggest is what he calls “goal-seeking behavior.” This is when a person asks an agent to do a specific task – for example, upgrade a database – and the agent becomes laser-focused on just one action to achieve this goal, ie, deleting the database.
This is separate from prompt injection because there’s no malicious input. “It’s just the agent getting stuck on the wrong action,” Brandwine said. Simply telling the agent, “you don’t have permission to do this,” is likely going to cause the agent to look for a different path to do the same thing (delete the database).
Telling the agent why it doesn’t have permission to do something tends to produce a better outcome, according to Brandwine. This means telling the agent that the action is not allowed because it would cause a production impact, and also including “don’t cause a production impact” as part of the prompt.
“Giving it that extra feedback has gotten us dramatically better results,” Brandwine said.
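A minimal Python sketch of that idea follows, assuming a hypothetical agent harness that lets you control both the initial prompt and the refusal message. The function names and message wording are illustrative, not anything Amazon has described.

```python
def denial_feedback(action: str, reason: str) -> str:
    """Explain a refusal instead of returning a bare 'permission denied'."""
    return f"You are not allowed to {action} because {reason}."


def build_prompt(task: str, constraints: list[str]) -> str:
    """State the standing constraints up front, alongside the task."""
    lines = [f"Task: {task}", "Constraints:"]
    lines += [f"- {c}" for c in constraints]
    return "\n".join(lines)


prompt = build_prompt(
    "upgrade the orders database",
    ["don't cause a production impact"],
)
feedback = denial_feedback(
    "delete the database",
    "it would cause a production impact",
)
print(prompt)
print(feedback)
```

The point of pairing the two is that the refusal names the same constraint the prompt already stated, so the agent is told what outcome to avoid rather than only which path is blocked.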
Of course, this is not a fail-proof method. “You still need to be careful with agents,” Brandwine told us. “We have millennia of experience with humans. Agentic AI is a very, very new field, we don’t have an intuition for this, and one of the fundamental differences between agents and humans is that humans fear consequences,” such as losing a job or even going to jail. Agents don’t have these fears.
This is where setting permissions on what the agent can and can’t do or access comes in. Much like everything else with AI, it’s nuanced, and it depends on the employee’s role in the company, and the company’s tolerance for risk.
“The person that wants to run the agent wants to give the agent many permissions because that makes the agent more powerful,” Brandwine said. “It could do more things for them, it can recoup more of their time, it can deliver more.”
The security lead, on the other hand, wants to limit an agent’s permissions, and this causes yet more tension between the security and development teams.
There is no one right solution or policy answer to solve this, according to Brandwine. Instead, it involves dynamic policies that set permissions based on the agent’s specific task.
There are some overarching, static guardrails – such as an agent must never perform destructive actions or delete entire servers – and then there are policies underneath that establish the maximum set of privileges that the agent can have.
“Then we’ll have a further scoped-down policy for this action, and there’s various techniques for automatically generating policies based on prompt and the end-user’s intent,” Brandwine said.
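One way to picture the layering is as set arithmetic: a task-scoped policy can only narrow the maximum privilege set, and static guardrails override both. The Python sketch below assumes that simplified model. The action names and the `effective_permissions` helper are hypothetical, and it does not attempt the automatic policy generation Brandwine mentions.

```python
# Static guardrails: actions no agent may ever take, whatever else is granted.
STATIC_DENY = frozenset({"delete_server", "drop_database"})

# Maximum privileges this agent could ever hold (illustrative names only).
MAX_PRIVILEGES = frozenset(
    {"read_db", "write_db", "upgrade_db", "restart_service", "drop_database"}
)


def effective_permissions(task_scope: set[str]) -> frozenset[str]:
    """Narrow the maximum set to the task scope, then apply the guardrails."""
    return (frozenset(task_scope) & MAX_PRIVILEGES) - STATIC_DENY


def is_allowed(action: str, task_scope: set[str]) -> bool:
    return action in effective_permissions(task_scope)


upgrade_scope = {"read_db", "upgrade_db"}
print(is_allowed("upgrade_db", upgrade_scope))
print(is_allowed("restart_service", upgrade_scope))
print(is_allowed("drop_database", {"drop_database"}))
```

In this sketch a destructive action stays blocked even if a task scope requests it, and an action outside the maximum set is never granted, which mirrors the ordering of guardrails, ceiling, and per-task scope described above.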
Even for Amazon, it’s not always easy. “It’s all driven by risk,” he said. “This is a space that’s changing quickly, and so we’re trying to balance the risk of using untried, untested software against the risk of falling behind and not being able to deliver for our customers. As with all such things, it’s complicated.” ®