Instructions designed to guide the behavior of the company’s latest model as it writes code have been revealed to include a line, repeated several times, that specifically forbids it from randomly mentioning an assortment of mythical and real creatures.
“Never talk about goblins, gremlins, raccoons, trolls, ogres, pigeons, or other animals or creatures unless it is absolutely and unambiguously relevant to the user’s query,” read instructions in Codex CLI, a command-line tool for using AI to generate code.
It is unclear why OpenAI felt compelled to spell this out for Codex—or indeed why its models might want to discuss goblins or pigeons in the first place. The company did not immediately respond to a request for comment.
OpenAI’s newest model, GPT-5.5, was released with enhanced coding skills earlier this month. The company is in a fierce race with rivals, especially Anthropic, to deliver cutting-edge AI, and coding has emerged as a killer capability.
In response to a post on X that highlighted the lines, however, some users claimed that OpenAI’s models occasionally become obsessed with goblins and other creatures when used to power OpenClaw, a tool that lets AI take control of a computer and apps running on it in order to do useful things for users.
“I was wondering why my claw suddenly became a goblin with codex 5.5,” one user wrote on X.
“Been using it a lot lately and it actually can’t stop speaking of bugs as ‘gremlins’ and ‘goblins’ it’s hilarious,” posted another.
The discovery quickly became its own meme, inspiring AI-generated scenes of goblins in data centers, and plug-ins for Codex that put it in a playful “goblin mode.”
AI models like GPT-5.5 are trained to predict the word—or code—that should follow a given prompt. These models have become so good at doing this that they appear to exhibit genuine intelligence. But their probabilistic nature means that they can sometimes behave in surprising ways. A model might become more prone to misbehavior when used with an “agentic harness” like OpenClaw that puts lots of additional instructions into prompts, such as facts stored in long-term memory.
OpenAI acquired OpenClaw in February not long after the tool became a viral hit among AI enthusiasts. OpenClaw can use any AI model to automate useful tasks like answering emails or buying things on the web. Users can select any of various personae for their helper, which shapes its behavior and responses.
OpenAI staffers appeared to acknowledge the prohibition. In response to a post highlighting OpenClaw’s goblin tendencies, Nik Pash, who works on Codex, wrote, “This is indeed one of the reasons.”
Even Sam Altman, OpenAI’s CEO, joined in with the memes, posting a screenshot of a prompt for ChatGPT. It read: “Start training GPT-6, you can have the whole cluster. Extra goblins.”
Hundreds of workers in Ireland tasked with refining Meta’s AI models have been told that their jobs are at risk as the company embarks on a sweeping new round of layoffs, according to documents obtained by WIRED.
The affected workers are employed by the Dublin-based firm Covalen, which handles various content moderation and labeling services for Meta.
The workers were informed of the layoffs over a brief video meeting on Monday afternoon and were not allowed to ask questions, according to Nick Bennett, one of the employees on the call. “We had a pretty bad feeling [before the meeting],” he says. “This has happened before.”
In all, more than 700 employees stand to potentially lose their jobs at Covalen, according to an email reviewed by WIRED. Roughly 500 are data annotators. Their job is to check material generated by Meta’s AI models against the company’s rules barring dangerous and illegal content. “It’s essentially training the AI to take over our jobs,” claims another Covalen employee, who asked to remain anonymous for fear of retaliation. “We take actions as the perfect decision for the AI to emulate.”
Sometimes, the work involves cooking up elaborate prompts to try to bypass guardrails meant to prevent models from serving up child sexual abuse material, say, or descriptions of suicide. “It’s quite a grueling job,” claims Bennett. “You spend your whole day pretending to be a pedophile.”
Last week, Meta announced plans to cut one in 10 jobs as part of sweeping layoffs aimed at making the company more efficient. A memo circulated by the company reportedly indicated that layoffs were motivated by a need to increase spending on other aspects of the business. Though the memo did not mention AI, the company recently announced plans to nearly double its spending on the technology. In January, Meta CEO Mark Zuckerberg said, “I think that 2026 is going to be the year that AI starts to dramatically change the way that we work.” In the email reviewed by WIRED, Covalen employees were told only that the layoffs were a result of “reduced demand and operational requirements.”
In a statement, Meta spokesperson Erica Sackin said: “As we shared in March, over the next few years, Meta will be deploying more advanced AI systems to transform our approach to content enforcement and operations across our platforms, so that it delivers the safety and protection people expect. As we do that, we’ll be reducing our reliance on third-party vendors and strengthening our internal systems.”
The latest round of layoffs marks the second time that Covalen has cut staff in recent months. In November, the company announced plans for job cuts (reportedly to number around 400), culminating in a worker strike. Between the two rounds of layoffs, Covalen’s headcount in Dublin is on track to be almost halved, according to the Communications Workers’ Union (CWU), whose members include some Covalen staff.
For affected Covalen workers, the search for new work will be hampered by a six-month “cooldown period,” during which they are unable to apply to a competing Meta vendor, claims the CWU. “It’s undignified, you know,” says the Covalen employee who asked to remain anonymous. “It’s rude.”
Covalen did not immediately respond to a request for comment.
Unions representing the affected employees are pushing for Covalen to enter negotiations over severance terms. They also hope to meet with the Irish government to discuss how AI is impacting workers in the country. “Tech companies are treating the workers whose labor and data helped build AI as disposable,” says Christy Hoffman, general secretary of UNI Global Union. “To fight back, it’s absolutely critical that workers organize and demand notice about the introduction of AI, training linked to employment, and a plan for their futures. Workers should also have the right to refuse to train their AI replacements.”
But some of those caught up in the layoffs are doubtful of their chances of securing stable employment in a labor market being rehewn in real time by AI and the deep-pocketed companies leading its development. “It’s a universal battle between downtrodden white-collar workers and big capital, really,” claims Bennett. “That normally only goes one way.”
Update 4/28/25 3:30pm ET: This story has been updated to include comment from Meta.
Elon Musk and Sam Altman appeared in a federal courtroom together for the first time on Tuesday as they fight over OpenAI’s decade-long evolution and what it means for the company’s future.
The trial in Musk’s lawsuit against Altman could result in financial damages and, more significantly, governance changes at OpenAI that may complicate its plans for an initial public offering as soon as this year.
As the first witness on the stand, Musk immediately sought to frame his case as more than just about OpenAI. Siding with Altman “will give license to looting every charity in America” and shake the “entire foundation of charitable giving,” Musk told a panel of nine jurors advising US District Judge Yvonne Gonzalez Rogers on how to rule.
Musk has been concerned about computers becoming smarter than people “since he was a young man in college,” his attorney Steven Molo told jurors. Molo explained that Musk lobbied governments to pass regulations addressing the prospect of so-called artificial general intelligence, including meeting with then-president Barack Obama in 2015. “But the government was not stepping up,” Molo said. “Elon felt he had to do something.”
Around the same time, Musk met with Altman, a then-30-year-old investor “whom he didn’t know very well,” Molo said. They soon launched OpenAI together as a nonprofit. Google’s unchecked progress on AI development had sparked concerns for both OpenAI cofounders, and they wanted to create a competing lab with a greater focus on safety. “My perspective is [OpenAI] exists because Larry Page called me a speciesist for being pro-humanity,” Musk said, referring to the Google cofounder. “What would be the opposite of Google? An open-source nonprofit.”
While Musk believes AI could cure diseases and generate prosperity for humanity, he also told the court that he thinks the technology could veer off into catastrophic scenarios straight out of science fiction. “It could also kill all of us … the Terminator outcome. I think we want to be in a movie … like Star Trek, not a James Cameron movie,” Musk said. (While Musk has long raised alarms about AI safety, his current firm, xAI, has been criticized by researchers at other AI labs for its “reckless” safety culture.)
As OpenAI began notching some of its own successes, Musk and Altman agreed that a for-profit arm with fixed returns for investors was necessary to raise extraordinary sums of money needed to fund hiring and computing, according to Molo. He compared it to a nonprofit museum that receives some proceeds from a for-profit store. “I was not opposed to there being a small for-profit as long as the tail didn’t wag the dog,” Musk said on the stand.
Musk felt that the approach had gone too far when Microsoft, another defendant in the trial, agreed to invest $10 billion in 2023, and OpenAI increasingly moved intellectual property and staff to the for-profit company. “The museum store sold the Picassos so they were locked up where no one could see them,” Molo said.
OpenAI’s Rebuttal
William Savitt, an attorney for OpenAI, told jurors that OpenAI never promised Musk that it would remain a nonprofit and publish all its code. “The evidence here will show what Musk says happened did not happen,” Savitt said.
He added that Musk knew about plans to raise corporate investment exceeding $10 billion as far back as 2018. Musk even raised concerns about Microsoft’s involvement in a 2020 tweet. But he didn’t file a lawsuit until he founded a competitor, xAI, in 2023.
For context, Tales From ’85 is a spin-off of the original Stranger Things franchise, set in the winter of 1985 in Hawkins, Indiana. It follows the Hawkins Investigators Club as they face paranormal threats in an animated format separate from the live-action series.
What happened at the end of Tales From ’85 season 1?
[Spoiler warning: please skip this section if you have not finished season 1 yet.]
Season 1 threw a lot at the Hawkins Investigators Club as they came face to face with a snow shark and a group of sinister pumpkin monsters called the Gourd Horde.
The real threat, however, was the Horde Queen, a creature that evolved from an Upside Down vine after Hawkins Food Mart clerk Daniel Fischer conducted secret experiments using Mrs. Baxter’s stolen research.
His green serum combined years of botanical science with extracted Upside Down vine DNA, creating something far beyond what he bargained for. The gang managed to stop the Queen from opening a new gate to the Upside Down, with Eleven sealing it shut using the creature’s own body.
But the finale left one ominous image burned into viewers’ minds. In the Upside Down, a stem burst through the Queen’s corpse and unfurled into a glowing flower with the maw of a Demogorgon.
What to expect from Stranger Things: Tales From ’85 season 2?
Season 1 ended with the Hawkins Investigators Club uncovering a genuinely terrifying mystery, and season 2 is picking up that thread with a brand new paranormal threat.
Showrunner Eric Robles confirmed that the glowing flower is the beginning of a whole new mystery. The threat in season 2 apparently emerges from Hawkins’ abandoned silver mines. The mysterious blue flower spotted blooming in the Upside Down at the end of season 1 is also set to play a significant role going forward.
Robles has also been clear that the seasons of the Stranger Things spinoff series are not standalone stories. They are directly connected, which means every detail from season 1 will matter heading into the next chapter. Eleven, Mike, Will, Dustin, Lucas, Max, and Nikki will all be returning.
Why did Netflix renew Stranger Things: Tales From ’85 despite mixed reviews?
Season 1 just dropped a few days ago, and Netflix did not waste any time with the renewal announcement. However, the spinoff received a divisive reception, to put it mildly. It currently holds the lowest Rotten Tomatoes scores of any entry in the Stranger Things franchise, sitting at 63% from critics and 54% from audiences. Meanwhile, on IMDb, the show holds a rating of just 5.7/10.
Common complaints pointed to unlikable characters, particularly newcomer Nikki, and uneven plotting. Despite all of that, Netflix moved ahead with the renewal anyway, and did so just four days after season 1 debuted. The deciding factor appears to have been viewership.
The series pulled in 2.8 million views in its opening weekend and landed at number 7 on Netflix’s global Top 10, also securing a spot in the platform’s top 15 animated series debuts of all time. That kind of traction, regardless of critical reception, was enough to greenlight a second season.
You can now interact with data and create entire dashboards with Gemini in Sheets
Google Meet’s AI summaries now extend to in-person and third-party conferencing platforms
AI has allowed Google to make it even quicker to move from Microsoft 365 to Google Workspace
Google Cloud just held its biggest event yet, but with 260+ announcements across the entire portfolio at Google Cloud Next 2026, many of the smaller but no less important Google Workspace updates flew under the radar.
In a ‘one last thing’-like blog post, the company detailed a number of less headline-worthy improvements it’s rolled out to its Workspace online apps to make Gemini even more useful.
With the changes, Gemini is transitioning from a generative tool that produces new content to an agentic assistant that can interact with the context of your documents and carry out some tasks autonomously.
Google Workspace just got these updates
Starting with one of the most commonly used Workspace apps, Sheets now lets users analyze their data and produce dashboards more conversationally.
The company also claimed 110 million Google Meet attendees have used ‘Take Notes For Me’ in the last month – an 8.5x growth compared with last year – so it’s now making AI-generated summaries and action items available beyond the four walls of Meet, including in-person meetings and calls on other platforms like Zoom and Teams.
And as for organizations, Gemini Enterprise customers will be able to create brand-new documents in line with their brand and requirements with new ‘Canvas Mode’. With a detailed prompt, Gemini can fetch data and context from the web and other Workspace locations to generate full presentations and documents that are as editable as human-created ones.
Finally, Google Cloud's rivalry with Microsoft was on full display at its annual conference. Having previously lodged a complaint against Microsoft's cloud practices, Google now says it's up to 5x quicker for M365 customers to migrate to Workspace thanks to some under-the-hood interoperability improvements like an AI-powered Office macro converter.
There seems to be no ceiling for mechanical keyboards, and the Keychron Q1 Ultra 8K is proof. It features an all-metal build, wireless options, ZMK firmware, 8 kHz polling, and layer after layer of foam.
Keychron continues to pump out new keyboards at a breakneck pace. The Q1 Ultra 8K is the latest model to sit at the top of Keychron’s lineup, replacing the Max.
I’ve reviewed a lot of Keychron’s keyboards, and with the exception of the unique Q3 Pro SE, I believe the Q1 series is my favorite layout. I love that it has all the keys I need while remaining compact yet dense.
Keychron has proven to be remarkably consistent in its products. Each release avoids being merely iterative and justifies its price, which makes these keyboards incredibly difficult to assign a review score.
You'll note that this keyboard, like others we've reviewed, has received 4 or more stars. As with any review on AppleInsider, it is best to read through the text and understand the granular points rather than try to boil the entire thing down to a numerical score.
Enthusiasts will happily pay the high price to get these features and premium build, so let’s dive into why the Q1 Ultra 8K is yet another winner from Keychron.
Keychron Q1 Ultra 8K review: design
Look at the Q1 Ultra 8K straight on and you’ll struggle to find any differentiation from the other Q1 models. There are tiny signs those with a discerning eye will catch, like the different knob, but the biggest tell is in the back.
Keychron Q1 Ultra 8K review: a familiar layout
Keychron has included a fancy “aesthetic PC” back plate. It is a gold-colored plate with celestial designs that really pop.
Of course, you’ll never see this plate during use, but I suppose it is cool that it’s there. As I said, it certainly helps in distinguishing it from other models.
I stuck with my usual preference of dark keycaps with Keychron’s splash of gray and blue. The RGB backlight continues to shine through bottom-facing LEDs with 22+ combinations.
Keychron Q1 Ultra 8K review: a fancy backplate
The Keychron launcher software lets you take things even further. Create your own RGB patterns on a per-key basis.
That means you can color-code select keys or just go all out and make every key a solid color. I prefer a randomized rainbow pattern, but it is cool that you can go to that level of detail.
Like with all Keychron keyboards, you can fully customize the keyboard top to bottom. Take it apart down to the frame and you’ve got a pile of foam, stabilizers, switches, and keycaps.
I emphasized this in the Q1 Max review, and the point still stands: this thing is packed to the gills with foam. Remove what you like to get the right sound and feel out of every keypress.
The layers go like this, starting from the top:
Keycaps
Top Case
Switches
Plate
Sound-absorbing foam for a cleaner typing sound
IXPE Foam to reduce keystroke vibration
PET Film that protects against dust and insulates against shorts
PCB
Latex Bottom Pad that cushions typing while reducing vibration and noise
Bottom Case Acoustic Foam that also cushions typing and reduces vibration and noise
Bottom Case PET Film that adds another layer of dust protection and short insulation
Bottom Case
It’s quite the pile of materials and fittings. I personally like how the keyboard feels and sounds with the Red Silk POM Switches that were included.
The weight and material of Keychron keyboards are a big part of their premium look and feel. I’ve joked about it before, but these keyboards feel like they could double as a home defense tool.
Keychron Q1 Ultra 8K review: features
Keychron has packed the Q1 Ultra 8K with specs and features that will benefit every user. Whether you’re a mechanical keyboard enthusiast, or just looking for a premium keyboard, this one covers all of the bases.
Keychron Q1 Ultra 8K review: a new knob
The feature that gives it its name is an 8K polling rate. It means that the keyboard is sampling every keypress at 8x the speed of other high-end keyboards.
That 8,000 Hz polling rate isn’t on by default. All of Keychron’s 8K keyboards ship with the default polling rate set to 1,000 Hz to ensure wider compatibility.
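To make the jump from 1,000 Hz to 8,000 Hz concrete, here is a quick back-of-the-envelope sketch (my own arithmetic, not Keychron's published figures): at 8 kHz the keyboard reports key state every 0.125 ms instead of every 1 ms, trimming the worst-case sampling delay to well under a millisecond.

```python
# Back-of-the-envelope math: the interval between USB reports at a
# given polling rate, which bounds the worst-case added input delay.

def report_interval_ms(polling_rate_hz):
    """Time between two consecutive reports, in milliseconds."""
    return 1000.0 / polling_rate_hz

for rate_hz in (125, 1000, 8000):
    interval = report_interval_ms(rate_hz)
    # Worst case, a keypress lands just after a report went out and
    # waits one full interval before the host sees it.
    print(f"{rate_hz:>5} Hz -> one report every {interval:.3f} ms")
```

Whether a sub-millisecond difference is perceptible is debatable for typing, but competitive gamers are the audience this spec targets.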
If you’re working on newer Macs and PCs, go ahead and switch over to 8K. This is done via the Keychron launcher tool that’s accessible via the web on a Chrome, Edge, or Opera browser.
Luckily, I keep Chrome installed on my Mac mini for podcasting purposes. A quick switch flip to the 2.4GHz channel and I was in business.
Keychron Q1 Ultra 8K review: Keychron Launcher lets you edit every aspect
I used to spend more time on the keyboard modification software in these reviews, but they’ve become so standard that I don’t feel the need to. Like most mechanical keyboards, and all Keychron keyboards, you can customize it top to bottom using different switches, keycaps, and program each key individually.
If you want to get even more advanced, program multiple typing layers so a button press puts you into a completely different layout. Careful with this, as the physical keyboard doesn’t look any different even if the “G” key is suddenly performing a different action.
Other specs include three pairing methods. Pick from three Bluetooth 5.3 channels, a 2.4 GHz USB dongle, or wired.
Keychron Q1 Ultra 8K review: swap between multiple connection options
Mix and match every connection option to easily switch between five devices at once. Those concerned with latency will use the wired option, but the 2.4 GHz connection is just as solid and both offer the 8K polling rate.
Really, it comes down to your individual setup and needs.
The Q1 Ultra 8K has some incredible battery life thanks to optimizations made via the ZMK firmware. Users that enable the 8K polling feature and connect wirelessly without a backlight can squeeze out 660 hours of battery life.
That backlight is the battery drain, so every level of brightness you enable cuts that battery life down quickly. That said, I've had to charge the internal battery only once in a month, even with the backlight enabled. YMMV.
Using the Keychron Q1 Ultra 8K
I spend a lot of time typing each day, so needless to say, I've logged a lot of hours with the Q1 Ultra 8K. It's an interesting evolution of a keyboard that I've used across multiple generations.
Keychron Q1 Ultra 8K review: still a typing dream
From a general use perspective, beyond the move to a new switch, the typing experience is nearly identical. The Q1 Max has the same internal layers of foam and the case appears to be identical.
I’ve found that I prefer whatever red-equivalent switch is my go-to. It isn’t too loud and is just right in terms of travel.
There really isn’t much else I can add here that I haven’t said time and time again about Keychron keyboards. This is a highly customizable mechanical keyboard that isn’t too flashy out of the box, but lets you get there if you like.
Keychron Q1 Ultra 8K review: peak customization
I'll keep noting, for as long as I remember to, that I wish someone could replicate the signal a Magic Keyboard sends to Apple Vision Pro. I love using third-party keyboards, but it is never not annoying that I can't see the keyboard when fully immersed — only Apple gets that privilege.
The Q1 series might be my favorite layout. It has all of the keys that I’ll need, including a well-labeled function row.
There are even three “bonus keys” on the right side. I'm sure someone on Earth uses them as designed, but the pgup, pgdn, and home keys are totally unnecessary for my workflows. So, they make great reprogrammable keys.
Advancing the spec
The Keychron Q1 Ultra 8K is an iterative upgrade over the Q1 Max. Though that isn’t to say the upgrades aren’t impactful.
Keychron Q1 Ultra 8K review: an iterative but welcome upgrade
What's actually new here is the Keychron MCU chip with 1MB of flash memory. It enables the 8K polling rate.
That Silk POM Switch is a nice addition too. Keychron specifically designed it with high performance in mind for gamers and those demanding precision.
Lay the Max and the Ultra side-by-side and the only notable difference besides potential keycap selection is the knob. If you’re looking for a wholly new experience or design, this isn’t it.
The battery life is about 6x longer in the Q1 Ultra 8K versus the Max. You’re also interfacing through ZMK, which is a newer standard built with wireless keyboards in mind.
Those looking for a premium mechanical keyboard experience with a little extra cash should get the Q1 Ultra 8K. While the Q1 Max is still available, you’re not saving much on that model, so go for the best.
Keychron Q1 Ultra 8K review – Pros
Sturdy design, clacky keys, interesting backplate
Incredible battery life
8K polling if you want it
Incredible range of customization options
Keychron Q1 Ultra 8K review – Cons
Iterative upgrade design wise
8K polling not necessary for everyone
Rating 4 out of 5
As usual, these numeric scores are not representative of the product, but Google demands them. I love the Q1 Ultra 8K and it is a good upgrade over the Q1 Max, but owners of the older model don’t need to rush out for the new one either.
I expect the next iteration will tackle more in the design department, so it’s getting the smallest knock for the lack of design change. Otherwise, this is an expensive, premium keyboard that has amazing specs and is worth buying if you’re in the market for one.
Where to buy the Keychron Q1 Ultra 8K
The Keychron Q1 Ultra 8K wireless mechanical keyboard can be purchased from Keychron directly for $229.99. Amazon also carries the keyboard for the same price.
The AI race lately has felt a bit like a game of tennis: first, Anthropic releases a new, pricey state-of-the-art proprietary model for general users (Claude Opus 4.7), then, a week or so later, its rival OpenAI volleys back with one of its own (GPT-5.5). And all the while, Chinese companies like DeepSeek and even Xiaomi are seeking to appeal to users by playing a different game: nearing the frontier, but with open licensing and far lower costs.
So it’s a big surprise when a new, affordable, highly performant open source contender from the U.S. emerges. Today, we got one from the smaller, lesser-known U.S. AI startup, Poolside, founded in San Francisco in 2023.
The company launched its two new Laguna large language models, both offering affordable intelligence optimized for agentic workflows (AI that does more than chat or generate content; it can write code, use third-party tools, and take actions autonomously). Alongside the models, Poolside released a new coding agent harness called (fittingly) “pool,” and “shimmer,” a new web-based, mobile-optimized agentic coding development and interactive preview environment that lets you write code with the Laguna models on the go.
The new AI models that Poolside released today include:
Laguna M.1: a proprietary 225-billion parameter Mixture of Experts (MoE) model with 23 billion active parameters. This flagship model is optimized for high-consequence enterprise and government environments, designed to solve complex, long-horizon software engineering problems that require maximum reasoning and planning capabilities.
Laguna XS.2: an open, Apache 2.0-licensed 33-billion parameter MoE with 3 billion active parameters. Engineered for efficiency and community innovation, this model is designed for local agentic coding tasks and provides a versatile foundation for developers looking to fine-tune, quantize, or serve powerful agents on a single GPU. In other words, developers can download and run Laguna XS.2 on their desktop or even laptop computers without an internet connection — completely private and secured.
Notably, as mentioned above, only the smaller of the two models, XS.2, is available now under an open source Apache 2.0 license (on Hugging Face) — yet Poolside is temporarily offering even the larger M.1 for free through its API and third-party distribution partners OpenRouter, Ollama, and Baseten, making it easy for developers to test it out.
Also noteworthy: the two new Lagunas were trained from scratch — not fine-tuned/post-trained base models from Chinese giant Alibaba’s Qwen series like some other U.S. labs have pursued lately (*cough cough* Cursor *cough).
As Poolside wrote in a blog post today, it’s spent the last few years “focused on serving our government and public sector clients with capable models deployable into the highest-security environments,” yet is now going open source “to support builders and the wider research community.”
When I asked on X why government agencies would seek to use Poolside instead of leading proprietary U.S. labs like Anthropic, OpenAI and Google, Poolside post-training engineer George Grigorev told me in a reply that: “we think that we can be faster to deploy our models to enterprise customers, and we can literally ship weights in fully isolated environments on-prem, so it can work offline. which might be critical for gov/public sectors 🙂 but ofc anthropic enterprise is hard to beat”
How Poolside’s Laguna M.1 and Laguna XS.2 were trained
Poolside constructs its AI models within a specialized digital environment called the “Model Factory.”
At the heart of this process is Titan, the company's powerful internal software that serves as the “furnace” for training. To help the AI learn as efficiently as possible, Poolside uses an optimizer called Muon.
Think of Muon as a high-speed tutor; it helps the model master new information approximately 15% faster than standard industry methods, a critical gain when training at the 30-trillion-token scale.
It achieves this by ensuring that every update to the model’s “brain” is mathematically balanced and pointing in the right direction, which prevents the AI from getting confused or stuck during its intensive training sessions.
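Poolside has not published its training code, but Muon itself is publicly documented: before applying a weight-matrix update, it orthogonalizes the update with a Newton–Schulz iteration so that every direction of the update carries roughly equal weight. The sketch below uses the classical cubic Newton–Schulz iteration in pure Python for clarity; this is an illustrative assumption, as production Muon implementations use a tuned quintic polynomial and run on GPUs.

```python
# A pure-Python sketch of the orthogonalization step at the heart of
# Muon. Assumption: the classical cubic Newton-Schulz iteration is
# used here for clarity; production Muon uses a tuned quintic variant.

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def orthogonalize(G, steps=12):
    """Return a semi-orthogonal matrix with the same 'direction' as G.

    For a wide matrix G (rows <= cols), the result X satisfies
    X @ X.T ~= I: every singular value is pushed toward 1, so no
    single direction dominates the weight update.
    """
    # Scale so the spectral norm is <= 1 (the Frobenius norm bounds
    # it), which guarantees the cubic iteration converges.
    norm = sum(x * x for row in G for x in row) ** 0.5
    X = [[x / norm for x in row] for row in G]
    for _ in range(steps):
        # X <- 1.5*X - 0.5*(X X^T) X  drives singular values to 1.
        XXtX = matmul(matmul(X, transpose(X)), X)
        X = [[1.5 * x - 0.5 * y for x, y in zip(rx, ry)]
             for rx, ry in zip(X, XXtX)]
    return X

# Example: a small 2x3 "gradient" becomes semi-orthogonal.
G = [[1.0, 2.0, 0.5],
     [0.0, 1.0, 1.5]]
X = orthogonalize(G)
I_approx = matmul(X, transpose(X))  # close to the 2x2 identity
```

The same idea scales to the full weight matrices of a transformer, applied independently per layer, which is where the reported training speedup comes from.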
The information used to train these models—a staggering 30 trillion “tokens” or pieces of data—is carefully selected using a system called AutoMixer.
Rather than just feeding the AI everything it finds on the internet, AutoMixer leverages a “swarm” of sixty proxy models trained on different data mixes to determine which combination of code, math, and general web data produces the best reasoning capabilities.
In this way, it acts like a master chef, scientifically testing thousands of different “recipes” to find the perfect balance of computer code, mathematics, and general knowledge.
While much of this data comes from the public web, about 13% of it is “synthetic data”. This is high-quality, custom-made practice material created by other AIs to teach the models specific skills that are difficult to find in the real world.
Once the model has finished its basic “schooling,” it enters a virtual gym for Reinforcement Learning. In this stage, the AI practices solving real software engineering problems in a safe, isolated digital playground. It learns through trial and error, receiving a “reward” or positive signal every time it successfully fixes a bug or writes a working piece of code. This constant cycle of practice and feedback is what transforms the AI from a simple text generator into a capable “agent” that can plan and execute complex, multi-step projects just like a human software engineer.
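Poolside's RL environment is proprietary, but the reward loop described above can be sketched in miniature: candidate code is executed against a test suite, and the pass rate becomes the reward signal. Everything below (the toy `add` spec and the candidate patches) is a hypothetical illustration, not Poolside's actual setup.

```python
# Toy illustration of reinforcement learning from verifiable rewards:
# each candidate "patch" is run against a test suite, and the fraction
# of tests it passes becomes its reward. All names are hypothetical.

def run_tests(candidate_fn):
    """Reward = fraction of hidden test cases the candidate passes."""
    cases = [((2, 3), 5), ((0, 0), 0), ((-1, 4), 3)]  # spec: add(a, b)
    passed = 0
    for args, expected in cases:
        try:
            if candidate_fn(*args) == expected:
                passed += 1
        except Exception:
            pass  # a crashing patch earns no reward
    return passed / len(cases)

# Three "policy samples": two buggy patches and one correct one.
candidates = {
    "off_by_one": lambda a, b: a + b + 1,
    "wrong_op":   lambda a, b: a * b,
    "correct":    lambda a, b: a + b,
}

rewards = {name: run_tests(fn) for name, fn in candidates.items()}
best = max(rewards, key=rewards.get)
# A real RL trainer would reinforce the policy toward high-reward
# samples rather than simply selecting the best one.
```

The key property is that the reward is verifiable by execution rather than judged by another model, which keeps the training signal grounded in whether the code actually works.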
While M.1 represents the peak of Poolside’s current research, the smaller Laguna XS.2 may be the more disruptive entry.
At just 33 billion total parameters (3 billion activated), XS.2 is a “second-generation” MoE model that incorporates everything the team learned from training M.1.
Benchmarks show Poolside’s Laguna models punch far above their weight class
Laguna M.1 reached 46.9% on SWE-bench Pro—a benchmark designed to test an AI’s ability to solve real-world software issues—nearing the performance of the far larger Qwen-3.5 and DeepSeek V4-Flash.
Despite being a fraction of the size, Laguna XS.2 achieves a 44.5% score on SWE-bench Pro, nearly matching its larger sibling.
On the SWE-bench Verified track, M.1 scored 72.5%, outperforming the dense Devstral 2 (72.2%) but trailing Claude Sonnet 4.6, which leads the category at 79.6%.
These results highlight M.1’s specialization in long-horizon software tasks, particularly those involving complex planning across interconnected files.
The smaller Laguna XS.2 exhibits remarkable efficiency, nearly matching the performance of its much larger sibling on high-consequence tasks. Despite having only 3B active parameters, XS.2 surpasses Claude Haiku 4.5 (39.5%) and the significantly larger Gemma 4 31B dense model (35.7%) on SWE-bench Pro.
In terminal-based reasoning, XS.2’s 30.1% on Terminal-Bench 2.0 also edges out Haiku 4.5’s 29.8%, although it remains behind specialized “nano” models such as GPT-5.4 Nano, which reached 46.3% on the same benchmark.
Collectively, these benchmarks suggest that Poolside’s focus on agentic RL and synthetic data curation has allowed its smaller models to “punch up” into weight classes typically reserved for far denser architectures.
While top-tier proprietary models like Claude Sonnet 4.6 maintain a lead in overall success rates, the Laguna family—particularly the open-weight XS.2—offers a competitive alternative for developers who prioritize local execution and customizable agent workflows.
All benchmarking was conducted using the Harbor Framework with sandboxed execution, ensuring that the results reflect the models’ ability to function in realistic, resource-constrained environments.
Running Laguna XS.2 locally
To run the Laguna XS.2 (33B) model locally, your hardware must accommodate its 33 billion total parameters. On Apple Silicon, the baseline requirement is 36 GB of unified memory.
For PC and Linux users, while the standard weights would typically require over 60 GB of VRAM, the model’s support for 4-bit quantization (Q4) allows it to run on consumer-grade GPUs with 24 GB to 32 GB of VRAM, such as the newly released RTX 5090.
Storage is also a factor; you should reserve at least 70 GB for the full model or roughly 20–35 GB for a compressed version suitable for local “agent” tasks.
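These figures follow from simple arithmetic on parameter count and bits per weight. The sketch below reproduces them; the 4.5 bits-per-weight figure for Q4 (quantized weights plus scaling overhead) is an assumption, and real deployments also need headroom for the KV cache and activations.

```python
def model_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Rough weight-storage estimate in decimal gigabytes."""
    return params_billions * 1e9 * bits_per_weight / 8 / 1e9

fp16_gb = model_memory_gb(33, 16)   # ~66 GB: why full weights need >60 GB
q4_gb = model_memory_gb(33, 4.5)    # ~18.6 GB: why Q4 fits a 24 GB card
```

The same arithmetic explains the 36 GB unified-memory baseline on Apple Silicon: a mid-precision quantization of 33B parameters plus runtime overhead lands comfortably under that figure.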
For the most seamless experience, Poolside recommends using Ollama or its own terminal-based agent, pool, both designed to manage the model’s native reasoning and tool-calling capabilities on consumer hardware.
You can find the full technical requirements, including specific quantization configurations and code execution sandboxing details, on the official Hugging Face model page and the Poolside release blog. Some sample suggested hardware is listed below:
Mac
MacBook Pro (14-inch or 16-inch): Look for models equipped with the M5 Max chip, which starts at 36 GB of unified memory. While the M5 Pro is available, you would need to configure it beyond its base memory to meet the 36 GB threshold.
Mac Studio / Mac Mini: A Mac Mini (M4 or M5 Pro) configured with at least 48 GB or 64 GB of RAM is an excellent desktop alternative.
Avoid the “MacBook Neo”: this model is not suitable for running Laguna XS.2. Released in early 2026 as a budget-friendly option, the MacBook Neo is capped at 8 GB of non-upgradable memory, which is insufficient for a 33B-parameter model.
PC
Single-GPU Setup: The NVIDIA GeForce RTX 5090 is the premier choice for 2026, offering 32 GB of GDDR7 VRAM, which can handle the Laguna XS.2 at high speeds (approximately 45 tokens/sec) using Q4 quantization.
Pro-Grade Setup: For professional developers running complex, long-horizon agents, the RTX PRO 6000 Blackwell (96 GB VRAM) or a dual RTX 5090 configuration allows the model to run without any compression loss.
Minimum PC Spec: An RTX 4090 (24 GB) can run the model with heavier quantization, though performance may be slower during complex reasoning tasks.
pool (agent) and shimmer (IDE)
Models are only as useful as the environments they inhabit, and Poolside has released two “preview” products to house the Laguna series: pool and shimmer.
pool is a terminal-based coding agent designed for the developer’s local environment. It acts as an Agent Client Protocol (ACP) server, the same harness the team uses internally for reinforcement learning (RL) training.
By bringing the researchers’ own tools to the general public, Poolside is effectively inviting the developer community to participate in the “real-world gym” that trains their future models.
Shimmer represents a vision for the cloud-native future of development. It is an instant-on Virtual Machine (VM) sandbox where developers can iterate on web apps, APIs, and CLIs in seconds.
Unlike traditional integrated development environments (IDEs) such as Microsoft Visual Studio, shimmer integrates the Poolside Agent directly into the workspace, allowing it to push changes to GitHub or import existing repositories with ease.
Perhaps the most surprising feature of shimmer is its portability. Poolside Founding Designer Alasdair Monk shared a demonstration showing shimmer running entirely on a smartphone.
In the demo, a split-screen interface shows the Poolside Agent generating a “Happy New Year 2026!” animation while a dev environment runs below.
As Monk noted, it offers an instant-on VM with Poolside Agent in split screen and a full dev environment on a mobile device.
This suggests a future where high-consequence engineering isn’t tethered to a desktop, but can happen wherever an engineer has a screen.
Why release Laguna XS.2 as Apache 2.0 open weights?
The most significant strategic move in this release is the licensing of Laguna XS.2. Poolside has released the weights of XS.2 under the Apache 2.0 license.
This is a highly permissive license that allows users to use, distribute, and modify the software for any purpose, including commercial use, without royalties. This is a stark contrast to the “closed” models of many competitors or even the more restrictive “open-ish” licenses used by some other labs.
Poolside’s leadership is explicit about why they chose this path. Poolside’s blog post states its conviction that “the West needs strong open-weight models” and that releasing the weights is the fastest way for the team to improve their work through community evaluation and fine-tuning.
By putting the weights of a highly capable, 33B-parameter agentic model in the hands of researchers and startups, Poolside is positioning itself as a cornerstone of the open-weight AI ecosystem.
While Laguna M.1 remains primarily behind an API, the open release of XS.2 ensures that Poolside’s technology will be baked into the next generation of third-party tools.
Poolside’s philosophy and approach
The core thesis behind Poolside’s work is that software development serves as the ultimate proxy for general intelligence.
Creating software requires long-horizon planning, complex reasoning, and the ability to manipulate abstract systems—all traits central to human cognition. While most current AI “agents” are restricted to tool-calling via pre-defined interfaces, Poolside’s agents are designed to write and execute their own code to solve problems.
This shift from using tools to building systems marks a fundamental evolution in how AI interacts with the digital world.
The team of roughly 60 people in the Applied Research organization spent three years and conducted tens of thousands of experiments to reach this point. Their vision of AGI is not just about intelligence, but about “abundance for humanity”.
By focusing on software engineering—a domain with verifiable rewards like test passes and compilation results—they have created a self-improving feedback loop. As the team puts it, they are building a “fusion reactor” for data: extracting every last drop of intelligence from existing human knowledge while using RL to harvest the “wind energy” of new, fresh experiences.
Poolside’s journey is just beginning, but the Laguna release sets a high bar for what “agentic” AI should look like in 2026. By combining frontier-level performance with a commitment to open weights and novel developer surfaces, they are charting a path to AGI that is as much about the way we build as it is about what we build.
For the enterprise and the individual developer alike, the message is clear: the future of work is agentic, and the language of that future is code.
Training AI reasoning models demands resources that most enterprise teams do not have. Engineering teams are often forced to choose between distilling knowledge from large, expensive models or relying on reinforcement learning techniques that provide sparse feedback.
Researchers at JD.com and several academic institutions recently introduced a new training paradigm that sidesteps this dilemma. The technique, called Reinforcement Learning with Self-Distillation (RLSD), combines the reliable performance tracking of reinforcement learning with the granular feedback of self-distillation.
Experiments indicate that models trained with RLSD outperform those built on classic distillation and reinforcement learning algorithms. For enterprise teams, this approach lowers the technical and financial barriers to building custom reasoning models tailored to specific business logic.
The problem with training reasoning models
The standard method for training reasoning models is Reinforcement Learning with Verifiable Rewards (RLVR). In this paradigm, the model learns through trial and error, guided by a final outcome from its environment. An automated verifier checks if the model’s answer is right or wrong, providing a binary reward, such as a 0 or 1.
Reinforcement learning with verifiable rewards (RLVR)
RLVR suffers from sparse and uniform feedback. “Standard GRPO has a signal density problem,” Chenxu Yang, co-author of the paper, told VentureBeat. “A multi-thousand-token reasoning trace gets a single binary reward, and every token inside that trace receives identical credit, whether it’s a pivotal logical step or a throwaway phrase.” Consequently, the model never learns which intermediate steps led to its success or failure.
On-Policy Distillation (OPD) takes a different approach. Instead of waiting for a final outcome, developers pair a smaller student model with a larger, more capable teacher model. For each training example, the student compares its response to that of the teacher token by token. This provides the student with granular feedback on the entire reasoning chain and response-generation process.
Deploying and running a separate, massive teacher model alongside the student throughout the entire training process incurs massive computational overhead. “You have to keep a larger teacher model resident throughout training, which roughly doubles your GPU footprint,” Yang said. Furthermore, the teacher and student models must share the exact same vocabulary structure, which according to Yang, “quietly rules out most cross-architecture, cross-modality, or multilingual setups that enterprises actually run.”
On-policy distillation (OPD)
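Schematically, the token-by-token comparison at the heart of OPD looks like the following: for every token the student actually sampled, compare the student’s and the teacher’s log-probabilities. This is a generic sampled reverse-KL-style signal, not the exact loss of any particular lab, and the logits here are placeholders for real model outputs.

```python
import numpy as np

def log_softmax(logits):
    """Numerically stable log-softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))

def opd_token_feedback(student_logits, teacher_logits, token_ids):
    """Per-token signal: student logprob minus teacher logprob for each
    token the student sampled. Positive values mark tokens the teacher
    finds less likely than the student does."""
    s = log_softmax(student_logits)   # [seq, vocab]
    t = log_softmax(teacher_logits)   # [seq, vocab]
    idx = np.arange(len(token_ids))
    return s[idx, token_ids] - t[idx, token_ids]
```

Averaging this signal over the sequence estimates the reverse KL between student and teacher on the student’s own trajectory, which is what gives OPD its dense, per-token feedback.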
The promise and failure of self-distillation
On-Policy Self-Distillation (OPSD) emerged as a solution designed to overcome the shortcomings of the other two approaches. In OPSD, the same model plays the role of both the student and the teacher.
During training, the student receives a standard prompt while the teacher receives privileged information, such as a verified, step-by-step answer key. This well-informed teacher version of the model then evaluates the student version, providing token-by-token feedback as the student tries to solve the problem using only the standard prompt.
OPSD appears to be the perfect compromise for an enterprise budget. It delivers the granular, step-by-step guidance of OPD. Because it eliminates the need for an external teacher model, it operates with the high computational efficiency and low cost of RLVR, only requiring an extra forward pass for the teacher.
However, the researchers found that OPSD suffers from a phenomenon called “privileged information leakage.”
“The objective is structurally ill-posed,” Yang said. “There’s an irreducible mutual-information gap that the student can never close… When self-distillation is set up as distribution matching, the student is asked to imitate the teacher’s full output distribution under privileged context.”
On-policy self-distillation (OPSD)
Because the teacher evaluates the student based on a hidden answer key, the training objective forces the student model to learn the teacher’s exact phrasing or steps instead of the underlying reasoning logic. As a result, the student model starts hallucinating references to an invisible solution that it will not have access to in a real-world deployment.
In practice, OPSD models show a rapid spike in performance early in training, but their reasoning capabilities soon plateau and progressively degrade over time.
Decoupling direction from magnitude with RLSD
The researchers behind RLSD realized that the signals governing how a model updates its parameters have fundamentally asymmetric requirements. They identified that the signal dictating the direction of the update (i.e., whether to reinforce or penalize a behavior) can be sparse, but must be perfectly reliable, because pointing the model in the wrong direction damages its reasoning policy.
On the other hand, the signal dictating the magnitude of the update (i.e., how much relative credit or blame a specific step deserves) benefits from being extremely dense to enable fine-grained, step-by-step corrections.
RLSD builds on this principle by decoupling the update direction from the update magnitude. The framework lets the verifiable environmental feedback from the RLVR signal strictly determine the direction of learning. The model only receives overall reinforcement if the final answer is objectively correct.
Reinforcement learning with self-distillation (RLSD) (source: arXiv)
The self-teacher is stripped of its power to dictate what the model should generate. Instead, the teacher’s token-by-token assessment is repurposed to determine the magnitude of the update. It simply distributes the total credit or blame across the individual steps of the model’s reasoning path.
This alters how the model learns compared to the classic OPSD paradigm. In standard OPSD, the training objective acts like behavioral cloning, where the model is forced to directly copy the exact wording and phrasing of the teacher. This causes the student to hallucinate and leak references to data it does not have.
Instead of forcing the model to copy a hidden solution, RLSD provides a natural and virtually cost-free source of per-token credit information.
“The intuition: we’re not teaching the model to reason like the teacher,” Yang said. “We’re telling the model, on the path it chose, which of its own tokens were actually doing the work. The model’s exploration distribution stays its own. Only the credit allocation gets sharpened.”
If a specific deduction strongly supports the correct outcome, it receives a higher score. If it is just a useless filler word, it receives a baseline score. RLSD eliminates the need to train complex auxiliary reward networks, manually annotate step-by-step data, or maintain massive external teacher models.
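Read as code, the decoupling is compact: the verifier’s binary outcome fixes the sign of every token’s advantage, and normalized per-token teacher scores reshape only its magnitude. This is a schematic reading of the description above, with the teacher scores as a placeholder input; note that uniform scores recover GRPO-style uniform credit.

```python
import numpy as np

def rlsd_token_advantages(final_reward, teacher_token_scores):
    """final_reward: scalar from the verifier (e.g. +1 correct, -1 wrong).
    teacher_token_scores: nonnegative per-token credit from the privileged
    self-teacher pass. Output: per-token advantages whose sign comes only
    from the verifier and whose shape comes only from the teacher."""
    w = np.asarray(teacher_token_scores, dtype=float)
    w = w / w.sum() * len(w)  # mean 1, so uniform scores give uniform credit
    return final_reward * w
```

With `final_reward = +1` and scores `[1, 1, 2]`, the decisive token receives 1.5x the baseline credit and the filler tokens 0.75x; flip the reward to -1 and that same token absorbs the heaviest penalty.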
Putting RLSD to the test
To test RLSD, the researchers trained the open-weight Qwen3-VL-8B vision-language model and evaluated it on several visual reasoning benchmarks. These included MMMU for college-level multi-discipline questions, MathVista, MathVision, WeMath, and ZeroBench, a stress-test benchmark explicitly designed to be nearly impossible for current frontier models.
They compared the RLSD model against the base model with no post-training, standard RLVR via the GRPO algorithm, standard OPSD, and a hybrid combination of the two.
RLSD significantly outperformed every other method, achieving the highest average accuracy of 56.18% across all five benchmarks. It beat the base model by 4.69% and outperformed standard RLVR by 2.32%. The gains were most pronounced in complex mathematical reasoning tasks, where RLSD outperformed standard RLVR by 3.91% on the MathVision benchmark.
RLSD outperforms other techniques on key benchmarks (source: arXiv)
Beyond accuracy, the framework offers massive efficiency gains. “Concretely, RLSD at 200 training steps already beats GRPO trained for 400 steps, so roughly 2x convergence speedup,” Yang said. “Cost-wise, the only overhead beyond a normal GRPO pipeline is one extra forward pass per response to grab teacher logits. Compared to rollout generation… that’s basically free.”
Unlike OPSD, which saw performance spike and then completely collapse due to information leakage, RLSD maintained long-term training stability and converged on a higher performance ceiling than standard methods.
The qualitative findings highlight how the model alters its learning behavior. For example, in a complex visual counting task, standard RLVR looks at the final correct answer and gives the entire paragraph of reasoning tokens the same reward. RLSD surgically applies rewards to the specific mathematical subtraction steps that solved the problem, while actively down-weighting generic filler text like “Looking at the image, I see…”.
In another example, the model performed an incorrect math derivation based on a bar chart. Instead of labeling the whole response as a failure, RLSD concentrated the heaviest penalty on the exact point where the model misread a relationship from the chart. It remained neutral on the rest of the logical setup, recognizing that the initial framework was valid.
This is particularly important for messy, real-world enterprise use cases. If a model makes a mistake analyzing a 50-page quarterly earnings report, developers do not want it to unlearn its entire analytical framework. They just want it to fix the specific assumption it got wrong. RLSD allows the model to learn exactly which logical leaps are valuable and which are flawed, token by token. Because RLSD does this by repurposing the model itself, it provides models with granular reasoning capabilities while keeping the costs of training reasonable.
How enterprises can get started
For data engineers and AI orchestration teams, integrating RLSD is straightforward, but it requires the right setup. The most critical requirement is a verifiable reward signal, such as code compilers, math checkers, SQL execution, or schema validators. “Tasks without verifiable reward (open-ended dialogue, brand-voice writing) belong in preference-based pipelines,” Yang said.
However, RLSD is highly flexible regarding the privileged information it requires. While OPSD structurally requires full intermediate reasoning traces, forcing enterprises to either pay annotators or distill from a frontier model, RLSD does not.
“If you have full verified reasoning traces, great, RLSD will use them,” Yang said. “If all you have is the ground-truth final answer, that also works… OPSD doesn’t have this flexibility.”
Integrating the technique into existing open-source multimodal RL frameworks like veRL or EasyR1 is lightweight. According to Yang, it requires no framework rewrite and slots right into the standard stack. The code swap involves changing only a few dozen lines to adjust the GRPO objective and sync the teacher with the student.
Looking ahead, RLSD offers a powerful way for enterprises to maximize their existing internal assets.
“The proprietary data enterprises hold inside their perimeter (compliance manuals, internal documentation, historical tickets, verified code snippets) is essentially free privileged information,” Yang concluded. “RLSD lets enterprises feed this kind of data straight in as privileged context, which sharpens the learning signal on smaller models without needing an external teacher and without sending anything outside the network.”
Sony is expanding its INZONE lineup with the new INZONE H6 Air, a wired open-back gaming headset built for PC and PlayStation users who want a more natural, spacious presentation than closed-back designs typically deliver. It joins the existing H5 and H3 models, both closed-back, and signals a broader push by Sony to cover more listening preferences inside a gaming category that continues to grow at scale.
That push isn’t happening in a vacuum. Sony’s acquisition of Audeze came after the strong reception of the Maxwell wireless headset, and the recent Maxwell 2 only reinforces the point. Gaming audio has become a serious battleground. Between Sony and Microsoft, it’s a trench war for the same customer, and products like the INZONE H6 Air are clearly part of the strategy to tighten that grip.
The INZONE H6 Air brings an open-back acoustic design to Sony’s gaming headset lineup, aiming for a more natural and spacious sound field than the sealed approach used by its siblings. Sony pairs that structure with custom drivers and integrated back ducts to better manage airflow and low-frequency response, with the goal of maintaining control while preserving spatial cues.
That matters in practice. Open-back designs tend to trade isolation for positional accuracy, and Sony is clearly leaning into that balance here. The H6 Air is tuned to support spatial audio processing so players can more easily track movement and environmental detail in game, rather than just pushing volume or bass for effect.
Construction
The INZONE H6 Air uses an aluminum construction to keep weight down to approximately 199 grams without the detachable microphone and cable, making it the lightest headset in Sony’s INZONE lineup. The design also incorporates the spring hinge headband used in the Sony INZONE H9 II, which allows for a more compact frame while maintaining fit and stability.
The low weight and flexible headband structure are intended to improve long-session comfort, reducing pressure without significantly affecting durability or support.
Open-back
With its open-back design, the INZONE H6 Air is intended to create a more natural sound field that places greater emphasis on spatial accuracy rather than isolation. By leaving the rear of the driver unobstructed, the design reduces internal reflections inside the earcup, which can help preserve detail and improve the sense of space.
The goal is more precise sound field reproduction in line with how game audio is mixed, allowing players to better perceive directionality and environmental cues without the coloration that can come from a fully enclosed housing.
Drivers
The H6 Air uses 40 mm drivers that draw on design elements from Sony’s MDR-MV1, adapted here for gaming use. Sony also incorporates back ducts into the driver assembly to help manage airflow and support low-frequency control, while maintaining separation between bass and midrange.
The result is a presentation focused on clarity and spatial accuracy, which can help with positional cues in games where directionality and environmental detail matter.
Mic
The INZONE H6 Air includes a detachable cardioid microphone designed for focused voice capture. The boom is positioned toward the user’s mouth to reduce pickup of off-axis noise, and the flexible arm allows for adjustment while holding its position during use.
Tuning
The H6 Air has been tuned specifically for role-playing games (RPGs) and adventure titles, enhancing clarity, depth, and environmental detail to better reflect how the audio is intended to be heard. The tuning can be accessed through Sony’s INZONE Hub via the USB-C Audio Box, a compact digital-to-analog converter (DAC) that connects to PCs or consoles over USB-C and features a 3.5mm input for the headphones. The box supports 360 Spatial Sound for Gaming, 7.1ch virtual surround sound, and custom EQ settings via the INZONE Hub.
Sony’s INZONE H6 Air adds something the lineup didn’t have before: an open-back option aimed at players who care more about spatial accuracy and a less enclosed presentation than isolation. It’s also Sony’s lightest full-size gaming headset to date and one of the few in this category to pair a traditional wired design with a bundled USB-C audio interface for software control through the INZONE Hub. That combination, open-back acoustics plus a PC-friendly control box, gives it a different angle than the closed-back H5 and H9 II.
What it doesn’t offer is just as clear. There’s no wireless option, no onboard battery features, and no noise isolation—by design. The “Air” label may suggest mobility, but this is a desk-bound headset that depends on its wired connection and external audio box. If you game in a noisy environment or want a single headset for commuting and play, this isn’t built for that.
Competition is crowded. On the open-back side, options are limited but growing, while closed-back heavyweights like the Audeze Maxwell (and its newer iterations) set a high bar for wireless performance. Brands like ASUS ROG, SteelSeries, and Razer continue to push feature-rich headsets, while wireless gaming earbuds from Cleer Audio and Final Audio offer a very different form factor for the same audience.
Who should consider the H6 Air? PC and PlayStation users who play in quieter spaces and want a more open, speaker-like presentation with reliable wired performance. If positional accuracy, comfort, and long-session usability matter more than isolation or portability, this is where the H6 Air fits. If you need flexibility, travel use, or a single do-it-all headset, Sony already has other options, and so does everyone else.
There is no confirmed timeline for a launch date yet, Revolut said.
Revolut is piloting a physical store in Barcelona in its latest attempt to compete with traditional banks.
The UK fintech is stressing that this will not be a traditional bank branch, but rather a “new format built for how modern customers engage with brands today”. With this “high-visibility, immersive space”, Revolut hopes to make fintech more accessible to the general public.
“This is a new physical concept space where people can experience Revolut products and services in-person, receive support, discover features and engage with the brand more tangibly. At our scale, physical presence builds trust and visibility,” a company spokesperson told SiliconRepublic.com.
Spain is one of Revolut’s key strategic hubs in Europe with more than 6m customers. The pilot is still in the early stages with no confirmed timeline on when it would open.
“We chose Barcelona as the place to pilot our physical stores because it combines local density, global relevance, tourism and innovation,” a spokesperson told Euronews today (28 April). Plans for any other future physical stores will depend on the success of the pilot project.
Revolut’s plans for a physical store come at a time when traditional bank branches are closing in droves.
Around 6,000 commercial bank branches closed down in the US over the last five years. And according to a 2025 report by the American Bankers Association, only 9pc of customers name brick-and-mortar branches as their preferred method of banking.
Meanwhile, the UK’s Lloyds Banking Group closed down nearly 100 branches this February, and Santander said it would shut 44 branches.
However, Barclays, which has shut nearly 80pc of its branches since 2019, is now planning to open new ones. “I truly believe that the combination of great digital and great human touch is the future of banking,” the bank’s UK CEO Vim Maru said earlier this month.
Revolut, meanwhile, recently expanded operations to Mexico, opened a new global headquarters in London and secured a payments licence in India. The fintech hopes that this continued expansion can help it reach 100m customers by mid-2027.
‘The connective tissue between your data, your people, and your goals’: Google Cloud positions Gemini Enterprise as the one-stop shop for all your agentic affairs
The new Gemini Enterprise Agent Platform is an end-to-end building and deployment tool
Google’s clearly committed to interoperability with third-party model and tool support
Even non-technical workers should be able to build their own AI agents
Google Cloud has unveiled the latest version of Gemini Enterprise, which has evolved into a single interface where users can interact with their AI agents just as they would their Workspace apps.
Core to the announcement is the brand-new Gemini Enterprise Agent Platform, described as an end-to-end development platform for building, deploying and managing agents at scale.
The platform is designed to be as simple to interact with as the rest of the Google Workspace suite, and interoperability, a core message at Google Cloud Next 2026, is pivotal to how the new Agent Platform works.
Google wants Gemini interactions and management to be as easy as possible
Google described the platform as a model-agnostic ecosystem that lets users either access Google’s own models or third-party alternatives for maximum flexibility.
Agents can also share context across systems, apps and workflows to help make them more effective, with the Gemini Enterprise platform acting as a central monitoring and auditing tool.
Gemini Enterprise Senior Director of Product Maryam Gholami also noted that companies have shifted from generative AI and agentic pilots to full-scale agentic deployments, which demand always-on automation.
“Companies are ready to build their agentic task force, but this demands doing so within a secure and governed environment,” Gholami shared.
Described by Google Cloud CEO Thomas Kurian as “the primary environment where your business actually operates”, the Gemini Enterprise app has also been reshaped with an improved Agent Designer, so non-technical workers can create workflows with reusable Skills using natural-language prompts.
A screenshot of the updated interface shows a visual, branch-based builder that can handle ‘if this, then that’ type splits and human-in-the-loop checkpoints where approvals may be required.
By unifying the whole agentic AI stack, Google wants to be much more than a tool provider, pushing back against rivals with a fully end-to-end management and deployment platform.