Tech

Enterprise MCP adoption is outpacing security controls

AI agents now carry more access and more connections to enterprise systems than any other software in the environment. That makes them a bigger attack surface than anything security teams have had to govern before, and the industry doesn’t yet have a framework for it. “If that attack vector gets utilized, it can result in a data breach, or even worse,” said Spiros Xanthos, founder and CEO of Resolve AI, speaking at a recent VentureBeat AI Impact Series event.

Traditional security frameworks are built around human interactions. There’s not yet an agreed-upon construct for AI agents that have personas and can work autonomously, noted Jon Aniano, SVP of product and CRM applications at Zendesk, at the same event. Agentic AI is moving faster than enterprises can build guardrails — and Model Context Protocol (MCP), while decreasing integration complexity, is making the problem worse.

“Right now it’s an unsolved problem because it’s the wild, wild West,” Aniano said. “We don’t even have a defined technical agent-to-agent protocol that all companies agree on. How do you balance user expectations versus what keeps your platform safe?”

MCP still “extremely permissive”

Enterprises are increasingly hooking into MCP servers because they simplify integration between agents, tools and data. However, MCP servers tend to be “extremely permissive,” he said.

They are “actually probably worse than an API,” he contended, because APIs at least have more controls in place to impose upon agents.

Today’s agents are acting on behalf of humans based on explicit permissions, thus establishing human accountability. “But you might have tens, hundreds of agents in the future with their own identity, their own access,” said Xanthos. “It becomes a very complex matrix.”

Even as his startup is developing autonomous AI agents for site reliability engineering (SRE) and system management, he acknowledged that the industry “completely lacks the framework” for autonomous agents.

“It’s completely on us and to anybody who builds agents to figure out what restrictions to give them,” he said. And customers must be able to trust those decisions.

Some existing security tools do offer fine-grained access — Splunk, for instance, developed a method to provide access to certain indexes in underlying data stores, he noted — but most are broader and human-oriented.

“We’re trying to figure this out with existing tools,” he said. “But I don’t think they’re sufficient for the era of agents.”

AI Impact Series 1password

Credit: Michael O’Donnell, ShinyRedPhoto

Who’s accountable when an AI mis-authenticates a user?

At Zendesk and other customer relationship management (CRM) platform providers, AI is involved in a number of user interactions, Aniano noted — in fact, now it’s at a “volume and a scale that we haven’t contemplated as businesses and as a society.”

It can get tricky when AI is helping out human agents; the audit trail can become a labyrinth.

“So now you’ve got a human talking to a human that’s talking to an AI,” Aniano noted. “The human tells the AI to take action. Who’s at fault if it’s the wrong action?” This becomes even more complicated when there are “multiple pieces of AI and multiple humans” in the mix.

To prevent agents from going off the rails, Zendesk tends to be “very strict” about access and scope; however, customers can define their own guardrails based on their needs. In most cases, AI can access knowledge sources, but they’re not writing code or running commands on servers, Aniano said. If an AI does call an API, it is “declaratively designed” and sanctioned, and actions are specifically called out.

However, customer demand is flooding these scenarios and “we’re kind of holding the gates right now,” he said.

The industry must develop concrete standards for agent interactions. “We’re entering a world where, with things like MCP that can auto-discover tools, we’re going to have to create new methods of safety for deciding what tools these bots can interact with,” said Aniano.
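
One way to picture the kind of gate Aniano describes is a deny-by-default filter that sits between tool discovery and the agent. The sketch below is illustrative only: the `DiscoveredTool` fields, the tool names, and the `ALLOWED_TOOLS` set are hypothetical and are not taken from MCP or from Zendesk's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiscoveredTool:
    """A tool advertised by a server. Fields are hypothetical for this sketch."""
    name: str
    description: str
    mutates_state: bool


# Hypothetical allowlist maintained by a security team.
ALLOWED_TOOLS = {"search_knowledge_base", "get_ticket"}


def filter_tools(discovered, allowed=ALLOWED_TOOLS):
    """Deny by default: expose only tools that are allowlisted and read-only."""
    exposed, rejected = [], []
    for tool in discovered:
        if tool.name in allowed and not tool.mutates_state:
            exposed.append(tool)
        else:
            rejected.append(tool)
    return exposed, rejected


# Pretend these four tools were auto-discovered from a server.
discovered = [
    DiscoveredTool("search_knowledge_base", "Search help articles", False),
    DiscoveredTool("get_ticket", "Read one ticket", False),
    DiscoveredTool("run_shell_command", "Run a command on a server", True),
    DiscoveredTool("delete_ticket", "Delete a ticket", True),
]

exposed, rejected = filter_tools(discovered)
print("exposed to agent:", [t.name for t in exposed])
print("held back:", [t.name for t in rejected])
```

The design choice here is that a newly discovered tool is invisible to the agent until someone adds it to the allowlist, rather than available until someone notices it.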

When it comes to security, enterprises are rightly concerned when AI takes over authentication tasks, such as sending out and processing one-time passwords (OTP), SMS codes, or other two-step verification methods, he said. What happens if an AI mis-authenticates or misidentifies someone? This can lead to sensitive data leakage or open the door for attackers.

“There’s a spectrum now, and the end of that spectrum today is a human,” Aniano said. However, “the end of that spectrum tomorrow might be a specialized agent designed to do the same kind of gut feeling or human-level interaction.”

Customers themselves are on a spectrum of adoption and comfort. In certain companies, particularly financial services or other highly regulated environments, humans still must be involved in authentication, Aniano noted. In other cases, legacy companies or old guards only trust humans to authenticate other humans.

He noted that Zendesk is experimenting with new AI agents that are “a little more connected to systems,” and working with a select group of customers around guardrailing.

Standing authorization is coming

In some future, agents may actually be more trusted than humans to do some tasks, and granted permissions “way beyond” what humans have today, Xanthos said. But we’re a long way from that, and, for the most part, the fear of something going wrong is what’s holding enterprises back.

“Which is a good fear, right? I’m not saying that it is a bad thing,” he said. Many enterprises simply aren’t yet comfortable with an agent doing all steps of a workflow or fully closing the loop by itself. They still want human review.

Resolve AI is on the cusp of giving agents standing authorization in a few cases that are “generally safe,” such as in coding; from there they’ll move to more open-ended scenarios that are not all that risky, Xanthos explained. But he acknowledged that there will always be very risky situations where AI mistakes could “mutate the state of the production system,” as he put it.

Ultimately, though: “There’s no going back, obviously; this is moving faster than maybe even mobile did. So the question is what do we do about it?”

What security teams can do now

Both speakers pointed to interim measures available within existing tooling. Xanthos noted that some tools — Splunk among them — already offer fine-grained index-level access controls that can be applied to agents. Aniano described Zendesk’s approach as a practical starting point: declaratively designed API calls with explicitly sanctioned actions, strict access and scope limits, and human review before expanding agent permissions.

The underlying principle, as Aniano put it: “We’re always checking those gates and seeing how we can widen the aperture” — meaning don’t grant standing authorization until you’ve validated each expansion.
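
The combination of explicitly sanctioned actions, strict scope, and human review before widening permissions can be sketched as a small declarative policy. This is a minimal illustration under stated assumptions: `AgentPolicy`, `Decision`, the action names, and the `promote` step are hypothetical and do not describe Zendesk's or Resolve AI's actual systems.

```python
from dataclasses import dataclass, field
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    NEEDS_HUMAN_REVIEW = "needs_human_review"
    DENY = "deny"


@dataclass
class AgentPolicy:
    """Declarative per-agent policy (hypothetical structure)."""
    agent_id: str
    standing: set = field(default_factory=set)         # standing authorization
    review_required: set = field(default_factory=set)  # sanctioned, human must approve

    def check(self, action: str) -> Decision:
        if action in self.standing:
            return Decision.ALLOW
        if action in self.review_required:
            return Decision.NEEDS_HUMAN_REVIEW
        return Decision.DENY  # anything not called out is refused

    def promote(self, action: str, validated_by: str) -> None:
        """Widen the aperture: grant standing authorization after validation."""
        if action not in self.review_required:
            raise ValueError(f"{action!r} is not a sanctioned action under review")
        if not validated_by:
            raise ValueError("a named human reviewer is required")
        self.review_required.discard(action)
        self.standing.add(action)


policy = AgentPolicy(
    agent_id="support-bot-1",
    standing={"kb.search"},
    review_required={"ticket.refund"},
)

for action in ("kb.search", "ticket.refund", "server.run_command"):
    print(action, "->", policy.check(action).value)

# After a human has validated the refund flow, it can be promoted.
policy.promote("ticket.refund", validated_by="alice")
print("ticket.refund ->", policy.check("ticket.refund").value)
```

Because `promote` refuses actions that were never placed under review, an agent cannot jump straight from "denied" to "standing authorization" without passing through the human gate.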

Get a $200 gift card with the Samsung Galaxy S26 Ultra, plus a free upgrade to 512GB

The Samsung Galaxy S26 Ultra is already looking like an unbelievable flagship phone, but this deal makes it even better.

You might have noticed by now that the pre-order deals for the Ultra are now in full swing, and for anyone looking to upgrade to an outstanding new handset, there’s no shortage of offers. With that in mind, we’ve picked out one of the very best.

Head on over to Amazon right now, and you can pick up the unlocked 512GB variant of the Samsung Galaxy S26 Ultra, and get a $200 gift card for your trouble, plus that free upgrade from 256GB.

To put the whole thing into context, that super high-end Galaxy S26 Ultra will set you back $1,299.99 right now, which is a full $400 cheaper than the phone’s asking price after the pre-launch period.

That’s a good deal for what is likely to be one of the best phones on the market, and the gift card can go toward the cost of a cheaper product in Samsung’s ecosystem, such as a pair of Galaxy Buds.

The phone certainly doesn’t skimp on the user experience; it’s quite the opposite, in fact. Being an ‘S Ultra’ phone, the S26 Ultra is loaded to the gills with the best that Samsung has to offer.

As an example, the included Privacy Display is unlike anything we’ve seen on another phone, as it can keep the information on your lock screen away from prying eyes. When it’s time to unlock the screen, it also lets you do so with the power of your own unique Galaxy AI.

You also have an eye-popping 512GB of storage, giving you far more leeway to add apps, download films, and more.

Also included is Super Fast Charging 3.0, which apparently allows the phone to reach up to 75% capacity in around 30 minutes.

This incredible form of top-up is made possible by a durable battery which has a 5000 mAh capacity, giving the phone plenty of longevity from a single charge.

Given everything that the Samsung Galaxy S26 Ultra brings to the table, even without the added $200 gift card and $400 off ($600 in total), it’s still a fantastic buy in the world of premium phones, but with the gift card in tow, it’s a must-buy for upgrading bargain hunters.

Why Did Samsung Remove Bluetooth From The S Pen?

Samsung’s history in the smartphone arena is one of constant innovation. Not all of the Korean tech giant’s ideas are good (looking at you, Bixby), but it has consistently been willing to throw concepts at the wall to see what sticks. In the 2020s, that experimentation led to a whole new category of folding smartphones, but all the way back in 2011, it led to the S Pen stylus. Samsung introduced the S Pen alongside that year’s Galaxy Note to aid users with the sort of productivity work the device was designed for.

But the most impressive S Pen features didn’t come until the launch of the Note 9 in 2018, when Samsung added low-energy Bluetooth to the tiny stylus. With that, users could wave the pen like a tiny wand to control their phone, thanks to the many S Pen productivity tricks. Air Actions, as they were called, allowed users to make specific motions with the stylus to navigate the OS, control media playback, and take photos and videos — even if their phone was across the room. That functionality remained even as the Note line was deprecated and the S Pen was moved to the Galaxy S Ultra series.

Then, in early 2025, Samsung shocked dedicated S Pen users by stripping Bluetooth from the Galaxy S25 Ultra’s S Pen, undoing seven years of progress. Outrage was palpable, and users demanded answers. The answer they got only made matters worse. According to Samsung, Bluetooth was removed from the S Pen because not enough people used it. This predictably didn’t sit well with those who did use it. So, here’s Samsung’s explanation, and why its story doesn’t line up for everybody.

Samsung says no one used Bluetooth S Pen features, but some users beg to differ

Samsung’s decision to strip Bluetooth functionality from the S Pen on the Galaxy S25 Ultra came as a shock to many users. After all, it had become a staple of Samsung’s top-range devices, a flagship feature that set those premium products apart from the competition. But according to Samsung, diagnostic data and a study showed that fewer than 1% of users used the wireless functionality.

Users were so shocked by the removal that they even started a petition asking Samsung to reverse course, which racked up over 9,500 signatures. So, when a blog post on the Samsung website claimed that a Bluetooth-enabled S Pen would be sold separately, users began to scour Samsung’s website. Unfortunately, Samsung eventually confirmed the blog contained incorrect information, further embittering S Pen die-hards. Indeed, when users got their hands on the new phone, they found that older Bluetooth S Pens would not fit in the stylus slot, nor would Bluetooth features from those older styluses work with the S25 Ultra.

By now, it’s clear that Samsung has no plan to keep Bluetooth on any of its S Pen-compatible devices. Its most recent flagship tablet, the Galaxy Tab S11 Ultra, comes with an S Pen that lacks Bluetooth, making the Tab S10 Ultra the last such device to support the wireless protocol.

Samsung has a long history of removing features from new hardware

Some users held out hope that the Galaxy S26 Ultra would return Bluetooth functionality to the S Pen, but those hopes look all but dashed. If the company’s 2025 product cadence hadn’t demonstrated its total abandonment of the Bluetooth S Pen, the S26 Ultra may drive that point home. A content creator, Sahil Karoul, got his hands on one of the brand-new devices ahead of launch, and his testing showed no sign of the Bluetooth features. The S26 Ultra will have features that make it hard to beat, but a Bluetooth S Pen will almost certainly not be one of them.

Air Actions in the S Pen are only the latest in a long line of enthusiast features Samsung has removed from its flagship smartphones. Others include removable batteries, the headphone jack, the SD card slot, a mechanical camera aperture, a pressure-sensitive display, heart rate and SpO2 readers, MST technology in Samsung Pay for older payment terminals, and LED notification indicators, to name a few.

It even got rid of its signature curved displays and stopped including a charging brick in the box with most new devices. Samsung might argue that some of those features were outdated, while the functionality of others can be replicated through alternative means, but a loss is a loss. One thing that certainly hasn’t been removed is the high price tag that accompanies many of the devices, despite feature cuts. When you’re charging over $1,300 for a phone, customers expect a luxury experience. Thus, it’s not shocking they’d feel slighted by the removal of Bluetooth from the S Pen.

OpenAI fires employee for using confidential info on prediction markets

OpenAI has fired an employee over the employee’s activity on prediction markets, including Polymarket, the company confirmed to Wired. The employee used confidential OpenAI information in connection with the trades made, the company alleges.

OpenAI didn’t release the name of the employee. However, a spokesperson said that such actions violated a company policy that bans workers from using inside information for personal gain, including on prediction markets.

Prediction markets like Polymarket and Kalshi allow people to make wagers on the outcomes of real-world events. For instance, on Polymarket, there are wagers being made around the kind of products OpenAI will announce in 2026 and when the company will go public. They can cover any event, and some eye-popping money can be made. As we recently reported, an accountant won a $470,300 jackpot on Kalshi by betting against DOGE believers.

Prediction markets insist they are not gambling sites, preferring to label themselves as financial platforms. Kalshi is a regulated exchange and, in fact, it fined and banned a MrBeast editor for similar alleged insider trading earlier this week. OpenAI did not immediately respond to a request for additional comment.

O House Becomes Japan’s First Fully Seismic-Approved 3D-Printed Reinforced Concrete Home

O House Japan 3D-Printed Reinforced Concrete Home
Photo credit: Onocom
O House, finished in late 2025 in Kurihara, Miyagi Prefecture, is a 50-square-meter, 3D-printed two-story home. The 31-square-meter ground floor holds a master bedroom and bathroom, while the 19-square-meter upper level hosts the kitchen, dining, and living areas. Curved walls rise 7 meters, stacked like bricks and set half a meter below ground for stability, while skylights let natural light pour into every corner.


Kizuki Co. Ltd. and Onocom Co. Ltd. jointly led this project, while COBOD International supplied the printing technology, a modified version of its BOD series printers. Working layer by layer, the printer built the inner and exterior walls, floor slab, roof, and a few interior elements. The printer handled the on-site work, although some components were manufactured off-site. For the overhangs and arches that the printer couldn’t handle, the crew installed custom-cut styrofoam, adding more as the print progressed. Finally, the team polished some of the wall sections to give them a smooth, marble-like appearance.


They combined rebar inserted directly into the printed concrete layers with conventional steel reinforcement, all tied together by a steel frame that bears the primary loads. Anchored firmly by ground-improvement piles, this hybrid approach enabled the building to exceed Japan’s stringent seismic regulations, which are among the strictest in the world. All of this demonstrates that in earthquake-prone regions, printed reinforced concrete is a viable alternative to timber framing.

A crew of only four worked through challenging conditions. Winter temperatures below 10 degrees Celsius meant heated water had to be added to the mix to keep it flowing, while summer temperatures of 30 to 35 degrees required careful temperature control to stop the material from setting too fast. Even so, the crew kept going, printing continuously from below ground to the top and creating multi-purpose walls that combine an aesthetic finish, structure, and hidden services.

BMW Sends AEON Humanoid Robots to the Line in Leipzig

BMW AEON Humanoid Robots Leipzig Factory
BMW employees at the Leipzig plant have long handled the complex parts of vehicle assembly, especially the hefty battery modules. Now a new member has joined the team: the AEON humanoid robot. AEON was created by Hexagon Robotics, a company BMW has collaborated with for years, to take on the physically taxing and repetitive tasks that wear people out.



By April 2026, the AEON prototype will be put through its paces in a larger round of assessments, with a complete pilot phase scheduled to begin this summer. The goal is to make AEON extremely adaptable so that it can transition between activities as needed; simply switch out the gripper or scanner and you’re ready to go. It travels from station to station without the need for fixed rails because it has wheels rather than legs.


Electric vehicle battery modules need extra care, and workers frequently need to wear safety gear simply to move them. After a few shifts, it becomes monotonous, but AEON is more than willing to relieve them of that kind of work without putting undue strain on the human workers. External component production is also relevant since, let’s face it, robotic consistency is a huge bonus when performing the same operation repeatedly.

BMW is doing something a little different from the conventional industrial arms that are fastened to the ground. Thanks to data from BMW’s recently unified systems, AEON is able to move around, adjust to any arrangement, and become more intelligent every day. For a more seamless operation, the business had to dismantle the outdated data silos that were creating so much friction; now, all of that data streams directly into AEON. Naturally, safety is the top priority, which entails improved wireless coverage, additional barriers to keep people safe, and other measures. Since everything is linked into the current Smart Robotics network, there is no need to worry about anything getting left behind.

The main point in this case is that human labor remains essential to the entire operation. BMW wants employees to be able to do more interesting work by eliminating the monotony in their jobs. Early buy-in from safety teams, IT, and logistics has helped win over the personnel on the floor, and having their support from the beginning has made everything go much more smoothly. The strategy was shaped by lessons learned from an earlier test at the US facility in Spartanburg, which demonstrated how quickly AEON could get up to speed and how dependably it ran in an actual production environment.

In all of this, Leipzig has established a new Center of Competence for Physical AI, and its experts are getting to work assessing partners, conducting pilots, and expanding the concepts that prove effective. To stay competitive in Europe, executives are already discussing quantifiable improvements in speed and accuracy for these difficult activities.

City of Seattle CTO Rob Lloyd is resigning to lead a government institute with national reach

Rob Lloyd, chief technology officer for the City of Seattle, announced that he is stepping down from his role in March. (Photo courtesy of Rob Lloyd)

Rob Lloyd, Seattle’s chief technology officer, is leaving his post to become executive director of the Center for Digital Government. His last day will be March 27.

“Leading IT and our dedicated teams in service to Seattle has been an honor,” Lloyd said to colleagues in an email sent Thursday night.

Lloyd told GeekWire that while he appreciated Mayor Katie Wilson’s invitation to stay in the role, he was “beyond excited” to take the new job, which would allow him to perform similar work with local and state governments nationwide.

Lloyd became CTO in June 2024 after eight years as deputy city manager of San José, Calif. While his new employer is based in California, he will remain in Seattle. “My family wanted it no other way,” Lloyd said.

The city provided GeekWire with Lloyd’s letter of resignation, in which he said the “timing is right for a change.” The mayor is reshaping her executive team and its direction, he wrote, and strategizing actions related to the budget and this summer’s FIFA World Cup games.

Seattle is facing about a $140 million budget deficit for next year. The Seattle Times reported that Wilson is asking departments to provide plans for funding cuts of 5% to 10%.

In the letter, Lloyd also highlighted some of his team’s accomplishments during his tenure, including:

  • Recovering more than $130 million “in failing and stalled technology projects.”
  • Executing the city’s IT Strategic Plan.
  • Partnering with fire, police, mental health and emergency management services on public safety technologies.
  • Managing a $21 million operating budget reduction while increasing service reliability and employee retention.
  • Updating cybersecurity practices.
  • Formalizing his department’s first customer service and staff feedback surveys.

Lloyd has been responsible for overseeing roughly 670 employees, and joined the city with a $270 million operating budget and a capital budget of about $24 million.

In December, the city appointed Lisa Qian as its first AI Officer. Her experience includes serving as a senior manager of data science at LinkedIn, as well as other tech company leadership positions.

When Lloyd came to Seattle, he told GeekWire he hoped the city would be his “forever home” — and that he wanted to step outside City Hall and build relationships with the community members and companies driving the region’s tech scene. He was eager to play a part in tackling difficult issues such as public safety, homelessness and downtown recovery.

In his email to employees, Lloyd said that during his final weeks he would “be focused on completing the final commitments I made to the organization when I arrived.”

“What I’ll carry most from my time here isn’t the projects or the milestones though, it’s the memories of you and our partners,” Lloyd continued. “So many people made this work a true gift. Thank you to the City for letting me serve this community with you.”

Microsoft’s new AI training method eliminates bloated system prompts without sacrificing model performance

In building LLM applications, enterprises often have to create very long system prompts to adjust the model’s behavior for their applications. These prompts contain company knowledge, preferences, and application-specific instructions. At enterprise scale, these contexts can push inference latency past acceptable thresholds and drive per-query costs up significantly. 

On-Policy Context Distillation (OPCD), a new training framework proposed by researchers at Microsoft, helps bake the knowledge and preferences of applications directly into a model. OPCD uses the model’s own responses during training, which avoids some of the pitfalls of other training techniques. This improves the abilities of models for bespoke applications while preserving their general capabilities. 

Why long system prompts become a liability

In-context learning allows developers to update a model’s behavior at inference time without modifying its underlying parameters. Updating parameters is typically a slow and expensive process. However, in-context knowledge is transient. This knowledge does not carry across different conversations with the model, meaning you have to feed the model the exact same massive set of instructions or documents every time. For an enterprise application, this might mean repeatedly pasting company policies, customer tickets, or dense technical manuals into the prompt. This eventually slows down the model, drives up costs, and can confuse the system.
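
To make the repeated-prompt overhead concrete, here is a small back-of-the-envelope calculation. All figures (an 8,000-token system prompt, 200-token user queries, one million queries) are hypothetical values chosen for illustration and do not come from the paper or from Microsoft.

```python
def prompt_overhead(system_tokens: int, user_tokens: int, queries: int):
    """Return total system-prompt tokens sent, and their share of all input tokens."""
    total_system = system_tokens * queries
    total_input = (system_tokens + user_tokens) * queries
    return total_system, total_system / total_input


# Hypothetical workload: the same system prompt is resent with every query.
total, share = prompt_overhead(system_tokens=8_000, user_tokens=200, queries=1_000_000)
print(f"system-prompt tokens processed: {total:,}")
print(f"share of all input tokens: {share:.1%}")
```

Under these assumed numbers, nearly all input tokens are the repeated instructions rather than the user's actual question, which is the overhead that context distillation tries to remove.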

“Enterprises often use long system prompts to enforce safety constraints (e.g., hate speech detection) or to provide domain-specific expertise (e.g., medical knowledge),” said Tianzhu Ye, co-author of the paper and researcher at Microsoft Research Asia, in comments provided to VentureBeat. “However, lengthy prompts significantly increase computational overhead and latency at inference time.”

The main idea behind context distillation is to train a model to internalize the information that you repeatedly insert into the context. Like other distillation techniques, it follows a teacher-student paradigm. The teacher is an AI model that receives the massive, detailed prompt. Because it has all the instructions and reference documents, it generates highly tailored responses. The student is a model being trained that only sees the main question and doesn’t have access to the full context. Its goal is simply to observe the teacher’s responses and learn to mimic its behavior.

Through this training process, the student model effectively compresses the complex instructions from the teacher’s prompt directly into its parameters. For an enterprise, the primary value comes at inference time. Because the student model has internalized the context, you can deploy it in your application without needing to paste in the lengthy instructions again. This makes the model significantly faster, with far less computational overhead.

context distillation

However, classic context distillation relies on a flawed training method called “off-policy training,” where the model is trained on fixed datasets that were collected before the training process. This is problematic in several ways. During training, the student is only exposed to ground-truth data and teacher-generated answers, creating what Ye calls “exposure bias.” In production, the model must come up with its own token sequences to reach those answers. Because it never practiced making its own decisions or recovering from its own mistakes during training, it can easily derail when operating independently. It’s like showing a student videos of a professional driver and expecting them to learn driving without trial and error.

Another problem is the “forward Kullback-Leibler (KL) divergence” minimization measure used to train the model. Under this method, the model is graded on how similar its answers are to the teacher, which encourages “mode-covering” behavior, Ye says. The student model is often smaller or lacks the rich context the teacher had, meaning it simply lacks the capacity to perfectly replicate the teacher’s complex reasoning. Because the student is forced to try and cover all those possibilities anyway, its underlying guesses become overly broad and unfocused.

In real-world applications, this can result in hallucinations, where the AI gets confused and confidently makes things up because it is trying to mimic a depth of knowledge it does not actually possess. It also means that the model cannot generalize well to new tasks.

How OPCD fixes the teacher-student problem

To fix the critical issues with the old teacher-student dynamic, the Microsoft researchers introduced On-Policy Context Distillation (OPCD). The most important shift in OPCD is that the student model learns from its own generation trajectories as opposed to a static dataset (which is why it is called “on-policy”). Instead of passively studying a dataset of the teacher’s perfect outputs, the student is given a task without seeing the massive instruction prompt and has to generate an answer entirely on its own.

As the student generates its answer, the teacher acts as a live instructor. The teacher has access to the full, customized prompt and evaluates the student’s output. At every step along the student’s generation, the system compares the student’s token distribution against what the context-aware teacher would do.
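
The loop described above can be sketched with toy stand-ins for the two models. This is not the paper's implementation: `student_probs` and `teacher_probs` are hypothetical fixed distributions over a four-token vocabulary, the per-step score uses the reverse KL measure the article goes on to describe, and the gradient update that real training would apply is omitted.

```python
import math
import random

VOCAB = ["refund", "approve", "deny", "<eos>"]


def student_probs(prefix):
    """Toy student: sees no system prompt, so it is close to uniform."""
    return [0.25, 0.25, 0.25, 0.25]


def teacher_probs(prefix):
    """Toy teacher: conditioned on a (hypothetical) long system prompt."""
    if not prefix:
        return [0.05, 0.05, 0.85, 0.05]  # strongly prefers "deny" first
    return [0.05, 0.05, 0.05, 0.85]      # then prefers to stop


def reverse_kl(student, teacher):
    """KL(student || teacher) for two distributions over the same vocabulary."""
    return sum(s * math.log(s / t) for s, t in zip(student, teacher) if s > 0)


def on_policy_rollout(rng, max_len=5):
    """Sample a trajectory from the student and score every step with the teacher."""
    prefix, step_losses = [], []
    for _ in range(max_len):
        s = student_probs(prefix)
        t = teacher_probs(prefix)  # teacher is queried on the student's own prefix
        step_losses.append(reverse_kl(s, t))
        token = rng.choices(VOCAB, weights=s, k=1)[0]  # next token comes from the student
        prefix.append(token)
        if token == "<eos>":
            break
    return prefix, sum(step_losses) / len(step_losses)


trajectory, loss = on_policy_rollout(random.Random(0))
print("student trajectory:", trajectory)
print("mean per-token reverse KL:", round(loss, 4))
```

The detail that makes this "on-policy" is that each next token is sampled from the student, so the teacher ends up grading prefixes the student actually produces, including its mistakes.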

On-policy context distillation

OPCD uses “reverse KL divergence” to grade the student. “By minimizing reverse KL divergence, it promotes ‘mode-seeking’ behavior. It focuses on high-probability regions of the student’s distribution,” Ye said. “It suppresses tokens that the student considers unlikely, even if the teacher’s belief assigned them high probability. This alignment helps the student correct its own mistakes and avoid the broad, hallucinatory distributions of standard distillation.”
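
The difference between the two objectives can be seen in a small numeric sketch. The distributions below are made up for illustration: a teacher with two strong modes, and two candidate students standing in for a model that cannot match the teacher exactly, one broad and one focused on a single mode.

```python
import math


def kl(p, q):
    """KL(p || q) for two discrete distributions with full support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical teacher with two strong modes (first and last token).
teacher = [0.48, 0.02, 0.02, 0.48]

# Two hypothetical capacity-limited students.
students = {
    "broad": [0.25, 0.25, 0.25, 0.25],    # spreads mass over everything
    "focused": [0.94, 0.02, 0.02, 0.02],  # commits to one teacher mode
}

forward = {name: kl(teacher, s) for name, s in students.items()}  # KL(teacher || student)
reverse = {name: kl(s, teacher) for name, s in students.items()}  # KL(student || teacher)

print("forward KL prefers:", min(forward, key=forward.get))
print("reverse KL prefers:", min(reverse, key=reverse.get))
```

In this toy case, forward KL scores the broad student as the better fit, because it penalizes missing any teacher mode, while reverse KL favors the focused student, because it penalizes placing mass where the teacher assigns little probability.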

Because the student model actively practices making its own decisions and learns to correct its own mistakes during training, it behaves more reliably when deployed in a live application. The process bakes complex business rules, safety constraints, or specialized knowledge directly into the model’s permanent memory.

What OPCD delivers: The benchmark results

The researchers tested OPCD in two key areas: experiential knowledge distillation and system prompt distillation. For experiential knowledge distillation, the researchers wanted to see if an LLM could learn from its own past successes and permanently adopt those lessons. They tested this on models of various sizes, using mathematical reasoning problems.

First, the model solved problems and was asked to write down general rules it learned from its successes. Then, using OPCD, the researchers baked those written lessons directly into the model’s parameters. The results showed that the models improved dramatically without needing the learned experience pasted into their prompts anymore. On complex math problems, an 8-billion-parameter model improved from a 75.0% baseline to 80.9%. On the Frozen Lake navigation game, a small 1.7-billion-parameter model initially had a success rate of 6.3%. After OPCD baked in the learned experience, its success rate jumped to 38.3%.

The second set of experiments focused on long system prompts. Enterprises often use massive system prompts to enforce strict behavioral guidelines, like maintaining a professional tone, ensuring medical accuracy, or filtering out toxic language. The researchers tested whether OPCD could permanently bake these dense behavioral rules into the models so they would not have to be sent with every single user query. Their experiments show that OPCD successfully internalized these complex rules and massively boosted performance. When testing a 3-billion-parameter Llama model on safety and toxicity classification, the base model scored 30.7%. After using OPCD to internalize the safety prompt, its accuracy spiked to 83.1%. On medical question answering, the same model improved from 59.4% to 76.3%.

One of the key challenges of fine-tuning models is catastrophic forgetting, where the model becomes too focused on the fine-tuning task and worse at general tasks. The researchers tracked out-of-distribution performance to test for this tunnel vision. When they distilled strict safety rules into a model, they immediately tested its ability to answer unrelated medical questions. OPCD successfully maintained the model’s general medical knowledge, outperforming the old off-policy methods by approximately 4 percentage points. It specialized without losing its broader intelligence.

Where OPCD fits — and where it doesn’t

While OPCD is a powerful tool for internalizing static knowledge and complex rules, it does not replace all external context methods. “RAG is better when the required information is highly dynamic or involves a massive, frequently updated external database that cannot be compressed into model weights,” Ye said.

For enterprise teams evaluating their pipelines, adopting OPCD does not require overhauling existing systems or investing in specialized hardware. “OPCD can be integrated into existing workflows with very little friction,” Ye said. “Any team already running standard RLVR [Reinforcement Learning from Verifiable Rewards] pipelines can adopt OPCD without major architectural changes.”

In practice, the student model acts as the policy model performing rollouts, while the frozen teacher model serves as a reference providing logits. The hardware requirements are highly accessible. According to Ye, enterprise teams can reproduce the researchers’ experiments using about eight A100 GPUs.
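As a rough sketch of what “a reference providing logits” means in practice, the snippet below turns raw logits from both models into a per-token loss and averages it over one rollout. It uses only the Python standard library; the function names are hypothetical and are not part of verl or of the researchers’ code, and a real pipeline would compute this on tensors with gradients flowing only through the student.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a list of raw logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]

def reverse_kl_from_logits(student_logits, teacher_logits):
    """KL(student || teacher) at one token position, computed from raw logits."""
    s_logp = log_softmax(student_logits)
    t_logp = log_softmax(teacher_logits)
    return sum(math.exp(sl) * (sl - tl) for sl, tl in zip(s_logp, t_logp))

def rollout_loss(student_logits_per_step, teacher_logits_per_step):
    """Mean per-token loss over one student rollout (teacher is frozen)."""
    losses = [
        reverse_kl_from_logits(s, t)
        for s, t in zip(student_logits_per_step, teacher_logits_per_step)
    ]
    return sum(losses) / len(losses)

# Two-step rollout over a two-token vocabulary (made-up logits).
loss = rollout_loss([[2.0, 0.0], [0.0, 1.0]], [[0.0, 2.0], [0.0, 1.0]])
```

The teacher’s logits enter only as a fixed target here, which mirrors the article’s description of the teacher as a frozen reference rather than a model being trained.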

The data requirements are similarly lightweight. For experiential knowledge distillation, developers only need around 30 seed examples to generate solution traces. Because the technique is applied to previously unoptimized environments, even a small amount of data yields the majority of the performance improvement. For system prompt distillation, existing optimized prompts and standard task datasets are sufficient.

The researchers built their own implementation on verl, an open-source RLVR codebase, proving that the technique fits cleanly within conventional reinforcement learning frameworks. They plan to release their implementation as open source following internal reviews.

The self-improving model: What comes next

Looking ahead, OPCD paves the way for genuinely self-improving models that continuously adapt to bespoke enterprise environments. Once deployed, a model can extract lessons from real-world interactions and use OPCD to progressively internalize those characteristics without requiring manual supervision or data annotation from model trainers.

“This represents a fundamental paradigm shift in model improvement: the core improvements to the model would move from training time to test time,” Ye said. “Using the model—and allowing it to gather experience—would become the primary driver of its advancement.”

Ultrahuman’s Pro smart ring can go for two weeks

Ultrahuman is back with its most capable smart ring yet, the Ring Pro.

This third-generation smart ring can deliver up to 15 days of battery life alongside a new Pro Charging Case and an AI health platform – Jade.

The 15-day battery claim triples the four to six-day lifespan of the Ring Air, a gap that Ultrahuman frames as a category-defining shift.

The Ring Pro uses a titanium unibody construction and carries a redesigned internal architecture. Its new heart-rate sensing architecture improves signal quality during sleep and recovery, while an upgraded dual-core processor handles faster data processing and on-chip machine learning. Both changes directly affect the accuracy of the health metrics the ring generates overnight.

ProRelease Technology allows the ring to be cut apart more easily in the event of finger swelling or injury, a safety feature that most competing smart rings have not addressed despite growing consumer concerns about wearables worn continuously for extended periods.

The Charging Case carries its own 45-day battery reserve and stores up to one year of ring data, using a magnetic UltraSnap connection that Ultrahuman states generates less heat than conventional wireless charging during repeated daily use.

Jade and PowerPlugs

Jade, the company’s new biointelligence AI platform, pulls real-time data from across the Ultrahuman ecosystem, including the Ring Pro, Blood Vision biomarkers, M1 continuous glucose monitoring, and Ultrahuman Home environmental sensors, to surface personalised health insights rather than retrospective summaries.

The system differs from standard AI health integrations by executing real-time actions such as triggering AFib detection or initiating breathwork sessions, a capability that Ultrahuman positions closer to Tesla’s Full Self-Driving model of continuous real-time processing than a conventional backward-looking data query tool.

PowerPlugs, the company’s expanding micro-application platform, adds new capabilities including GLP-1 lifestyle tracking, respiratory health and snoring analysis through a Sleep Cycle integration, and migraine management tools built on Click Therapeutics’ FDA-authorised digital therapeutic technology.

The Ring Pro is available to pre-order globally, excluding the United States, for $479, with shipments beginning in March. Trade-in discounts of up to $115 apply for existing Ring Air and other smart ring owners.

Who is hiring in AI, robotics and automation right now?

AI solutions architect, senior automation engineer and machine learning engineer are just three of the exciting roles open to qualified professionals in Ireland at the moment.

February at SiliconRepublic.com is when we take a closer look at all things AI, robotics and automation. It is a rapidly evolving space that often requires a significant commitment to upskilling and training, but on the upside, it is also a fascinating ecosystem to be a part of. 

So, if you are a STEM professional looking for a new role or opportunity, then why not consider applying at one of these 14 organisations? Each has a range of positions in Ireland open to experts aiming to work in AI, robotics or automation, or a combination of the three. 

Analog Devices

US semiconductor manufacturing company Analog Devices has a presence in a number of countries, including Ireland. Currently, in Limerick, the organisation is looking to onboard a robotics and automation graduate, whose responsibilities will include assisting in the calibration and maintenance of robotic arms and automated handlers, supporting troubleshooting of robotic motion control systems and sensors, learning about robotic integration with high-vacuum and pneumatic systems, and participating in projects to optimise robotic throughput and reliability, among other tasks. 

BMS

US pharmaceutical company Bristol Myers Squibb (BMS) is currently advertising a position for a senior manager in EHSS performance enablement. The role will require a professional with the skills to support the execution of EHSS performance monitoring, ensure that EHSS systems and processes deliver accurate, reliable data for decision-making, and lay the foundation for future predictive analytics capabilities.

BMS states that the position supports the maintenance and operational improvement of EHSS data and reporting systems, including performance measurement frameworks, maintaining data quality, and enabling analytics and visualisation through data science to track EHSS performance.

There is also a similar role for a senior manager in EHSS systems implementation.

Clio

In November of last year, Canadian AI legal-tech Clio officially opened a new office in Dublin’s Docklands, just a few days after the company announced a $500m raise. Over the next couple of months, Clio plans to expand its Dublin-based team from 60 to more than 100 employees, adding new roles.

At the moment, someone with AI or automation skills could be well-suited to a role as a software developer at the organisation. There is also a vacancy for a senior compliance analyst, EMEA. This job involves expanding and automating compliance programs. The successful candidate will work with stakeholders across Clio’s operations to support compliance initiatives such as risk mitigation, support of innovation in AI and product development, customer inquiry support, control maintenance and instilling best practices.

Equinix

AI infrastructure provider Equinix recently announced plans to create 200 jobs in Dundalk, Co Louth via an investment of up to $700m in a new facility that will be built by local company Hanley Energy. The new roles are expected to be in a range of technical areas, such as precision engineering, quality assurance and lean manufacturing. While more jobs are likely to be announced down the line, one of the roles currently open to prospective employees is that of senior director, controls engineering and service management.

EXL

Data and artificial intelligence company EXL is headquartered in New York; however, the organisation has a presence in multiple countries, including Ireland, where there are two opportunities open to professionals with AI and automation skills: senior engineer in applied AI and digital AI solution architect. Both roles are advertised as hybrid and full-time. EXL is also offering Dublin-based full-stack engineering positions for people with varying degrees of seniority. 

EY

UK-based professional services firm EY has a number of AI-specific job opportunities for qualified professionals. Anyone interested in applying could consider jobs in agentic AI engineering, AI-lab full-stack engineering, agentic and generative intelligence, AI analytics, and data architecture, among others. The Dublin office is also recruiting for an intelligent automation assistant manager and an intelligent automation manager. 

Fidelity Investments 

For professionals looking for a new role in which AI and automation skills are a plus, Fidelity Investments has opportunities at its Dublin and Galway-based facilities. In the capital, there is a vacancy for a principal site reliability engineer, and out west there are roles for principal full-stack engineer, senior software engineer, principal software engineer and senior full-stack engineer.

Fixify

US software company Fixify has a position in senior data science for an Ireland-based professional looking to work remotely. Fixify states that the successful candidate will be at the forefront of transformation by wielding machine learning, AI and “data wizardry”. 

Liberty IT

Liberty IT, the technology arm of the insurance company Liberty Mutual Insurance, has offices in Belfast, Dublin and Galway. The team is looking for a professional skilled in ML automated workflows, software engineering, cloud architecture and AI, for a role as a senior AI solutions architect. The role is open to professionals located in any of the three Irish premises. Also on offer to an expert with automation skills is a role as a senior data engineer. 

MSD

Pharmaceutical multinational MSD has a vacancy for a senior specialist in manufacturing automation. The successful candidate will join the team at the multiproduct facility in Dunboyne, Co Meath and will work closely with colleagues across engineering, operations, quality, validation and global technology, among other teams.

PwC

Professional services company PwC is adding to its AI and automation capabilities with a number of key hires. In Dublin, open roles include AI Azure architect in data and AI, senior associate engineer for agentic AI, AI technology consultant manager in data and AI, and senior manager and AI architect in Azure.  

TCS

IT services, consulting and business solutions platform TCS has an opportunity open to a Cork-based professional in its IT department. Its Little Island facility is hiring for a senior automation engineer, and desired skills for the role include experience in a manufacturing environment, high volume automated assembly experience and medical device manufacturing experience.

Version1

Dublin-headquartered IT services and consulting company Version1 is looking to expand its teams. Currently, the organisation is recruiting professionals armed with a range of AI and automation skills. Vacant roles include AI engineer, cloud/AI solution architect, power apps architect, full-stack developer and Azure cloud consultant, among others. 

Yahoo

Technology company Yahoo is looking to recruit professionals with AI and automation skills to a number of its Ireland-based teams. The Yahoo Mail division has vacancies for multiple positions, including backend engineer II, principal software apps engineer, senior data engineer and machine learning engineer II.

eBay cuts 800 jobs after Depop acquisition

eBay acquired Depop from Etsy for $1.2bn earlier this month.

Online marketplace eBay is cutting around 800 jobs, or 6pc of its global workforce. The job cuts are a response to operating model needs and future priorities, the company has explained.

As of 31 December last year, company filings show that eBay employed approximately 12,300 people globally, 5,100 of whom are based outside the US.

“We are taking steps to reinvest across our business and align our structure with our strategic priorities, which will affect certain roles across our workforce,” an eBay spokesperson told news publications.

“We are grateful for the contributions of the employees impacted and are committed to supporting them with care and respect.” eBay will, however, continue to hire in key areas, it said.

The announcement comes after eBay posted 2025 annual net revenue of $11.1bn, up 8pc from the year before, while gross merchandise volume (GMV) was up 7pc to $79.6bn. eBay noted that fashion alone represents more than $10bn in GMV annually.

The layoffs also come on the heels of eBay announcing a $1.2bn acquisition of the second-hand fashion marketplace Depop, from Etsy. With the acquisition, eBay wants to target the under-34 consumer base – which represents a majority of Depop’s user base.

Etsy purchased Depop for $1.6bn in 2021. The same year, it bought Brazilian online marketplace Elo7 for $217m. In 2019, it purchased music gear marketplace Reverb. All three have since been sold by Etsy, which has been suffering from slowed growth in recent years. Its year-over-year revenue grew by just 2.2pc in 2024, down from 7.1pc in 2023.

eBay laid off 9pc of its total workforce in 2024, or 1,000 jobs, citing macroeconomic conditions. In early 2023, it cut 500 jobs.

Company filings showed that eBay’s Irish arm, which handles its European operations, paid out more than €1.8m in redundancy costs after cutting 75 jobs in 2024.

Copyright © 2025