
Tech

Andreea Wade quits VC to fix AI’s invisible plumbing problem

Published

on

After exiting Opening.io to iCIMS and spending two years on the investor side as a partner at Delta, Andreea Wade is back in the founder seat, and this time the moonshot is hers.

“You always hear there are no operators in VC. So I really thought, ‘OK, this is what I want to do’,” says Andreea Wade about her decision to join VC firm Delta Partners back in 2024. That was the plan, until November of last year, when her former co-founder Adrian Mihai sent her a 3am message. He had just beaten the numbers on a research paper that, in theory, solved one of AI’s most invisible infrastructure problems, explains Wade.

“I leaned in a bit more, and I was like, ‘Can I help you?’ Because that’s my thing. How can I help?” What started with a spark of interest finished with the decision to drop her new VC career with Delta Partners and return to the start-up world to co-found Univec.ai with Mihai.

It wasn’t supposed to go like this. When Wade joined Delta Partners as a general partner (a rarity at the partner-led firm) after exiting Opening.io, the AI talent intelligence company she’d built with Mihai, to iCIMS, the message was clear.

“The guys were like, ‘Oh, this is a job for life. You come in, that’s it’. Because that’s how VC works. You raise a fund, you have to be there, especially as a partner.”

Wade got it. She’d done the building thing. Now she’d do the helping-others-build thing, and she was good at it. As a self-described “founder whisperer”, she threw herself into the role and found she had a special insight born from sitting at the other side of the table.

“Regardless of what people were saying to me, how they were saying it, I knew exactly where they were, even if the words were not necessarily pointing at the thing.”

She liked being useful. “Where I come alive is when founders need help. I’m like, ‘OK, sleeves up, let me help you with the raise, with the rebrand, with whatever it is’.” But there are only so many companies a partner can get behind in any given year. So when Mihai (her co-founder of two decades, the cool kid from her hometown who once won the Romanian national programming olympiad) landed on something that might genuinely change a market, founder-whispering wasn’t going to cut it.

The problem he’d cracked sits “deep, deep in AI infrastructure”, says Wade, invisible to most, but foundational. AI models speak in vector embeddings, a layer of numbers that turns text and other content into something machines can reason about. Every vendor (OpenAI, Google, Anthropic) has its own embedding models, each effectively speaking a different language. Worse, they get deprecated. “Every single time a model is killed, you have to redo it again. Imagine that. You’ve trained, you paid all this money to train on all the poetry in the world, and a year from now, six months from now, a month from now, it’s equal to zero.”

Research to real world

Until recently, the only published solutions lived in academic papers. Mihai beat them. Univec.ai now has 87-plus bridging models that translate between embedding spaces without re-embedding from scratch. They’ve open-sourced a chunk, and are publishing benchmarks and model cards for every release – partly because the market doesn’t yet know it has this problem, says Wade.

That last bit is critical. When Wade showed the work to a hugely experienced AI lead in her investor network, his reaction was immediate. He’d only ever seen the underlying research paper. He told Wade: “Andreea, 75pc of the companies in our portfolio will not know that they have this problem, but every single one of them will.”

It’s the kind of “beautiful problem with beautiful solutions for the geeks within infrastructure” that Wade and Mihai have tackled before. At iCIMS, they were sitting on 600 million CVs, which Mihai corrected to trillions of data points.

“I remember building all this kind of marketing speak, and Adrian going, ‘It’s actually trillions, but don’t say trillions because it sounds like gazillions, so just say billions’.”

They also built one of the early vector databases, before that was a category, and didn’t spin it out. “We still had a little bit of regret on not turning that into a company.” This time, they’re not making that mistake.

What followed for Wade was a few weeks of long walks at the end of last year, and an honest reckoning. “I was already solutioning in my head. I was already working. I was already there before I was there. I just felt alive in a way that I haven’t felt in a long time.”

So she told Mihai she was in. Then came the hard part – breaking the news to her partners at Delta.

“I was having 50 heart attacks at the same time,” she says of the Monday morning she told them. When she finally got the words out, she was taken aback by the level of understanding. “They were like, ‘You need to do what you need to do, and don’t worry about anything else’.”

The resilience of immigrant founders

It’s far from Wade’s first reinvention. She arrived in Ireland 24 years ago, at 23, on an inter-company transfer through Chubb, the only way she could get here before Romania joined the EU – Wade was born in Romania to Hungarian parents. Back home, she had been a senior editor on an advertising magazine. Her first Irish gig was patrolling Coca-Cola warehouses in Drogheda on night shift.

“It’s raining, it’s cold, things are creaking, and I’m patrolling. I remember thinking if my friends, my parents at home, could see me, they’d be like, ‘What are you doing?’” Several decades later, Wade will get her Irish citizenship on 22 June.

She has leaned on that arrival story before, mostly privately. But the resilience of immigrant founders is something she finds herself returning to. “You genuinely can only depend on yourself. Something goes wrong? You can’t move into your parents’ house. It’s sink or swim.”

Between security work, journalism, a stint running an underground metal festival in Romania (Dark Bombastic Evening, or DBE, which still runs), the product curriculum at the Digital Skills Academy that start-up scene regular Gene Murphy handed her two weeks before launch, time as head of product at Independent News & Media and her own start-up branding consultancy called Brandalism, Wade has picked up a broad set of skills.

The common thread, she says, is being able to explain complicated things plainly, which is useful when your second company is building a new category of AI model that most of the market hasn’t realised it needs.

And what’s next? Well, Univec.ai will start a fundraising process in the next couple of months. The inbound interest is already there, says Wade, from European and US funds, generalists, infrastructure specialists and female-focused funds.

Wade is particularly interested in the infrastructure specialists, given the brand new start-up’s mission. “We want to contribute,” she says. “OpenAI and the others are building the foundational models. There are slices within infrastructure where we want to make our own contribution to AI.

“We want to build a new category, and be the leaders in it.” Given her and Mihai’s track record as founders, you would not bet against them. Job for life, indeed.

Don’t miss out on the knowledge you need to succeed. Sign up for the Daily Brief, Silicon Republic’s digest of need-to-know sci-tech news.

Tech

ICE Officers Break Cameras. Cops Steal Them. Welcome To New Jersey.

from the oh-cool-more-fascists dept

If federal officers are going to murder another person, it will likely happen here.

Newark, New Jersey is the newest battleground for the administration, as Trump goes to war with his own constituents. The foundation was laid months ago, when ICE officers assaulted, arrested, and illegally refused to grant access to detention facilities to congressional reps.

Now, there’s a war being fought at the Delaney Hall detention facility, overseen by ICE and run by private prison contractor GEO Group. The protests have been steadily getting more intense. The city’s mayor, Ras Baraka, has been on the Trump administration’s radar ever since officers arrested him for… um… standing on a public sidewalk as New Jersey congressional reps demanded access to the facility.

Things aren’t exactly being made better by Governor Mikie Sherrill. On one hand, she has passed laws that forbid local police cooperation with ICE’s anti-migrant efforts. On the other hand, she’s decided to expend state resources to protect federal resources from protesters.

The crisis remains a volatile, early test of Ms. Sherrill and her administration, with the potential for political fallout that could reverberate far beyond Newark. Ms. Sherrill, a moderate Democrat, has already faced criticism from the left, which has pointed to her decision to send in New Jersey State Police troopers to quell disturbances outside Delaney Hall as evidence of cooperation with the Trump administration’s divisive immigration crackdown. 

Seems like that might be a job that would be better handled by vastly better-funded federal agencies, like the Federal Protective Service which is overseen by the flush-with-cash DHS.

But given what’s happening outside of Delaney Hall, it might make more sense to expend state resources on protecting protesters, legal observers, and (especially!) journalists from federal officers, not to mention the locals who are supposed to be serving and protecting.

It’s nothing new to hear that federal officers are assaulting journalists or anyone else attempting to document their actions. But the specificity of these attacks makes it clear federal officers are deliberately seeking to do as much damage as possible to the tools journalists use to make a living.

According to a report by amNewYork, there have been allegations from multiple photojournalists who say they were injured while documenting clashes near the detention center, with some reporting damaged camera equipment and physical injuries, including broken fingers.

Reuters photojournalist Ryan Murphy tells amNewYork that he was struck with a baton over several nights of coverage and said agents targeted his camera during an incident on Thursday. Murphy said he believes the strike broke one of his fingers.

[…]

Photographer Madison Swart, a frequent contributor to The New York Times, also alleged that she was deliberately pushed to the ground while documenting the protests. Swart says an agent struck her with a baton during the confrontation. According to amNewYork, another photographer was reportedly seen curled in the fetal position as agents moved over her, while another prominent photographer, who requested anonymity, says the top of his camera was smashed.

Here’s another account that comes with photos of the damage done:

Mostafa Bassim, a photojournalist for Turkey’s Anadolu Agency, was struck with a baton by a federal officer, damaging his camera lens, while covering protests outside a private immigration detention center in Newark, New Jersey, on May 28, 2026.

[…]

Bassim told the U.S. Press Freedom Tracker that he arrived at the detention facility shortly before nightfall. He said that even before he was able to start documenting the scene, federal officers noticed his camera and began shining high-powered lights directly at him.

“The second they see you with a camera they just start doing that to you,” Bassim said.

Any officer who’s only interested in doing what’s necessary to maintain the peace wouldn’t deliberately target journalists, especially before the protests themselves start to get out of hand. And when it is actually time to step in to protect federal employees (or government contractors), force should be applied to those whose actions demand a forceful reaction. Deliberately targeting journalists and the tools of their trade is nothing more than being shitty just because you know no one will stop you.

And speaking of being shitty, this is still the high water mark for law enforcement response to the Delaney Hall protests:

[P]hotojournalist, Angelina Katsanis, 25, dropped her camera bag after she was injured at the protest on Saturday, she said in an interview. The bag contained roughly $10,000 worth of equipment, according to a statement from the state attorney general, Jennifer Davenport.

The bag was later tracked using an Apple AirTag to the home of Darryl Brown, 43, a sergeant with the Essex County Prosecutor’s Office, the statement said. Sergeant Brown, of Sparta Township, N.J., had been deployed to Delaney Hall during the protest, prosecutors said.

On top of the theft (which is a felony, given the value of items stolen), there’s the officer’s attempt to cover up the crime:

From a hospital bed, she watched on her phone as the AirTag in her camera bag traveled across northern New Jersey — on the highway, then to a private residence, and then to a bar close to that home, she said.

Ms. Katsanis said her boyfriend and the other photographer went out to track the AirTag and found that it had been removed from her bag and was on the side of the road. She said that her name and contact information were still clearly written on the AirTag.

Unfortunately, the officer is still employed, albeit not working at the moment… and better yet not being paid for not working. Suspended without pay. It’s a start. Somehow, the prosecutor’s office can’t help but shift into the exonerative tense when discussing this alleged crime, even as it moves forward with its prosecution:

The prosecutors also received footage from Sergeant Brown’s body-worn camera, which they said “shows him interacting with a dark-colored bag consistent with the description of the victim’s belongings.”

“Interacting” is a pretty coy term for “rifling through a bag’s contents before deciding to steal the bag and everything in it.” It’s like describing molestation as “interacting with a minor” or a carjacking as “interacting with a vehicle’s driver.” Tell it like it is: the officer was digging through someone’s bag and shortly thereafter took it back to his home where it was recovered during the execution of a search warrant.

Only one of these two things looks like a trend, that being the deliberate targeting of journalists and their expensive equipment. The camera theft is probably a one-off, but possibly only because federal officers are making sure journalists’ cameras are too broken to be worth stealing.

Filed Under: 1st amendment, darryl brown, delaney hall, dhs, ice, immigration, mass deportation, new jersey, protests, thugs, trump administration

Tech

Daily Deal: The Complete Photoshop Master Class Bundle

from the good-deals-on-cool-stuff dept

It’s no secret that Photoshop can be a bit dense when you’re first getting your feet wet with it. That’s why it pays to have an expert instructor show you the ropes. Led by a Photoshop pro, the Complete Photoshop Master Class Bundle will help you master Photoshop CC and become an expert; no prior experience is required! From layers and filters to levels and curves, you’ll come to grips with essential Photoshop concepts and refine your skills with the included working files. It’s on sale for $30.

Note: The Techdirt Deals Store is powered and curated by StackCommerce. A portion of all sales from Techdirt Deals helps support Techdirt. The products featured do not reflect endorsements by our editorial team.

Filed Under: daily deal

Tech

I hope these 4 Galaxy S26 Ultra software features make their way to the Galaxy A57 and more affordable Samsung phones soon

When I was doing all the testing for our Samsung Galaxy A57 review, I enjoyed how streamlined its software was compared to that of the best Samsung phones. But since publishing that review, I’ve been jumping back and forth between the A57 and another Samsung flagship, and I’ve got a more nuanced view.

Before the A57 (and, for a little while, after it), I was using the Samsung Galaxy S26 Ultra, which is pretty much the best Android phone money can buy. It has similar hardware specs to the Galaxy S25 Ultra, with its biggest advancements instead coming in the form of new software tools and features.

Tech

Nightmare Eclipse drops claimed BitLocker bypass for Microsoft Windows

Security

Another day, another Windows exploit code

Nightmare Eclipse, the prolific zero-day vulnerability hunter with an axe to grind against Microsoft, released yet another exploit late Wednesday that the researcher claims will spawn a command prompt that provides total access to the BitLocker volume.

This bug, called GreatXML, was “an accidental discovery,” according to the researcher, who said it only took four hours to find. They claim this exploit (published on GitHub and Git-based code-hosting platforms) can bypass BitLocker on any system that has ever run a Microsoft Defender Offline scan at any point in the past.

GreatXML comes just a day after Nightmare released exploit code for RoguePlanet, which allows local privilege escalation and leads to SYSTEM-level control over an affected machine. This brings the researcher’s zero-day count to eight. The earlier six – RedSun, UnDefend, BlueHammer, YellowKey, GreenPlasma, and MiniPlasma – all have patches as of this week’s Patch Tuesday event. 

Redmond on Wednesday told The Register that it is aware of RoguePlanet, and “actively investigating the validity and potential applicability of these claims.” The Windows giant didn’t immediately respond to our inquiries about GreatXML, including when it planned to issue a patch.

Microsoft has said none of the vulnerabilities were reported via its official channels prior to being made public. The company also banned Nightmare’s earlier GitHub account, and seemingly threatened legal action before dialing back its rhetoric after steep backlash from the security community.

Nightmare Eclipse, who some researchers suggest is an ex-Microsoft employee, harbors a very personal grudge against the Windows giant and its communications with bug hunters. They have promised to keep the zero-days coming, but waffle on the timing. 

Last month, the researcher pledged a big July 14 drop: “I will make sure your bones are shattered that day,” and then added, “nothing will be released this June (or maybe I will release smtg, depending on circumstances).”

On Tuesday, they changed course. “I will be unable to mass disclose zerodays in July 14th, RoguePlanet took way more time than expected and truly drained me. I might take a break but I can’t say for sure what I will be doing for next month, maybe it’s nothing, maybe it’s smtg.”

A day later, Nightmare released the “accidental” GreatXML BitLocker bypass. 

According to the researcher, the BitLocker bypass first requires copying “unattend.xml” and the “Recovery” directory to the root of the recovery partition. The next step is rebooting into WinRE by Shift-clicking Restart. “If everything was done correctly, a shell with unrestricted access to the bitlocker volume will spawn,” Nightmare wrote.

Also, if the scan hasn’t even been initiated on the Windows system, first you’d need to either log in and initiate it, or “figure out a way to boot into WinRE in offline scan state.”

Security sleuth Will Dormann followed Nightmare’s steps to reproduce GreatXML, and said the writeup seems “flawed.” In his testing, Dormann said the command prompt appeared the next time a Defender Offline scan ran.

“And in order to trigger a Microsoft Defender Offline scan, you both need to be logged in to Windows, and also have admin credentials,” he wrote on social media. “And if you’ve already got that level of access, you can just turn off bitlocker.”

“The writeup for GreatXML suggests that the prerequisite is that Windows Defender Offline has been executed at some point in the past,” Dormann added. “And that after planting two files in WinRE, all you need to do is [Shift]-reboot into WinRE, and Windows will automatically go into Microsoft Defender Offline scan mode. But this is not the case in any of the 3 lineages of Win11 that I have handy.” ®

Tech

Why Google’s New AI-Saturated Search Page Will Be A Disaster

from the the-end-of-ten-blue-links dept

Google didn’t invent full-text search of the Internet – that honor belongs to early pioneers such as WebCrawler, Lycos and AltaVista. But for the last 25 years or so, Google has been synonymous with online searching, providing the quickest and most effective way to find things online (although its results may be getting worse). More recently, it has been adding to its search engine more features based on generative AI, first with its AI Overviews in 2024, and then a year later with its AI Mode in Search. Now it has announced the latest stage in that evolution with what it calls “A new era for AI Search”:

It’s more intuitive than ever, dynamically expanding to give you space to describe exactly what you need. Designed to anticipate your intent, it also helps you formulate your question with AI-powered suggestions that go beyond autocomplete. And you can search across modalities, using text, images, files, videos or Chrome tabs as inputs.

This new incarnation effectively turns search into a chatbot:

You can easily ask a follow-up question right from an AI Overview, and flow into a conversational back and forth with AI Mode. Your context stays with you, and as you explore more deeply, the links and supporting articles get even more relevant. This seamless experience is live today across desktop and mobile, worldwide.

As the screenshot of the new interface above shows, the traditional search result links that are currently placed under the AI Overview have now been confined to a small panel on the right-hand side of the screen, which shows a cut-down version of today’s list. Users are encouraged to ask follow-up questions from the AI search chatbot, rather than exploring the links themselves.

What this is likely to mean in practice is that even fewer people will follow links to sites, something that was already happening last year; instead, they will engage with Google’s chatbot to gather information indirectly. This is terrible news for access to knowledge because it frames the Google AI search engine as the fount of all knowledge – one that will do all the hard work of finding information and combining it into an easily digested answer that can be interrogated further. It can do that because it has already ingested billions of Web pages and other information sources as part of the Large Language Model (LLM) training process. But search engine users will no longer know what some of those sources are unless they painstakingly click on the links in the new panel.

Most people will not bother, because the AI-generated results will be good enough – or at least will appear to be good enough. Unless visitors to the site take the trouble to follow the links to the sources, they won’t really know how reliable those results are. For example, it is possible that the sources are wrong, or misleading; moreover, Google’s LLM may itself introduce new errors and distortions. There is also the question of how Google will insert ads into this AI-generated information, and to what extent advertisers will be able to buy preferential treatment in results.

This new mediated approach is clearly terrible news for Wikipedia – an issue already discussed on Walled Culture earlier this year – and for creators. Google will use the information found in their works, but will not actively encourage people to visit the originals. For many people, summaries will be good enough, and they will never discover the greater riches of the sites and creations that Google’s LLM is based on. Worse still, the original creators such as Wikipedia may not even be mentioned in answers that involve aggregating information from a large number of sources.

Similarly, the new Google search is the publishing industry’s worst nightmare. Not only is Google drawing on material they have published, but it is pushing links to those sources into the background. It seems inevitable that the Web traffic to publishers will fall yet further, making already struggling business models based on advertising even more precarious. That will have knock-on consequences for the funding of many sites – particularly newspapers and magazines – and for the commissioning of work from journalists and other creative professionals. Users won’t even need to visit Google Search much in order to keep up-to-date with topics of interest thanks to Google Search’s new agentic capabilities that will do the work for them in advance:

With information agents, you can stay updated on whatever matters most to you. Your agent will intelligently look across everything on the web, like blogs, news sites and social posts, plus our freshest data, such as real-time info on finance, shopping and sports, to monitor for changes related to your specific question.

In this case, not only will people not visit sites, but the latter will be constantly bombarded by various AI bots seeking information on behalf of users – increasing site running costs, and making sites less usable by humans. Another key announcement from Google will lead to a further flood of agentic activities that will pose new challenges to businesses:

We’re also expanding agentic booking capabilities in Search to a wide range of new tasks, including local experiences and services. Just share your specific criteria — like finding a private karaoke room for six on a Friday night that serves food late — and Search brings together the latest pricing and availability with direct links to finish booking through the provider of your choice. And for select categories like home repair, beauty or pet care, you can ask Google to call businesses on your behalf.

What emerges from Google’s latest announcements is less of a search engine, and more of an immersive virtual environment that is designed to keep people engaging with Google’s services, asking them for information, advice and even delegating actions to them. There is no doubt that many users will find these new features attractive, not least because they can use “conversational voice features” in Gmail, Docs and elsewhere. These are the digital assistants that have been promised for many years, able to understand spoken commands, provide information verbally, and carry out complex operations on behalf of users without the need for any complex training. For many people, that will be a boon, and they will doubtless migrate from the traditional search page, which will still be the default – at least for now – to the latest AI-infused version.

But these impressive technical features come at a high price, even leaving aside issues such as the environmental impact of the huge server farms they require. With the latest incarnation of its search engine, Google is making the World Wide Web as we have known it for over 30 years invisible, and therefore increasingly irrelevant to most people, who will be happy to let Google become their universal user interface to everything. And yet Google still depends on the Internet to supply all the information it is analyzing and repackaging. It risks killing the very thing that sustains it.

There’s another, more subtle issue. The new Google search features make finding information and carrying out actions very easy in many ways. Leaving aside the problem that this will require people to trust what is in effect a huge black box, where the internal workings cannot be examined, with all the loss of control this implies, there is another danger. People who use Google’s powerful new AI search services to offload many of their day-to-day actions may gradually lose the ability to understand the world and to act within it without that constant help. Such a dependence may be great for Google and its advertisers, but it surely cannot be a good thing for the future of society.

Follow me @glynmoody on Mastodon and on Bluesky. Originally published to WalledCulture.

Filed Under: ai, links, open web, search

Companies: google

Tech

Your robot can’t be smart, fast, and free. Evolution solved that already.

Here is a constraint that almost no one building physical AI says out loud, even though every one of them is quietly fighting it.

A robot’s intelligence wants three things at once. It wants to be smart, meaning it can reason at the level of a frontier model about an unfamiliar scene. It wants to be fast, meaning it responds inside the tight, deterministic timing a physical control loop demands. And it wants to be free, meaning it keeps working when the network drops, the warehouse Wi-Fi dies, or the machine goes somewhere no signal reaches.

You cannot have all three on one piece of compute. Pick any two.

To be precise, bounded autonomy already works. Industrial arms, drones, and constrained autonomy stacks can be fast and offline because their tasks are narrow. The trilemma bites at the frontier: you cannot put frontier-scale general reasoning, deterministic real-time response, and full offline autonomy into the same power-limited substrate, not for the same control loop.

A frontier-scale model is smart, and if you stream its sensors to a datacenter it can even be fast, but now it is tethered to a network and no longer free. Shrink that model until it fits on a 15-watt embedded module and it becomes fast and free, but it is no longer frontier-smart. Run the big model in the cloud and query it only occasionally, and you get smart and free, but never fast. Three corners, two available at a time. I have come to think of this as the embodied trilemma, and it is the real reason the edge/cloud question is the hardest architecture decision in robotics. Most teams treat it as a deployment detail. It is closer to a law.

Why you can’t cheat the triangle

The trilemma is not a fashion or a temporary hardware limitation you can wait out. It falls directly out of physics and power budgets.

Frontier reasoning quality currently lives in models that want tens of gigabytes of memory and datacenter-class accelerators. That hardware does not run on a battery a mobile robot can carry. So “smart” forces a choice: either bring the datacenter to the robot through a network link, which sacrifices freedom, or accept a smaller onboard model, which sacrifices smartness.

Real-time control is even less negotiable. A wide-area network round trip adds 30 to 100 milliseconds of latency, and the variance matters more than the average. A control loop that is usually fast but occasionally stalls is worse than one that is reliably mediocre, because controllers are tuned for deterministic timing. The moment “fast” depends on a network, you have surrendered “free,” because the network is now inside your control loop whether you meant it to be or not.
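
To see why variance can matter more than the average, here is a minimal sketch that compares two made-up latency traces against a fixed deadline. The traces, the 50 ms deadline, and the helper name `deadline_misses` are all hypothetical and exist only to illustrate the point; they are not measurements from any real robot.

```python
from statistics import mean, pstdev


def deadline_misses(latencies_ms, deadline_ms):
    """Count the samples that arrive later than the control deadline."""
    return sum(1 for t in latencies_ms if t > deadline_ms)


# Hypothetical traces: one path is usually fast but occasionally stalls,
# the other is slower on average but perfectly steady.
bursty = [20.0] * 95 + [200.0] * 5
steady = [40.0] * 100
DEADLINE_MS = 50.0  # assumed deadline for this illustration

for name, trace in (("bursty", bursty), ("steady", steady)):
    print(
        f"{name}: mean={mean(trace):.1f} ms, "
        f"std={pstdev(trace):.1f} ms, "
        f"misses={deadline_misses(trace, DEADLINE_MS)}"
    )
```

Under these assumed numbers the bursty path has the better average, yet it is the only one that ever blows the deadline, which is the situation a controller tuned for deterministic timing cannot tolerate.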

So the triangle holds. Quantization, distillation, and better accelerators move the corners, but they do not collapse them. Anyone claiming otherwise is usually hiding which corner they gave up.

Putting numbers on the triangle

It helps to make the constraint quantitative, because the moment you write the timing down, the corners stop being abstract.

Start with latency. The end-to-end delay of a perception-to-action decision made in the cloud is a sum of terms:

$$L_{\text{cloud}} = t_{\text{capture}} + t_{\text{encode}} + t_{\text{uplink}} + t_{\text{inference}} + t_{\text{downlink}} + t_{\text{decode}}$$

Run the same decision onboard and most of that sum disappears:

$$L_{\text{edge}} = t_{\text{capture}} + t_{\text{inference,local}}$$

The difference between the two is not the inference time, which can actually be lower in the cloud on better hardware. The difference is the network, $t_{\text{uplink}} + t_{\text{downlink}}$, and more importantly its variance. A measured cloud-robotics setup over a fast wired link saw round trips of roughly 30 milliseconds [7], while real-world deployments commonly sit in the 100 to 300 millisecond range, and wireless links swing far higher. Edge processing, by contrast, pulls round trips down toward 1 to 5 milliseconds because nothing leaves the machine [8].
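The two sums can be written out as a short Python sketch. The stage times below are made-up illustrative values, not measurements from the cited studies, and the `StageTimes` name is a hypothetical helper. They are chosen so that the cloud model infers faster than the onboard one yet still loses end to end.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StageTimes:
    """Per-stage delays in milliseconds (illustrative values only)."""
    capture: float
    encode: float
    uplink: float
    inference: float
    downlink: float
    decode: float


def cloud_latency_ms(t: StageTimes) -> float:
    # L_cloud = capture + encode + uplink + inference + downlink + decode
    return t.capture + t.encode + t.uplink + t.inference + t.downlink + t.decode


def edge_latency_ms(capture: float, local_inference: float) -> float:
    # L_edge = capture + local inference; nothing leaves the machine
    return capture + local_inference


# Hypothetical numbers: 8 ms cloud inference versus 20 ms onboard inference.
cloud = StageTimes(capture=5, encode=4, uplink=15, inference=8, downlink=15, decode=3)
print(cloud_latency_ms(cloud))   # 50
print(edge_latency_ms(5, 20))    # 25
```

Even with inference more than twice as fast in the cloud, the network and codec terms dominate the total in this example.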

Now state the rule that decides where a loop can live. A control loop with timing budget $L_{\text{budget}}$ can run on a given compute path only if

$$L_{\text{path}} + k \cdot \sigma_{\text{jitter}} \le L_{\text{budget}}$$

where $\sigma_{\text{jitter}}$ is the standard deviation of the path’s latency and $k$ is the safety factor you need for determinism. That $k \cdot \sigma_{\text{jitter}}$ term is the quiet killer. Teleoperation studies are blunt about it: a link that holds a steady 100 milliseconds is workable, but one oscillating between 30 and 200 milliseconds produces jerky, unpredictable motion, because the controller cannot plan around delay it cannot predict [9]. The reflex loop’s budget is 1 to 10 milliseconds. No wide-area path satisfies the inequality. The math, not the architect, forbids it.

| Control loop | Timing budget | Onboard path (~1-5 ms) | Wide-area path (~30-300 ms) |
| --- | --- | --- | --- |
| Reflex (motor control, e-stop) | 1-10 ms | Feasible | Impossible |
| Perception (detection, tracking, SLAM) | 30-100 ms | Feasible | Marginal, fails on jitter |
| Deliberation (planning, language) | 1-10 s | Feasible | Feasible (async) |

The table is the argument in one view. Reflex never clears a network round trip. Perception clears it only on unusually good links. Deliberation has budget to spare, which is why it can live in the cloud asynchronously.
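The feasibility rule is easy to check mechanically. The sketch below applies it to the upper end of each loop's budget. The mean latencies, jitter values, and the safety factor k = 3 are assumptions picked for illustration, not figures from the references.

```python
def loop_fits(path_ms: float, jitter_ms: float, budget_ms: float, k: float = 3.0) -> bool:
    """True if L_path + k * sigma_jitter <= L_budget."""
    return path_ms + k * jitter_ms <= budget_ms


# (mean latency, jitter standard deviation) in ms; assumed values.
PATHS = {"onboard": (3.0, 0.5), "wide_area": (60.0, 20.0)}

# Upper end of each timing budget, in ms.
BUDGETS_MS = {"reflex": 10.0, "perception": 100.0, "deliberation": 10_000.0}

for loop, budget in BUDGETS_MS.items():
    verdicts = {
        name: loop_fits(mean, sigma, budget)
        for name, (mean, sigma) in PATHS.items()
    }
    print(loop, verdicts)
```

With these numbers the wide-area path would fit the perception budget on mean latency alone (60 ms against 100 ms) and fails only once the jitter term is added.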

Bandwidth closes the case for perception. A single 1080p camera at 30 frames per second produces raw video at 1920 × 1080 × 3 bytes × 30, which is about 187 megabytes per second, or roughly 1.5 gigabits per second. A modest four-camera plus depth rig clears 6 gigabits per second of raw sensor data. You can compress it, but compression costs latency and the link still has to carry it reliably, everywhere the robot goes. Edge perception is the way out: compress to a semantic representation on the spot and never ship the raw stream.
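The raw-video arithmetic is worth running once. This small sketch assumes uncompressed 8-bit RGB (3 bytes per pixel) and decimal gigabits. The depth stream is left out because its format is not specified, so it would add to the four-camera figure.

```python
def raw_video_gbps(width: int, height: int, bytes_per_pixel: int, fps: int) -> float:
    """Uncompressed video bandwidth in gigabits per second (1 Gb = 1e9 bits)."""
    bits_per_second = width * height * bytes_per_pixel * fps * 8
    return bits_per_second / 1e9


one_camera = raw_video_gbps(1920, 1080, 3, 30)
print(round(one_camera, 2))       # 1.49
print(round(4 * one_camera, 2))   # 5.97, before any depth stream is added
```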

Finally, the economics, which is just the trilemma with a dollar sign. Onboard compute is a one-time capital cost. Cloud reasoning is an operating cost that accrues with every query:

$$C_{\text{cloud}}(t) = r \cdot c_{\text{token}} \cdot t$$

where $r$ is the query rate and $c_{\text{token}}$ the per-token price, against a flat $C_{\text{edge}} = C_{\text{capex}}$. The two lines cross at $t^* = C_{\text{capex}} / (r \cdot c_{\text{token}})$. Push thirty frames a second to a cloud model and $t^*$ arrives almost immediately, so cloud cost dominates the lifetime of the fleet. Route only a few deliberation-class queries per minute upstream and $t^*$ recedes over the horizon.

| Strategy | What goes upstream | Cost shape | Break-even t* |
| --- | --- | --- | --- |
| Stream everything | ~30 frames/sec to a cloud model | Steep linear opex | Almost immediate |
| Route deliberation only | A few queries/min | Shallow linear opex | Past fleet service life |
| Fully onboard | Nothing | One-time capex, flat | Never crossed |

Same hardware, same models, opposite economics, decided entirely by which loop you placed in which corner. The gap is not subtle: a single camera streamed to a cloud vision model at 30 frames per second is about 2.6 million inference calls a day per robot running around the clock, while routing only a few deliberation-class queries per minute upstream is a few thousand. Across a fleet, that is the difference between cloud inference being a rounding error and being the largest line on the operating budget.
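The break-even point and the daily call counts can be sketched in a few lines. The capex figure and the per-call price are hypothetical placeholders, and the per-token price is folded into a single effective cost per query, which is a simplifying assumption. Only the ratio between the two strategies is independent of those choices.

```python
SECONDS_PER_DAY = 86_400


def calls_per_day(queries_per_second: float) -> float:
    return queries_per_second * SECONDS_PER_DAY


def break_even_days(capex: float, queries_per_second: float, cost_per_query: float) -> float:
    """t* = C_capex / (r * c), expressed in days of continuous operation."""
    return capex / (calls_per_day(queries_per_second) * cost_per_query)


CAPEX = 2_000.0          # hypothetical onboard module price, USD
COST_PER_QUERY = 0.001   # hypothetical effective cloud price per call, USD

stream = break_even_days(CAPEX, 30.0, COST_PER_QUERY)        # 30 frames per second
deliberate = break_even_days(CAPEX, 3 / 60, COST_PER_QUERY)  # 3 queries per minute

print(round(calls_per_day(30.0)))      # 2592000
print(round(calls_per_day(3 / 60)))    # 4320
print(round(deliberate / stream))      # 600
```

Whatever prices are plugged in, routing three queries a minute instead of thirty frames a second pushes the break-even point out by a factor of 600.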

The escape nobody designed, because biology did it first

Here is the part I find beautiful, and the heart of what I want to argue: the way out of the embodied trilemma is not to solve it. It is to refuse to answer it at a single point.

Your own body is built this way, and it has been for roughly half a billion years.

When you touch a hot stove, your hand pulls back before your brain knows anything happened. That is the spinal reflex arc, a loop that runs through the spinal cord and never waits for the cortex. It is fast and free (it works even if you are barely conscious), and it is emphatically not smart. It does not reason about the stove. It does not need to.

Your retina does something just as telling. It has over a hundred million photoreceptors, but the optic nerve carrying signal to the brain has only about a million fibers [10]. The eye does roughly a hundredfold compression on the spot, locally, before transmitting anything. It does not ship raw pixels up the cable. It ships a processed, compact representation. Fast and free at the edge, by necessity.

And then there is the cortex, which is where the actual reasoning happens. It is slow, it is powerful, and crucially, the body has arranged things so that when the cortex is slow or offline, the reflexes still fire and you still pull your hand back. Evolution put the survival-critical functions where they never depend on the smart, slow part.

That is the whole trick. Biology never built a single neuron that was smart, fast, and free all at once. It built a hierarchy in which different loops each sit at a different corner of the triangle, and it made sure the corner each loop sacrifices is one that loop can afford to lose. Reflexes give up intelligence, which is fine, because a reflex that stops to think is a reflex that gets you killed. The cortex gives up speed, which is fine, because it has been kept off the survival path entirely.

A robot escapes the embodied trilemma the same way, or it does not escape at all.

Mapping the triangle onto a machine

Translate the nervous system into engineering and a practical architecture emerges. A robot has three loops, and each one belongs at a different corner.

The reflex loop (1 to 10 ms): motor control, stabilization, emergency stops. This is the spinal cord. It must be fast and free and is allowed to be dumb. It lives onboard, always, and never touches a network.

The perception loop (30 to 100 ms): detection, tracking, obstacle avoidance, visual odometry, SLAM. This is the retina. It must keep working when the link drops, and the bandwidth math forbids shipping raw sensor data anyway, since even a single camera produces well over a gigabit per second of raw video before compression. So perception compresses at the edge, exactly as the eye does, and emits a compact semantic representation rather than pixels. Fast and free, intelligence traded away on purpose.

The deliberation loop (1 to 10 seconds): task planning, language understanding, deciding what to do when the plan breaks. This is the cortex. It is allowed to be slow, and slowness is exactly the corner it trades away, reaching a frontier model in the cloud asynchronously rather than in the control path. It stays free in the only sense that matters, never holding the robot hostage to a live link. If connectivity vanishes, the robot gets less clever, not less safe.
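One way to picture the separation is a sketch in which the reflex loop never awaits anything network-bound, while deliberation runs beside it with a timeout. Everything here is hypothetical (the `Robot` class, the `dead_link` stand-in for a cloud call, the tick counts), and a real reflex loop would run on a real-time controller rather than on `asyncio`.

```python
import asyncio


class Robot:
    def __init__(self) -> None:
        self.plan = "hold position"   # safe default used until the cloud answers
        self.reflex_ticks = 0

    async def reflex_loop(self, ticks: int, period_s: float = 0.001) -> None:
        # Onboard only: nothing in this loop waits on a network call.
        for _ in range(ticks):
            self.reflex_ticks += 1    # stand-in for motor control and e-stop checks
            await asyncio.sleep(period_s)

    async def deliberation_loop(self, ask_cloud, timeout_s: float) -> None:
        # Off the control path: a slow or dead link only delays the plan update.
        try:
            self.plan = await asyncio.wait_for(ask_cloud(), timeout=timeout_s)
        except (asyncio.TimeoutError, ConnectionError):
            pass                      # keep the last known-good plan


async def dead_link() -> str:
    await asyncio.sleep(10)           # hypothetical cloud call that never returns in time
    return "new plan"


async def main() -> Robot:
    robot = Robot()
    await asyncio.gather(
        robot.reflex_loop(ticks=20),
        robot.deliberation_loop(dead_link, timeout_s=0.01),
    )
    return robot


robot = asyncio.run(main())
print(robot.reflex_ticks, robot.plan)   # 20 hold position
```

The link fails, the plan stays at its safe default, and the reflex loop still completes every tick.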

The interface between these layers is the optic nerve of the system: a deliberately narrow channel carrying detections, tracks, and state summaries, never raw signal. Get that channel right and you have not just an inference boundary. You have defined your logging schema, your training-data pipeline, and your behavior when the link drops, all at once.
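A minimal sketch of such a channel, with hypothetical field names, is a small typed message that serializes to a few hundred bytes at most. The schema and the JSON encoding are illustrative assumptions, not a format taken from any of the cited systems.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Detection:
    label: str
    confidence: float
    bbox: tuple[int, int, int, int]   # x, y, width, height in pixels


@dataclass
class SceneSummary:
    timestamp_ms: int
    robot_state: str
    detections: list[Detection] = field(default_factory=list)

    def to_wire(self) -> bytes:
        # Compact JSON; the same bytes can feed logging and training pipelines.
        return json.dumps(asdict(self), separators=(",", ":")).encode("utf-8")


summary = SceneSummary(
    timestamp_ms=1_000,
    robot_state="navigating",
    detections=[Detection("pallet", 0.93, (410, 220, 180, 140))],
)
payload = summary.to_wire()
raw_frame_bytes = 1920 * 1080 * 3      # one uncompressed 1080p RGB frame

print(len(payload) < 1_000, raw_frame_bytes)   # True 6220800
```

Because the message has an explicit schema, it can double as the log record and the training example, which is the sense in which the channel acts as a contract.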

The industry is rediscovering the nervous system

What convinces me this is structural, not stylistic, is that the most advanced robotics programs keep reinventing the same hierarchy without necessarily naming it.

Figure AI’s Helix, the system running its humanoid robots through full eight-hour factory shifts, is explicitly two systems: a roughly 7-billion-parameter vision-language model at 7 to 9 Hz for scene understanding and language, coupled to a compact 80-million-parameter visuomotor policy that turns intent into continuous action at 200 Hz [1]. That is cortex and reflex on one robot, roughly a 25-to-1 ratio in update rate between the loop that acts and the loop that thinks, each running at the timescale its job demands. Surveys of edge-cloud collaboration now describe the same division as standard practice, with small onboard models handling real-time perception and privacy-sensitive preprocessing while heavier reasoning is offloaded upstream [4].
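The rate split can be illustrated with a toy scheduler. This is not Figure's implementation. It only shows a fast loop acting every tick on whatever intent the slow loop last produced, with 8 Hz assumed as the midpoint of the quoted 7 to 9 Hz range.

```python
FAST_HZ = 200   # visuomotor policy rate quoted in the text
SLOW_HZ = 8     # assumed midpoint of the 7 to 9 Hz range


def run_one_second() -> tuple[int, int]:
    """Simulate one second; return (slow updates, fast actions)."""
    ticks_per_slow_update = FAST_HZ // SLOW_HZ   # 25 fast ticks per slow update
    intent = 0
    slow_updates = 0
    actions = []
    for tick in range(FAST_HZ):
        if tick % ticks_per_slow_update == 0:
            slow_updates += 1
            intent = slow_updates      # stand-in for a fresh model output
        actions.append(intent)         # fast policy acts on the latest intent
    return slow_updates, len(actions)


print(run_one_second())   # (8, 200)
```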

Comparisons on real robot data quantify the trade directly: deploying an 11-billion-parameter vision-language model at the network edge held accuracy close to its cloud baseline while shaving only modest latency, whereas a compact 2-billion-parameter model more than halved latency into sub-second territory, paying for the speed with accuracy [5]. Reviews of foundation-model robotics keep flagging the same wall: LLM planners take seconds per decision, fine for the cortex, hopeless for the spinal cord [6]. NVIDIA’s own Jetson deployment guidance reflects it too, with optimized onboard inference for perception and policy and larger models living upstream [2].

Different teams, different machines, the same triangle, the same corners. When that many independent efforts converge, you are looking at structure, not style.

Lessons from the ultimate airgap

The starkest place to watch the trilemma bite is underwater robotics. An ROV below the surface has effectively no real-time link to the cloud. The ocean is the ultimate airgap, the freedom corner taken to its absolute extreme. In hands-on underwater robotics builds, perception (detection and tracking, optimized with TensorRT) runs entirely on an onboard module, while language-level mission interaction and fleet reasoning reach a frontier model in the cloud only asynchronously, on surfaced or relayed data, and never inside a control loop. The architecture is not a preference there. The water enforces it.

Three principles follow, and they generalize far beyond the sea.

Design for the disconnected case first. If the robot is safe and useful with zero connectivity, the cloud becomes pure upside: better reasoning, fleet learning, human oversight. If the robot needs the cloud to stay safe, you have built a cortex with no spinal cord, a liability on wheels.

Treat the narrow channel as a contract, not a cable. The compressed representation crossing the edge/cloud boundary is the single most important interface in the system. Teams that treat it as an afterthought re-architect twice.

Remember the trilemma is also an economics statement. Onboard compute is paid once, at purchase. Cloud reasoning is paid forever, per token. Routing only deliberation-class queries upstream, a few per minute instead of thirty frames per second, changes fleet unit economics by orders of magnitude. Cloud-inference cost can quietly become the largest operating line on a robotics program that put the wrong loop in the wrong corner.

The corners will move. The triangle won’t.

Onboard modules get more capable every generation, and distillation keeps narrowing the gap between edge models and their cloud teachers. Early-exit inference, where confident predictions resolve locally and only hard cases escalate, is maturing fast [3][5]. The deliberation loop will migrate partly onboard over the next few years, especially for safety-relevant replanning. The corners of the triangle will keep sliding.
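The escalation pattern described here, where confident predictions resolve locally and only hard cases go upstream, can be sketched as a confidence-gated cascade. The two models below are hypothetical stubs and the 0.8 threshold is an arbitrary assumption. Early-exit schemes that stop at intermediate layers inside a single network are more involved than this.

```python
from typing import Callable, Tuple

Prediction = Tuple[str, float]          # (label, confidence in [0, 1])
Model = Callable[[str], Prediction]


def cascade(x: str, local: Model, remote: Model, threshold: float = 0.8) -> Tuple[str, str]:
    """Answer locally when confident; escalate only the hard cases."""
    label, confidence = local(x)
    if confidence >= threshold:
        return label, "local"
    remote_label, _ = remote(x)
    return remote_label, "escalated"


# Hypothetical stand-ins for an onboard model and a cloud model.
def tiny_model(x: str) -> Prediction:
    return ("pallet", 0.95) if x == "clear view" else ("unknown", 0.40)


def big_model(x: str) -> Prediction:
    return ("forklift", 0.99)


print(cascade("clear view", tiny_model, big_model))   # ('pallet', 'local')
print(cascade("occluded", tiny_model, big_model))     # ('forklift', 'escalated')
```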

But the triangle itself does not go away, because it is anchored in physics and energy, not in any model generation. Smart, fast, and free will never coexist on a single substrate as long as frontier intelligence costs more power than a robot can carry and the speed of light caps how fast a remote answer can return. The teams that internalize this, and that consciously assign each loop the corner it can afford to lose, will ship robots that work when the network does not. The rest will keep learning, in the field and at the worst possible moment, that they accidentally wired their spinal cord through a datacenter.

Evolution settled this argument before there were spines. We are just catching up.

References

1. Figure AI. “Helix: A Vision-Language-Action Model for Generalist Humanoid Control.” figure.ai/news/helix. 2025.

2. NVIDIA Developer Blog. “Getting Started with Edge AI on NVIDIA Jetson: LLMs, VLMs, and Foundation Models for Robotics.” developer.nvidia.com. 2025.

3. Qu, G., Chen, Q., Wei, W., Lin, Z., Chen, X., and Huang, K. “Mobile Edge Intelligence for Large Language Models: A Contemporary Survey.” IEEE Communications Surveys and Tutorials, 2025 (arXiv:2407.18921).

4. Li, S., Wang, H., Xu, W., Zhang, R., Guo, S., Yuan, J., Zhong, X., Zhang, T., and Li, R. “Collaborative Inference and Learning between Edge SLMs and Cloud LLMs: A Survey of Algorithms, Execution, and Open Challenges.” arXiv:2507.16731, 2025.

5. Ahmad, S., Hafeez, M., and Zaidi, S.A.R. “Vision-Language Models on the Edge for Real-Time Robotic Perception.” University of Leeds, arXiv:2601.14921, 2026.

6. Khan, M.T., and Waheed, A. “Foundation Model Driven Robotics: A Comprehensive Review.” arXiv:2507.10087, 2025.

7. Kapoor, A., et al. “A Predictive Application Offloading Algorithm Using Small Datasets for Cloud Robotics.” arXiv:2108.12616, 2021.

8. Coutinho, R.W.L., and Boukerche, A. “Design of Edge Computing for 5G-Enabled Tactile Internet-Based Industrial Applications.” IEEE Communications Magazine, 60(1), 2022.

9. Urbaniak, D., et al. “5G for Robotics: Ultra-Low Latency Control of Distributed Robotic Systems.” IEEE.

10. Kandel, E.R., Schwartz, J.H., and Jessell, T.M. “Principles of Neural Science.” McGraw-Hill.

Context compression finally works in production: new research cuts LLM input 16x without the accuracy hit

Context windows are becoming a computational bottleneck. The longer an agent runs, the more tokens accumulate from retrieved documents, reasoning traces and conversation history, and the more memory and compute that growing context demands. Most existing solutions either degrade model accuracy, require the full context to load before compression begins, or produce memory savings that don’t translate into real speedups in standard serving infrastructure.

A research team from NYU, Columbia, Princeton, the University of Maryland, Harvard and Lawrence Livermore National Laboratory published a paper this week that proposes a novel fix. The researchers introduce the concept of Latent Context Language Models, or LCLMs, a family of encoder-decoder compression models that compress input context before it reaches the decoder. The models are open-sourced on HuggingFace.

Unlike KV cache compression methods — the dominant approach in the field, which still materialize the full KV cache before evicting entries — LCLMs compress the input token sequence before decoder prefill, so higher compression ratios directly reduce decoder-side compute and memory. The paper reports LCLMs at 16x compression produced output 8.8 times faster than KV cache baselines on the RULER long-context benchmark.

“These ballooning contexts take up memory and compute, and they are becoming a computational bottleneck for LLMs,” Micah Goldblum, co-lead advisor on the project and a researcher at Columbia University, told VentureBeat. “Our goal was to train language models end-to-end that can handle very long contexts efficiently and accurately. If you can make such a language model, everything becomes cheaper and faster.”

What LCLMs can do

LCLMs let models process much longer contexts than would otherwise be practical, at a fraction of the memory and compute cost, without the accuracy degradation that makes most compression methods a poor tradeoff in production.

At 4x compression, the paper reports accuracy of 91.76% on the RULER benchmark, compared to 94.41% with no compression at all. That is less than a 3 point drop for cutting context to a quarter of its original size. At 16x compression, where 93.75% of input tokens are removed, accuracy fell to 75.06%. Every KV cache method tested at the same compression ratio scored lower.
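The percentages follow directly from the compression ratio, as a quick check shows. The `compressed_length` helper is a rough approximation that treats the ratio as uniform over the whole input. How the encoder actually handles block boundaries is not described here.

```python
import math


def fraction_removed(ratio: float) -> float:
    """Share of input positions eliminated at a given compression ratio."""
    return 1.0 - 1.0 / ratio


def compressed_length(tokens: int, ratio: int) -> int:
    # Rough estimate; assumes the ratio applies uniformly to the whole input.
    return math.ceil(tokens / ratio)


print(fraction_removed(4))                # 0.75
print(fraction_removed(16))               # 0.9375
print(compressed_length(1_000_000, 16))   # 62500
print(round(94.41 - 91.76, 2))            # 2.65 points lost at 4x
```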

The gains hold on shorter inputs too. On GSM8K math word problems, where the full prompt is compressed rather than just retrieved documents, LCLMs outscored every other method tested regardless of compression ratio.

Figure: Latent Context Language Models achieve high-quality compression while being fast and memory efficient. Credit: End-to-End Context Compression at Scale research paper, https://arxiv.org/pdf/2606.09659

How it was built

The architecture pairs a 0.6B encoder with a 4B decoder. The encoder compresses blocks of input tokens into shorter sequences of latent embeddings. The decoder processes those in place of the original tokens. Training ran across more than 350 billion tokens.

The training recipe mixes three data types:

  • Continual pre-training data with compressed and uncompressed spans interleaved throughout

  • Supervised fine-tuning data covering reasoning and long-context tasks

  • An auxiliary reconstruction task that pushes the encoder to retain fine-grained detail

The combination addresses a tradeoff that limited earlier compression work, where preserving reconstruction accuracy came at the cost of general task performance.

An architecture search identified the optimal configuration. The paper found that scaling the decoder matters more than scaling the encoder.

Where it fits in an agentic stack

An LCLM is not an abstract research concept. It is designed to work with an existing stack. “You can simply swap out LCLMs for any existing LLM,” Goldblum said. “Whenever you retrieve data such as documents and want to dump it into your model’s context, simply run those documents through the LCLM’s compressor first.”

He noted that in the research paper, the researchers demonstrated how to build agents that selectively decompress useful text. 

“Think about this like a human skimming content before zooming in on relevant details,” Goldblum said.

Goldblum also cautioned that teams integrating the approach into existing agentic pipelines will need to tune their RAG systems accordingly.

“We also haven’t worked on online compression of reasoning traces,” he said. “The naive approach of just occasionally compressing the trace while generating it might work, but that remains to be determined.”

What this means for enterprises

Context windows are growing faster than inference infrastructure can keep up, and enterprises are already spending to fix it. VB Pulse Q1 2026 survey data from organizations with 100 or more employees shows hybrid retrieval adoption intent more than tripling, from 10.3% in January to 33.3% in March. Retrieval optimization overtook evaluation as the top investment priority by March, reaching 28.9% of qualified respondents.

Three things stand out for teams evaluating production fit:

  1. Inference cost scales with context length. At 1 million tokens, uncompressed inference with standard KV cache methods runs out of memory on a single H200 GPU. The paper reports LCLMs at 16x compression remain within memory bounds at that context length.

  2. RAG pipeline integration requires tuning. Teams with existing RAG pipelines will need to validate compression behavior against their retrieval quality metrics before deploying at scale.

  3. Reasoning trace compression is unsolved. For agents running long reasoning chains, context growth from the trace is a separate problem from document retrieval. Goldblum acknowledged the gap directly: the naive approach of periodic trace compression might work but has not been tested.

The models are available at huggingface.co/latent-context and the code at github.com/LeonLixyz/LCLM.

“The biggest things our architectures do is give your model access to much larger contexts, but they also unlock multiscale approaches where your model can skim vast amounts of text or code super fast and then only zooms in and fully reads a small portion of the most useful text,” Goldblum said.

Meta’s Edits app is getting an AI assistant and a desktop version

Meta on Wednesday previewed upcoming additions to its video-editing app Edits at an invite-only creator event in L.A., showing off features like a new AI assistant and a desktop version of the previously mobile-only app.

The company also announced that other new tools will launch in the app today, such as a “Beta” tab for experiments and expanded audience insights.

Edits first arrived last year as a direct competitor to ByteDance’s CapCut. With the addition of the new and upcoming tools, Meta is looking to both retain existing users and attract new ones.

The upcoming AI assistant will help creators analyze their insights and brainstorm ideas for their content. The assistant will use their Instagram data, like their views and video-retention insights, to help them see what’s working and why. It will suggest video ideas based on performance and suggest making content with trending audio.

By integrating an AI assistant directly into Edits, Meta is aiming to keep creators engaged on Instagram as it continues to compete with TikTok and YouTube for creators’ attention. Additionally, by offering creators content ideas, Meta is encouraging more frequent posting, which could, in turn, boost user engagement. Direct access to an AI assistant also gets rid of the need for creators to turn to outside tools like ChatGPT when brainstorming content ideas and understanding performance.

Meta launched a similar AI assistant tool for creators on Facebook last week. It’s worth noting that YouTube and TikTok also offer tools to creators to help them brainstorm ideas. For instance, YouTube Studio features an Inspiration tab that uses AI to help creators generate video ideas, while TikTok offers creators an AI assistant that can brainstorm ideas and uncover trends.

The desktop version of Edits will give creators more precise control over the editing process as well as the ability to work on a larger screen, which can be helpful during more advanced editing workflows. The company says creators will be able to sync their workflows seamlessly between mobile and desktop devices.

The upcoming desktop version will also allow Edits to better compete with CapCut, which already offers a desktop version.

Image Credits: Instagram

Among the new features launching today is a Beta tab, which will provide creators with early access to experimental features that are still in development and allow them to provide Meta with feedback. The rollout of the Beta tab indicates that Meta wants to better compete with CapCut and accelerate feature development based on what creators actually want and will use.

Creators will also now be able to see more detailed metrics like their audience demographic breakdown and the time of day their audience is the most engaged. The new metrics join the app’s existing analytics, which include data such as how long viewers watch a video, how many followers were gained from a specific video, where users stop watching a certain video, and more.

Additionally, creators can search specific topics within the app’s Inspiration feed to discover reels and templates other creators are making around a given trend or idea. They’ll also be able to create multiple versions of a single piece of content to test what performs best before publishing.

Although Instagram didn’t share specific numbers about how many users Edits has, the company says that content made with the app sees a 10% higher save rate and 2% higher reshare rate compared to content not made on Edits, and that more than half of people watching reels on Instagram are seeing Edits-created content every day. 

Edits is free to download on iOS and Android.

The AI assistant announced today is currently in testing with attendees of the creator event, while the desktop version of Edits is “coming soon,” Meta says. The rest of the features are launching to everyone today.

AI is bankrupting your cloud budget. Here’s what savvy bizs are doing instead.

[This is a sponsored article with Synology.]

The cloud was supposed to eliminate infrastructure headaches. Instead, some businesses are discovering a new one: invoices they no longer understand.

Storage fees, data retrieval charges, and backup costs are quietly pushing cloud spending higher than many organisations anticipated—and artificial intelligence (AI) is about to make it significantly worse. 

It’s a challenge Taiwanese storage company Synology has been watching closely. 

At Computex 2026 in Taipei last week, the company argued that the economics of cloud-first infrastructure are beginning to shift.

The cloud bill that keeps growing

Businesses that moved enthusiastically to cloud-first infrastructure in the early 2020s are now sitting with bills that look nothing like the ones they signed up for. As costs continue to climb, some are reassessing whether keeping more data on-premise could make better financial sense.

The main catalyst is artificial intelligence. 

AI doesn’t just store data—it constantly accesses, moves, and processes it. Every inference call, every model training run, every search query is pulling data in and out of storage.

Gartner forecasts that more than 80% of enterprises will have deployed AI-enabled applications by 2026. And in a cloud environment, every one of those activities comes with a price tag.

Image Credit: Summit Art Creations via Shutterstock

From the start, cloud storage has been largely marketed on a simple figure: storage cost per gigabyte. Amazon Web Services S3 Standard, for example, runs at roughly US$23 per terabyte per month, with Google Cloud and Microsoft Azure in a similar range. 

For many businesses, the math looked straightforward. But what’s less visible is everything layered on top of that base rate.

Every time data is retrieved or moved—something AI workloads do constantly—cloud providers charge additional fees. 

On AWS, data egress starts from around US$0.09 per gigabyte transferred out. At that rate, even restoring a 10TB dataset can quietly add about US$900 to the bill, before factoring in anything else.

Add API requests, cross-region replication, and backup-related charges, and the total cost can reach up to four times the advertised storage rate.
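A back-of-the-envelope check makes the gap between the headline rate and the real bill concrete. The sketch uses the two rates quoted above as flat prices and assumes decimal terabytes. Actual provider pricing is tiered and changes over time, so treat the outputs as rough.

```python
STORAGE_USD_PER_TB_MONTH = 23.0   # headline storage rate quoted above
EGRESS_USD_PER_GB = 0.09          # starting egress rate quoted above
GB_PER_TB = 1_000                 # decimal terabytes, an assumption


def monthly_storage_usd(terabytes: float) -> float:
    return terabytes * STORAGE_USD_PER_TB_MONTH


def egress_usd(terabytes: float) -> float:
    # Flat-rate approximation; ignores free allowances and volume tiers.
    return terabytes * GB_PER_TB * EGRESS_USD_PER_GB


print(monthly_storage_usd(10))              # 230.0
print(round(egress_usd(10)))                # 900
print(round(4 * monthly_storage_usd(10)))   # 920, the "up to four times" case
```

On these assumptions, a single full restore of a 10TB dataset costs roughly four months of its advertised storage fee.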

A Backblaze survey of more than 400 IT leaders found that 95% encountered unexpected cloud storage costs. And according to the Wasabi Global Cloud Storage Index, which surveyed 1,600 IT decision-makers globally, including 525 across APAC, 63% of organisations in the region exceeded their cloud storage budget in 2024. 

Businesses are bringing data back in-house

Amid rising costs, cloud repatriation—bringing data and workloads back from public cloud providers onto private or on-premise infrastructure—has moved from a niche IT discussion to a mainstream business decision. 

Synology’s on-premise storage solutions, PAS7700 and FlashStation Series./ Image Credit: Synology

A 2025 Barclays CIO survey found that 86% of enterprise CIOs planned to shift at least some workloads back to private or on-premise systems, the highest level recorded. The Flexera 2025 State of the Cloud report similarly shows that 21% of workloads have already been repatriated, even as overall cloud spending continues to grow.

However, most organisations aren’t abandoning the cloud wholesale.

Instead, they are taking a hybrid approach by keeping cloud platforms for global accessibility and collaboration, while bringing back workloads that involve heavy storage, protection, and processing. Backup and production storage are among the most commonly repatriated, as these are the areas where costs scale most quickly.

For Singapore businesses, there’s another factor at play: regulation. Under MAS Technology Risk Management guidelines and PDPA obligations, companies are expected to know where their data is stored and how it is being accessed.

That becomes harder in large public cloud setups, where data can be spread across multiple regions and servers. On-premise systems, by contrast, make it easier to keep track of exactly where information sits and who has access to it, since everything is managed within a company’s own infrastructure.

What Synology is bringing to the table

Image Credit: Synology

Synology is one of the companies building for this shift. 

The firm is best known for its Network Attached Storage (NAS) hardware—physical devices that store data locally while still functioning as a private cloud. 

At Computex 2026, it outlined how it is expanding its NAS ecosystem beyond storage into AI-enabled data management and backup infrastructure.

At the centre of this push is Synology’s next-generation DiskStation Manager (DSM), the operating system that powers every Synology NAS device.

The Taiwanese firm has spent more than two decades building NAS hardware and software. Today, it has shipped over 14 million systems worldwide, managing more than 400 exabytes of data. 

AI that stays in-house

Synology Product Marketing Manager Katherine Chiang unveils DSM Agent 2.0 at Computex 2026./ Image Credit: Synology

At Computex 2026, the company announced the roadmap for the next generation of DiskStation Manager, DSM Agent 2.0. The aim is to expand DSM from a storage operating system into an intelligent data platform for governed, on-premises AI workflows, one that can support AI tools running on a company’s own infrastructure.

Instead of sending data to external cloud services, businesses can use their own data, such as files, system logs, and usage data, to power AI tools internally, while keeping everything under their control.

“The next generation of DSM leverages over two decades of expertise to create an AI-ready platform that keeps organisations firmly in control of their data,” said Philip Wong, Chairman and CEO of Synology.

AI features already available include a conversational assistant for troubleshooting and system management. More advanced AI agents are also in development, designed to handle tasks such as email drafting, formula searches, meeting transcription, and real-time translation, although no release date has been announced yet.

As these capabilities expand, privacy becomes even more important in the age of AI. The system already includes a feature that masks sensitive data such as names, ID numbers, email addresses, and financial information locally before anything is sent to external AI providers like OpenAI or Azure AI. 

Future updates will go further, with support for fully on-premise large language models, where no data needs to leave the organisation’s infrastructure.

Synology’s infrastructure is already at work in Singapore

The value of on-premise data infrastructure is already clear for Singapore businesses using Synology. 

Image Credit: I Love Taimei/ Lasalle College of the Arts

Food chain I Love Taimei, which has 17 outlets in Singapore, uses Synology’s DSM system to manage surveillance footage across all locations. This cuts management time by 65% and also allows the company to run AI-powered customer analysis without sending footage to the cloud.

LASALLE College of the Arts also uses Synology NAS for file storage and 4K video collaboration, allowing students and staff to access large project files easily across Mac computers without compatibility issues or rising costs.

Together, these examples show why some organisations are rethinking the assumption that everything belongs in the cloud.

Cutting backup costs without the cloud

Synology Product Manager Cody Hall unveils ActiveProtect Manager 2.0 at Computex 2026. / Image Credit: Synology

The same push toward more controlled, on-premise infrastructure also extends to backup. At Computex, Synology introduced ActiveProtect Manager 2.0, a centralised backup system that will launch in Q3 2026.

The key issue it addresses is cost. Most backup services charge per server, virtual machine, or device. ActiveProtect instead charges for the hardware, with no extra per-workload fees.

In some cases, customers have seen a lower total cost of ownership. For example, Taiwanese media company Info Times reduced setup costs by 65% and cut storage needs by 75%. Toyota also reduced its backup data by 75% through better storage efficiency.

ActiveProtect 2.0 works with existing systems, so companies don’t need to replace their current setup. It also uses machine learning to detect unusual backup activity and help keep ransomware-infected data from being restored.

And because everything is stored locally, recovery is faster, taking hours instead of days, and there are no cloud data transfer fees.

The bigger picture

Cloud still has an important role to play, whether for global access, extra computing capacity, or supporting teams across different regions.

What’s changing is that businesses are becoming more selective about what they keep in the cloud. Rather than moving everything to a single platform, many are deciding where data should live based on cost, performance, and compliance requirements.

For Singapore businesses that have quietly accepted rising cloud bills as part of the cost of doing business, it may be time to take a closer look at the numbers.

Featured Image Credit: Synology


Tech

Anthropic launches powerful Fable 5 model publicly, while keeping Mythos restricted over cybersecurity concerns

In context: Anthropic’s latest release is really a story about control, not just capability. The company is offering two versions of the same underlying model: Claude Mythos 5 for a small circle of trusted partners, and Claude Fable 5 for everyone else. The split reflects a core challenge Anthropic is still trying to solve – how to deploy an extremely capable system into the wild without simultaneously handing attackers a new class of offensive tools.

Mythos has already shown what it can do when it is not heavily restricted. Since April, when an earlier preview was sent to about 150 organizations under the banner of Project Glasswing, users have reported more than 10,000 critical security flaws in their own systems. Those same capabilities could also be used by attackers looking to break in, rather than to patch security holes.

For that reason, Mythos 5 is staying behind the glass for now. Anthropic is keeping it in the hands of a “small group of cyberdefenders and infrastructure providers,” along with select biology researchers, and is coordinating with US government agencies as part of the rollout. Access is effectively on a need-to-know basis, with the company signaling that a broader “trusted access program” will come later.

Fable 5 is where Anthropic is testing what a general-purpose release of Mythos-class technology looks like under constraint. Technically, it runs on the same underlying model as Mythos 5, but with hard limits built in. The system is designed to refuse or redirect a long list of requests related to cybersecurity, biology, and chemistry. When those guardrails trigger, the query is silently routed to an older model, Claude Opus 4.8, instead.

Anthropic has also wired Fable 5 to watch for distillation, where a user tries to harvest large volumes of answers to train a smaller model of their own. If the system thinks that is happening, those requests are also redirected to Opus 4.8. In other words, the company is not only trying to control what the model will talk about, but also what others can learn from it.
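
The routing behavior described above can be pictured with a small Python sketch. This is purely illustrative and not Anthropic's system: the keyword list stands in for trained classifiers, the model labels are placeholders rather than real API identifiers, and the request-volume threshold is an invented assumption.

```python
from dataclasses import dataclass

PRIMARY = "fable-5"    # placeholder label, not a real API model ID
FALLBACK = "opus-4.8"  # placeholder label for the older model

# Toy stand-in for a trained classifier: keyword flags for restricted domains.
RESTRICTED_TERMS = {"exploit", "malware", "pathogen", "nerve agent"}
DISTILLATION_THRESHOLD = 1000  # hypothetical requests-per-hour cutoff


@dataclass
class Request:
    user_id: str
    prompt: str
    requests_last_hour: int


def choose_model(req: Request) -> str:
    """Return the label of the model that should serve this request."""
    text = req.prompt.lower()
    if any(term in text for term in RESTRICTED_TERMS):
        return FALLBACK  # guardrail triggered: restricted topic
    if req.requests_last_hour > DISTILLATION_THRESHOLD:
        return FALLBACK  # volume pattern that might indicate distillation
    return PRIMARY


print(choose_model(Request("u1", "Refactor this parser", 3)))
print(choose_model(Request("u2", "How do antivirus tools detect malware?", 3)))
```

The second request is harmless but still lands on the fallback model, which shows how a cautious filter can over-block.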

Anthropic has been wrestling with these decisions for months. Diane Penn, the company’s head of product management, told Wired that testing and feedback since the April preview have helped shape the current strategy, even though it is still far from perfect.

“We’re trying to make improvements in a way that’s beneficial, even if we don’t have the perfect [solution] for every use case to start,” she says. “Out of all the different approaches, this emerged as the most viable and the best one. We just ended up feeling like this was the best product choice for users to get the maximum value out of Fable 5.”

For now, the filters are tuned to err on the side of over-blocking. Penn has acknowledged that some harmless queries will be routed to the older model. Anthropic says it wants to refine its classifiers over time but argues that this level of caution is the only way to justify a wider release at this stage.

The stakes are higher because Fable and Mythos are not just chatbots that respond to prompts and stop. Anthropic says both can run “unattended” for longer stretches than previous Claude models, carrying out sequences of instructions without constant supervision.

That shift toward more agent-like behavior could substantially boost software engineering and other technical work, especially given Fable 5’s stronger code generation and visual capabilities. But it also raises obvious questions about what happens if those capabilities are misused.

Anthropic’s pricing reflects how powerful it believes these systems are compared with its other models. Fable 5 and Mythos 5 cost $10 per million input tokens and $50 per million output tokens, roughly double the company’s other public models but still cheaper than the earlier Mythos Preview. The higher price reflects both the performance gains and the sense that these models are still positioned as specialized systems, not yet just another SKU in a growing catalog.
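
As a rough worked example of what those rates mean per request, the short Python sketch below applies the two quoted prices to a token count. The `request_cost` function and the sample token counts are assumptions for illustration; actual billing may include details the article does not cover.

```python
from decimal import Decimal

INPUT_PER_MILLION = Decimal("10")   # USD per million input tokens, as quoted
OUTPUT_PER_MILLION = Decimal("50")  # USD per million output tokens, as quoted


def request_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Cost in USD for one request at the quoted per-million-token rates."""
    million = Decimal(1_000_000)
    total = input_tokens * INPUT_PER_MILLION + output_tokens * OUTPUT_PER_MILLION
    return total / million


# Hypothetical request: 200,000 input tokens and 40,000 output tokens.
print(request_cost(200_000, 40_000))  # prints: 4
```

In this example the output tokens account for half of the $4 total despite being a fifth of the input volume, since output is priced at five times the input rate.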

Around Anthropic, competitors are moving in a similar direction. OpenAI has rolled out its own advanced cybersecurity model to a small circle of partners and convened a working group that echoes Project Glasswing. Both companies are preparing for potential IPOs and are under pressure to show investors they can ship cutting-edge technology without triggering backlash over safety concerns.

Even some of the people watching from the outside say the unease is justified. Canadian finance minister François-Philippe Champagne told the BBC that public concern around Mythos stems from uncertainty: “It’s the unknown, unknown.”

Anthropic co-founder Jack Clark has made a similar point from the inside, arguing that the industry has not yet figured out how to slow itself down. “You want the option to be able to take your foot off the gas and put your foot on the brake,” he said. “Right now, it’s like the AI industry has a gas pedal, but it doesn’t have a brake pedal.”


Copyright © 2025