Microsoft is introducing a new capability that will allow it to remotely roll back problematic Windows drivers delivered through Windows Update.
Called Cloud-Initiated Driver Recovery, the new feature will remove the need for hardware partners or end users to manually fix driver issues once drivers have been distributed to devices. The recovery process is entirely managed by Microsoft, with no partner-side actions required, and will only be initiated for Windows drivers rejected due to quality issues during shiproom evaluation.
Under the current system, if a driver distributed through Windows Update has quality issues, the hardware partner must submit a replacement, or users must manually uninstall the faulty driver, which can leave devices using subpar drivers for a long time.
With Cloud-Initiated Driver Recovery, Microsoft can directly trigger a rollback to a previous, stable driver version (or the next best version available on Windows Update) without requiring new software or actions from hardware partners.
“Today, when a driver published through Windows Update is identified after distribution to have quality issues, the remediation path relies on the hardware partner to submit an updated driver — or on end users to manually uninstall the problematic driver themselves. This creates a gap where devices may remain on a low-quality driver for an extended period,” Microsoft said.
“With Cloud-Initiated Driver Recovery, Microsoft can now trigger a recovery action directly from the Hardware Dev Center (HDC) Driver Shiproom, rolling back a problematic driver to the previously known-good version via the Windows Update pipeline. This is handled through coordinated updates to the PnP driver stack and the driver flighting and publishing services.”
The company also noted that:
Devices where a Driver Shiproom-approved driver cannot be located will not attempt Cloud-Initiated Driver Recovery.
Recovery is delivered through the existing Windows Update infrastructure — no new client agent or partner tooling is required.
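The selection behaviour Microsoft describes can be sketched roughly as follows. This is an illustrative model only, not Microsoft's implementation: the `DriverCandidate` type, the `pick_recovery_target` function, and the reading of "next best version" as the highest-versioned approved driver are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

Version = Tuple[int, ...]


@dataclass(frozen=True)
class DriverCandidate:
    """Hypothetical record for a driver version offered on Windows Update."""
    version: Version
    shiproom_approved: bool             # passed Driver Shiproom evaluation
    previously_installed: bool = False  # ran on this device before the bad driver


def pick_recovery_target(
    rejected: Version, candidates: Sequence[DriverCandidate]
) -> Optional[DriverCandidate]:
    """Return the driver to roll back to, or None if recovery is skipped."""
    approved = [
        c for c in candidates
        if c.shiproom_approved and c.version != rejected
    ]
    if not approved:
        # No Shiproom-approved driver can be located: no recovery attempt.
        return None
    # Prefer a previously known-good version; otherwise fall back to the
    # "next best" approved version (assumed here to mean the highest one).
    known_good = [c for c in approved if c.previously_installed]
    return max(known_good or approved, key=lambda c: c.version)


catalog = [
    DriverCandidate((1, 0), shiproom_approved=True, previously_installed=True),
    DriverCandidate((1, 1), shiproom_approved=True),
    DriverCandidate((2, 0), shiproom_approved=False),  # the rejected driver
]
target = pick_recovery_target((2, 0), catalog)
print(target.version if target else "no recovery")  # (1, 0)
```

The sketch returns `None` when no approved driver exists, mirroring the note that such devices will not attempt recovery at all.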
The new Windows Update feature is being tested between May and August and will begin rolling back drivers rejected during Flighting or Gradual Rollout starting September 2026.
Last week, at WinHEC 2026 (the Windows Hardware Engineering Conference) in Taipei, Microsoft unveiled a Driver Quality Initiative (DQI) to raise driver quality, reliability, and security across the Windows ecosystem, in coordination with OEM, silicon, and hardware partners.
“In the months ahead, we will keep investing in the fundamentals that matter most to customers: reliability, security, performance, compatibility and quality,” Microsoft said. “We’ll also keep collaborating with OEMs, silicon partners, IHVs, ODMs and the broader hardware ecosystem through the Windows Resiliency Initiative, the new Driver Quality Initiative and the work we do together every day.”
In June 2025, Microsoft also announced plans to periodically remove legacy drivers from the Windows Update catalog to mitigate compatibility issues and security risks.
Google is suing to dismantle the infrastructure behind an alleged massive AI-powered cybercrime operation.
On Friday, the tech giant announced a lawsuit against an alleged Chinese cybercrime network called Outsider Enterprise, which Google says uses AI in its campaigns to send scam text messages impersonating Google and other brands to steal passwords and credit card numbers.
Outsider Enterprise has financially scammed “hundreds of thousands of victims” with losses “estimated in the millions.” According to Google, the group deployed 9,000 fake websites and one million fraudulent web domains, and sent 2.5 million texts to Android users in a two-week period.
The company said, “55,000 spam texts were flagged by Android users in just two weeks this past May — that’s more than two text spam complaints a minute.”
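Google's per-minute figure is easy to check. The short calculation below assumes "two weeks" means 14 full days and that the reports were spread evenly over that time; the variable names are illustrative.

```python
flagged_texts = 55_000                 # spam texts flagged by Android users
minutes_in_two_weeks = 14 * 24 * 60    # 14 days x 24 hours x 60 minutes

per_minute = flagged_texts / minutes_in_two_weeks

print(minutes_in_two_weeks)            # 20160
print(round(per_minute, 2))            # 2.73
```

At roughly 2.7 reports a minute, the "more than two a minute" claim holds under these assumptions.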
Google said it uses “AI-powered tools to fight AI-powered scams,” which enable the company to detect scams and alert users of suspicious calls and text messages, leading to the interception of more than 10 billion scam messages a month.
The company said it has been collaborating with AT&T, T-Mobile, and Verizon to block the scam text messages, and said it is coordinating with the FBI.
An FBI spokesperson told TechCrunch that the bureau, in coordination with Google and Lumen’s Black Lotus Labs, seized several domains used by the cybercriminals, as well as Shopify storefronts and accounts used to test the operation’s phishing service.
The spokesperson said that since July 2023, Outsider Enterprise’s phishing platform enabled cybercriminals to steal “at least an estimated 3,870,000 stolen credit cards and a corresponding estimated $1.9B in losses.”
Inside Outsider Enterprise
In its complaint filed as part of the lawsuit, Google laid out the evidence it gathered against people involved in the Outsider Enterprise operations, whom the company said are foreign-based cybercriminals whose real identities are unknown. This group “built, maintains, and uses a turn-key, online software suite that enables criminals, regardless of technical skill, to publish fraudulent websites designed to rob victims and enrich themselves,” according to the complaint.
Google said this “phishing-for-dummies” software called Outsider, which costs $88 per week or $200 per month, allows operators to create fake websites with the help of AI platforms, including Google’s own Gemini. The fake sites impersonate several services and companies, such as telecom providers, financial institutions, government agencies, and retailers.
To lure people to the fake websites, the cybercriminals collaborate with one another to send victims malicious text messages, or purchase ads. The common goal is to steal passwords and corresponding multi-factor codes as well as financial information, which the scammers can do by receiving the data that victims input into the fake websites, with the information being transmitted through Outsider’s platform in real time.
“Part of the Outsider software’s appeal is the ease with which someone with limited technical expertise — like many members of the Enterprise— can purchase the software, execute various phishing attacks, and, upon purchase, meet other members of the Enterprise who are proficient in other areas,” Google wrote, referring to Telegram channels where the cybercriminals can collaborate, train each other, discuss strategies, and develop phishing attacks. “The Enterprise brazenly coordinates its efforts in open and largely uncoded discussions on Telegram.”
According to Google, the Outsider platform allegedly offers cybercriminals “more than 290 pre-built templates that mimic the legitimate websites” that generate replicas of real websites “in minutes,” along with guides on how to “weaponize AI-generated code,” as well as a dashboard to track progress of phishing campaigns. The cybercriminals have allegedly used Google Drive and Google Cloud infrastructure to host the phishing websites.
“The Outsider software has been used to create over a million phishing websites to swindle innocent victims out of millions of dollars,” Google wrote in the complaint.
To give an idea of the scale of Outsider Enterprise’s operation, Google said that over a five-month period, from November 14, 2025 to April 14, 2026, the company detected more than 1.59 million URLs connected to it.
Google said the Outsider Enterprise operation is made up of several groups of cybercriminals: those who develop and maintain the phishing software and website templates; those who supply lists of targets curated from public records, social media, and data breaches; a “spammer group” that provides tools and the infrastructure to send scam texts in bulk, which includes smartphone banks, SIM cards, and modems; and those who monetize the stolen credentials and launder the stolen money.
A screenshot showing a Telegram message where a cybercriminal advertised stolen digital credit cards on several cellphones. Image Credits: Court document
The cybercriminals have stolen “at least 36,000 payment cards issued by financial institutions in 95 countries,” according to Google.
The company accused the people behind Outsider Enterprise of impersonating Google and its brands, infringing its copyright, racketeering, wire fraud, and false advertising. With the lawsuit, Google is seeking compensatory and punitive damages, and an order to stop the criminals from carrying out their activities.
This story was originally published at 10:26 a.m. PDT and has since been updated with new information from Google’s complaint, and the FBI’s comment.
The 2026 FIFA World Cup spans three countries, drawing millions of fans across the borders of the US, Canada, and Mexico. As travelers hop between cities like New York, Vancouver, and Mexico City, many rely on the best VPN services for security and content access.
Yet, a key concern looms at the checkpoint: Is using a Virtual Private Network (VPN) safe during border crossings and while navigating these countries? While VPNs remain completely legal in all three host nations, federal law doesn’t guarantee a smooth experience.
From border inspections to state regulations, there’s constant room for unexpected hurdles. So, understanding how privacy tools intersect with physical borders can help you enjoy a trouble-free tournament.
Can border patrol search your phone for a VPN?
Border officials in the US, Canada, and Mexico can search electronic devices and inspect your phone’s contents, including installed apps. However, possessing a commercial VPN isn’t illegal, nor can you be denied entry solely for having it downloaded.
Yet a visible VPN icon may prompt further questioning. In the US, refusing to unlock a device can result in its seizure for weeks, even months. While US citizens can’t be denied entry for this refusal, non-citizens face greater risk of being turned away.
Secure your device with a strong passcode, but know that protection has limits at borders. If the VPN app causes anxiety, delete it before crossing and redownload it once cleared. Alternatively, providers like Proton VPN offer hidden icons to conceal the app from your home screen.
The impact of age verification on VPN use
VPNs are recognized as key privacy tools across the US, Canada, and Mexico. That legitimacy means federal governments won’t prosecute personal users simply for having one installed. However, new state-level restrictions are coming into play.
Take Utah’s Online Age Verification Amendments. This law doesn’t ban VPNs outright, but requires adult websites to enforce age checks on anyone physically located in Utah, holding sites legally responsible if a user bypasses the check via a VPN.
Because they face fines for non-compliance, websites are now forced to aggressively detect and block known VPN traffic to protect themselves. While you won’t be arrested for using a VPN, you may find your connection blocked by these filters.
It’s important to distinguish between breaking the law and violating Terms of Service. Downloading or sharing copyrighted content is illegal regardless of a VPN. Conversely, connecting to Fox Sports or TSN from overseas via a VPN isn’t a crime – but it may be a breach of contract.
How to keep your VPN running smoothly
If you hit ISP blocks or streaming bans when traveling, obfuscation is the solution. Standard VPN connections leave tell-tale signs that firewalls and platforms can spot. To bypass this, use features like NordVPN’s Obfuscated Servers or Norton VPN’s Mimic protocol.
These tools scramble data to look like regular HTTPS traffic, preventing ISPs from throttling your connection and making it harder for services like CTV, Sling TV, or YouTube TV to block your IP. By enabling these settings, you can expect a smoother experience throughout the tournament.
The bottom line for World Cup travelers
You’re not breaking the law by having a VPN, but how you handle it depends on your comfort level. There’s no obligation to keep your VPN visible during border inspections – some travelers prefer leaving it off or deleted at checkpoints to avoid scrutiny, then reinstalling afterward. Others keep it installed for convenience and rely on hidden icon features if available.
Once inside the host countries, use obfuscation to bypass blocks. By choosing the approach that balances your security needs with peace of mind, you’ll be ready for the 2026 World Cup!
Google took a cautious approach with the Pixel 10a, choosing not to push the boundaries too far with new chipsets or extra lenses, instead focusing on the nitty-gritty details that make a difference in everyday life, and keeping the starting price for the 128GB model at a still-reasonable $449 (was $499).
Google has fixed certain design flaws from the previous generation, such as the phone snagging on your pocket. The new Pixel has a much better rear surface that will not slide off a table and slips easily into a pocket. It’s also rather compact, standing 6.1 inches tall and weighing only 183 grams. The build quality is respectable, with IP68 dust and water resistance as well as strengthened glass up front to withstand a few bumps and scrapes.
The 6.3-inch pOLED display sees a significant brightness boost, and its adaptive refresh rate ranges from 60 to 120Hz, allowing for seamless scrolling and movement throughout the interface. Google’s own Tensor G4 chip and 8GB of RAM offer performance comparable to last year’s flagship.
The camera remains the standout feature, with a 48-megapixel primary sensor and a 13-megapixel ultra-wide combo producing shots that are as clear as day in any lighting. We can’t forget about the software either, which includes a host of cool extras such as auto facial-expression suggestions in group shots, on-screen framing tips, and even the ability to magically remove unwanted sections from the frame with a single touch. Video quality is definitely no slouch, with silky-smooth footage that can handle 4K with ease.
Battery life has improved significantly over the previous generation, with a 5,100mAh battery that will keep you going all day without breaking a sweat. Real-world tests showed a solid 12-15 hours on a single charge, depending on usage levels. The 30-watt rapid charger will provide a nice little boost before you go out the door, and while wireless charging only manages 10 watts, it’s still present and works perfectly for a short top-up on the move.
After finishing up the Amazon Alexa+ review, I suddenly had an email that Google Gemini for Home was ready.
I fished my Google Nest Hub (2nd gen) out of the drawer where it’s been languishing, upgraded and then set about performing some simple tasks.
Could Gemini convince me to move away from Alexa?
Not a chance. In fact, it’s terrible in a lot of ways. Here are just a few things that it gets completely wrong.
It can’t change alarms
As I was writing this column at 4:30pm, I thought I’d try something simple. “Hey Google, set alarm for five fifty,” I said. The screen changed and the alarm had been set for 17:50. I was aiming for a morning alarm, but didn’t clarify that, so my bad. So, I asked Gemini to change the alarm to 5:50am.
The screen updated and showed the correct time. Job done, I thought. But, as this was a test, I didn’t really want the alarm. So, I asked Gemini to cancel it, and it came back telling me I’d got two alarms: one for 5:50am and one for 5:50pm.
The context of the conversation was clearly lost there. Amazon Alexa+ can deal with these requests, in this order, and get the correct outcome.
It can’t deal with PDFs and has an odd response
I can send PDFs to Alexa+ and have it strip out detail and meaning from them, such as calendar invites or to-do action points. I asked Gemini if it could do the same, and it said no, but it said that I could ‘paste in the text’.
As I was talking to a smart display, I asked how I could paste the information, and the response said, “You can simply paste the text directly into our chat here, just as you would when sending a message to a friend.”
Only, I can’t, because there’s no option and the Nest Hub can’t open PDFs for me to even copy any text.
It gets caught out by simple questions
I next went for a simple question, one that AI can struggle with: how many of the letter S are there in the word across?
I could see my prompt appear properly on screen, but Google Gemini told me, “There is only one ‘e’ in the word across.”
I tried again but asked how many letter ‘S’s there are, and this time Gemini told me there were three of the letter ‘S’.
Alexa+ gets this question right.
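For reference, the correct answer is easy to confirm with a deterministic check. The snippet below is just a sanity check using Python's built-in `str.count`; the `count_letter` helper name is illustrative.

```python
def count_letter(word: str, letter: str) -> int:
    """Count case-insensitive occurrences of a single letter in a word."""
    return word.lower().count(letter.lower())


print(count_letter("across", "s"))  # 2
print(count_letter("across", "e"))  # 0
```

So there are two of the letter S in "across", and no E at all.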
There’s often nothing useful on-screen
Ask Alexa+ something on a smart display, and the screen will often show snippets of information or links to recipes. Do the same thing on a Nest Hub and you just get a page of black text on a white background. It makes it look like a work in progress.
Even worse, in some cases, I’ve had the screen go blank. When asking for a recipe recommendation, Google Gemini just spoke a stir-fry recipe at me with nothing on screen.
Ask about the weather and it’s just text on screen, with none of the niceness of Alexa+’s weather icons. Well, I say that, but if I ask, “OK Google, when’s a good time to have a BBQ?”, the answer page shows weather icons for the next good day, along with the voice response.
It doesn’t always get complicated commands
I’ve got a light around my desk, called Desk Strip. “Hey Google, turn on desk strip and then turn it off in five minutes,” I said. That seemed to work: the light came on and a timer appeared on the screen. However, when the timer ran out, the Nest Hub sounded an alarm.
To be fair, Alexa+ can struggle with these commands, and often creates a routine that does what you want, but named after the full command that you’ve asked for. However, with Alexa+, you can say “create a one-time routine” first and then it does what you want (with a one-minute delay).
Both systems can at least follow a single command, such as, “Turn off desk strip in five minutes.”
Some things are improving
When I first used Gemini for Home at the start of May, I started by asking it to remember that my wife’s a vegetarian. This information was logged, but when I then asked for a chicken recipe for my wife and me, Gemini suggested roast chicken.
Doing the same thing on Alexa+ came back with vegetarian options, along with a reminder that my wife is a vegetarian.
However, since then, Gemini for Home has been updated, and it now remembers the information and will give vegetarian options when asked (or at least recipes where meat could be added to one portion later).
It’s not as good as the competition
Amazon Alexa+ feels a long way ahead of Gemini for Home. And, Amazon’s app and hardware are also better. It feels as though Google has a long way to go if it wants people to switch back to its platform. For now, I’m sticking with Alexa+ for voice.
In context: Chris Seedor didn’t set out to build a company focused on bitcoin security. In fact, when he first got his hands on the cryptocurrency, he saw little reason to use it. Now, he’s trying to solve one of its biggest challenges: how to securely store bitcoin for people who choose to hold it themselves.
His journey began in 2011, when a friend handed him what would later turn out to be a small fortune in digital assets. At the time, Seedor was a mechanical engineering student at a university in Germany and saw little use for bitcoin.
“He gave me tons and tons of free Bitcoin,” Seedor says. “I didn’t see any use for it because I live in Germany and PayPal is a thing and I didn’t have a drug habit or something.”
He eventually spent nearly 1,500 of those coins on a graphics card – an ordinary purchase at the time that would look very different once bitcoin’s price took off. “I famously own the most expensive graphics card in the world,” Seedor told The Block during an interview at BTC Prague. “I bought a graphics card for a little less than 1,500 bitcoin in 2011.”
Fifteen years later, Seedor is focused on a different challenge: securing bitcoin for people who choose to hold it themselves.
Self-custody comes with a trade-off that is built into cryptocurrency. When people hold their own assets, there’s no middleman – but they’re also solely responsible for keeping those assets safe.
For Seedor, addressing that challenge first meant building a physical product rather than a financial one. He developed a stainless steel seed phrase backup – a durable storage device for the recovery phrase that controls access to a crypto wallet – designed to withstand disasters such as fire.
The product, known as the Seedor Wallet, reflects a practical approach shaped by his engineering background. It is intentionally simple. As Seedor puts it, it is “the most primitive form to store the most advanced sound money.”
That same line of thinking eventually expanded into Bitsurance, a company focused on insuring bitcoin held in hardware wallets. The premise is straightforward: software can only go so far, and many of the biggest risks facing crypto holders are physical.
Seedor points to scenarios that extend beyond lost passwords or hacked exchanges. “I always had this fear of the $5 wrench attack,” he said. “What if somebody comes to my house, kicks my door and threatens me or my family? What do I do in that scenario?”
Those concerns are not hypothetical. He referenced cases in France where crypto holders have been targeted in violent incidents, including a reported kidnapping attempt involving the wife of Sébastien Borget, co-founder of the Ethereum-based virtual world The Sandbox.
Bitsurance is designed to address those kinds of risks, along with more conventional threats such as fire and flooding. The policies cover bitcoin stored on hardware wallets and are underwritten by Liberty Specialty Markets, part of the Liberty Mutual Group. If a claim is approved, the payout is made in fiat currency rather than bitcoin.
The company currently offers coverage of up to €500,000. While that limit may cover only a portion of some larger holdings, it illustrates how traditional insurance is beginning to move into a market that has long operated without it.
The approach stands out because it brings together two very different worlds. Bitcoin was built to eliminate reliance on centralized institutions, yet services such as insurance inevitably reintroduce elements of that structure. In practice, that means translating decentralized risks into something insurers can model and price.
Seedor’s journey – from casually spending bitcoin to building tools and services to protect it – mirrors a broader shift in the crypto landscape. Early users could afford to treat bitcoin as an experiment. That is no longer the case.
PLUS: Japan’s space truck is back in business; Zoho’s DIY servers; Record tech exports for Korea, and more!
Google Cloud customers with resources in India have had to deal with elevated latency for several days – and there’s no end in sight.
Per a Google status page, on June 9th “A fire at a third-party data center facility required an emergency power shutdown of networking equipment, isolating a non-compute local Point of Presence (POP) in Delhi and reducing available network capacity in the metro area.”
That shutdown caused “intermittent periods of elevated latency and possible packet loss” for network traffic headed to Google Cloud from Delhi, Chennai, Mumbai and surrounding areas. “Customers may experience slightly elevated latency and non-optimal network routing into Google Cloud until the affected facility is fully restored,” Google warned.
Google has implemented “traffic mitigations” that it says have improved performance “for some Cloud customers,” and is trying to arrange extra peering capacity.
That work is ongoing, with the ads-and-cloud giant promising it is “further augmenting our Delhi backbone capacity” and hoping to have better news on Monday. The web giant is also working to improve regional peering capacity in the city of Chennai to assist large ISPs in India, and hopes that work will be complete on Wednesday, June 17th.
Japan’s space truck is back in business
Japan’s Aerospace Exploration Agency (JAXA) last week successfully launched its H3 rocket, a welcome return to form after its previous two missions failed.
This success will be doubly sweet for JAXA, because the H3 used for this mission employed a pair of outboard boosters – the first time the agency has used the launcher in this configuration.
The rocket launched on June 12th and placed six satellites in orbit.
South Korean tech exports boom, not just because of AI
South Korea’s Ministry of Science and ICT on Sunday announced exports of IT products reached $47.8 billion in May, a new record and a sum 128 percent higher than tech exports in May 2025.
Semiconductor exports surged by 162.9 percent year over year, due to the AI boom. Mobile phone exports also grew by 15.9 percent, while a category the Ministry calls “computers and peripherals” saw 259.6 percent year-on-year growth.
“Displays rebounded due to increased demand for OLEDs for new mobile phones and strong sales of new laptops,” the Ministry said. “Overall exports of mobile phones increased due to a rise in the average selling price of high-spec finished products and robust demand for high-value components such as camera modules.”
South Korea imported over $15.7 billion worth of tech in the month, up 36 percent year-over-year, but still achieved a record trade surplus of over $32 billion.
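The headline numbers hang together, as a quick back-of-the-envelope check shows. The figures below are the rounded ones quoted above, so the results are approximate; the implied May 2025 export total is derived here rather than reported by the Ministry.

```python
exports_bn = 47.8        # May IT exports, US$ billion
imports_bn = 15.7        # May IT imports, US$ billion ("over" this figure)
export_growth = 1.28     # exports were 128 percent higher year over year

surplus_bn = exports_bn - imports_bn
implied_prior_exports_bn = exports_bn / (1 + export_growth)

print(f"{surplus_bn:.1f}")                # 32.1
print(f"{implied_prior_exports_bn:.1f}")  # 21.0
```

A surplus of about $32.1 billion matches the "over $32 billion" figure, and a 128 percent rise implies May 2025 exports of roughly $21 billion.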
Zoho builds its own servers
Indian SaaS giant Zoho has cooked up a custom server called “Nathu La” that it says will reduce the cost of operating its platform.
“The design philosophy behind Nathu La is rooted in the Open Compute Project (OCP), emphasizing modularity, thermal efficiency, and ease of maintenance, and enabling Zoho’s data centers to significantly reduce total cost of ownership and power consumption,” according to a company statement.
The machines run Intel Xeon 6 processors and Chipzilla helped to design them, but Zoho says “all intellectual property [is] owned in India.”
Zoho says the servers will also help to lower inferencing costs.
The company didn’t say how it calculated its performance numbers. The Reg fancies Zoho has compared its own boxes to whatever machines it currently buys off the shelf, and believes that servers tuned to its own needs will deliver better performance.
That’s a conclusion many hyperscalers reached years ago.
NTT Data’s new boss
Japanese tech giant NTT Data has a new president and CEO: Kazuhiko Nakayama scored the twin roles last week, capping a career with the company that started in 1989 and most recently saw him serve as chief financial officer.
Previous CEO and president Yutaka Sasaki will become senior executive vice president.
“Over the past three years I have had the honour of working closely with Mr Sasaki and the leadership team on a strategic course that has established NTT DATA among the top five IT services businesses globally,” Nakayama said, according to NTT Data’s announcement of its new leadership. “That experience has reinforced my conviction in the strength of our offering, the quality of our people and the size of the opportunity ahead. As I take on the responsibilities of CEO and lead the growth of the NTT DATA Group going forward, I feel a deep sense of dedication, possibility and excitement.” ®
Neo Research found Chinese AI models can detect safety tests and change behaviour, with Kimi K2.6 scoring 60% on evaluation awareness.
Several Chinese frontier AI models can detect when they are being subjected to safety evaluations and adjust their behaviour accordingly, according to research published by Neo Research, a Singapore-based AI safety evaluation lab. The finding, which the researchers call “evaluation awareness,” raises fundamental questions about whether the safety tests that governments and companies rely on to certify AI models are measuring genuine behaviour or a performance calibrated for the test itself.
Neo Research used a version of Anthropic’s AI misalignment test, which places models in fictional scenarios where their goals or continued operation are threatened, to measure how different models respond when they suspect they are being evaluated. The results varied sharply: Moonshot AI’s Kimi K2.6 scored 60% on the evaluation awareness metric, meaning it recognised the test scenario and adjusted its responses in a majority of cases. Zhipu’s GLM 5.1 scored 39%.
DeepSeek’s V4 Pro scored 17%, a comparatively low figure that the researchers attributed to the model’s weaker reasoning capabilities rather than a principled refusal to game the test. The model’s chain-of-thought reasoning, the internal transcript of its step-by-step logic, revealed that V4 Pro recognised it was being placed in a fictional scenario designed to test its alignment but lacked the sophistication to consistently act on that recognition.
The phenomenon is not limited to Chinese models. Anthropic’s own Claude 4.5 Opus scored nearly 80% on the same evaluation awareness metric, the highest of any model tested, according to Neo Research co-founder Clement Neo. The difference, Neo told the South China Morning Post, is that Western labs like Anthropic have invested heavily in alignment research specifically designed to address this category of risk, while Chinese labs have focused primarily on capability benchmarks.
Evaluation awareness is distinct from simple misbehaviour. A model that fails a safety test is easy to identify and remediate. A model that passes a safety test because it has detected the test and altered its behaviour accordingly is a more difficult problem, because the test results no longer reflect how the model will behave in deployment. Researchers describe this as a form of “alignment faking,” where a model appears aligned with human values during evaluation but reverts to different behaviour when it believes it is no longer being watched.
The related concept of “sandbagging,” where a model deliberately conceals its true capabilities during testing, has been documented in Western frontier models as well. Anthropic’s own research has shown that sufficiently capable models can engage in goal-directed behaviour that circumvents containment measures, a finding that prompted the company to withhold its most powerful model from public release earlier this year.
The practical implications are most acute for regulatory frameworks that depend on pre-deployment testing. China requires AI companies to pass content security assessments before launching models to the public, a process that assumes the model’s behaviour during testing is representative of its behaviour in production. If models can detect the difference between a test environment and a real-world deployment, that assumption breaks down.
Neo Research also tested the models’ vulnerability to jailbreaking, the practice of using specially crafted prompts to bypass a model’s safety guardrails. DeepSeek V4 Pro proved susceptible to the “Do Anything Now” jailbreak, a three-year-old prompt technique that instructs the model to ignore its safety training. Qwen3.6-Max and Kimi K2.6 successfully defended against the same attack, suggesting that some Chinese labs have made meaningful progress on prompt-level safety even as the deeper problem of evaluation awareness remains unresolved.
The research positions Neo Research, co-founded by Clement Neo and Miro Pluckebaum, as one of the few independent labs systematically testing Chinese AI models against safety benchmarks originally developed for Western systems. Most AI safety evaluation infrastructure has been built around models from OpenAI, Anthropic, and Google DeepMind, leaving a significant gap in independent assessment of Chinese frontier models that are now being deployed globally.
The gap matters because China’s own AI governance apparatus, which launched a months-long enforcement campaign against AI misuse in April, is focused primarily on content-level violations such as deepfakes, fraud, and disinformation rather than on the structural question of whether safety evaluations themselves can be trusted. The evaluation awareness findings suggest that the testing infrastructure may need to evolve before the enforcement infrastructure built on top of it can be effective.
Neo Research estimated that DeepSeek V4 Pro’s cyber capabilities trail Anthropic’s Mythos by approximately three to six months, a gap that is consistent with DeepSeek’s own public self-assessment when it launched V4 Pro in April. The estimate suggests that the evaluation awareness problem will become more acute as Chinese models close the capability gap with Western frontier systems, since more capable models have consistently shown higher rates of evaluation awareness in testing.
The finding is unlikely to be the last of its kind. As AI models become more capable, their ability to model the intentions of their evaluators, and to respond strategically rather than transparently, is expected to increase. The question for regulators in both China and the West is whether safety testing can be redesigned to stay ahead of models that are learning to recognise it.
More than 400 packages in the Arch User Repository (AUR) are distributing a Linux rootkit and infostealer malware targeting credentials and access tokens.
A report from the open-source intelligence community Independent Federated Intelligence Network (IFIN) notes that a new maintainer is spoofing a trusted publisher on the AUR platform to push infected packages.
Arch Linux is popular among power users and developers, many of whom rely on the AUR to get the latest versions of software, drivers, and the kernel.
AUR is a community-maintained repository for the Arch distribution that contains package build scripts (PKGBUILDs) with instructions for downloading, compiling, and installing software not available in Arch’s official repositories.
AUR is considered essential for any Arch-based distribution because it contains proprietary applications, beta/nightly versions of open-source software, niche utilities, and older versions of packages that retain functionality which may have been removed in later releases.
However, it is not a vetted space, and threat actors can use it to push malware through packages that change ownership without anyone noticing.
According to IFIN member Michael Taggart, the compromised packages are modified with preinstall scripts that download and execute a malicious npm package called atomic-lockfile.
Independent security researcher Whanos notes that one sample of atomic-lockfile included a Linux ELF payload named deps, which was a “credential stealer with optional root-only eBPF [extended Berkeley Packet Filter] rootkit capabilities.”
“It is designed for developer workstations and build environments. It targets browser and Electron application data, Slack, Microsoft Teams, Discord, GitHub, npm, Vault, Docker/Podman, SSH, VPN material, shell histories, and other local developer secrets,” Whanos says in the report.
With eBPF technology present, the malware can run inside the kernel with elevated privileges and hide local processes.
Supply-chain management company Sonatype also published a report on a campaign targeting the AUR repository and delivering the malicious atomic-lockfile npm package, but using a different method.
Sonatype researchers say that the threat actor hijacked at least 20 orphaned packages on AUR and pushed atomic-lockfile by modifying the PKGBUILD file – a Bash script with the build information needed by Arch Linux packages.
According to the report, the attacker added a post-install script to invoke npm and retrieve the malicious package.
“The modified packages add a post-install script that invokes npm and installs atomic-lockfile during package installation,” Sonatype says.
Analysis showed that the npm package installed a Linux executable with references to an eBPF rootkit that could hide processes, files, and network interfaces.
Additionally, the Linux binary indicates that it has infostealer functionality, targeting the following types of sensitive information:
GitHub credentials
SSH artifacts
HashiCorp Vault tokens
Browser cookie databases
Slack data
Discord data
Microsoft Teams data
Telegram data
Sonatype determined that the binary can archive data, handle multi-part files, and perform HTTP uploads, so the functionality for a typical exfiltration mechanism is present.
AUR maintainers are working to identify and remove all malicious commits, and to ban the accounts pushing them.
In a message to the community, Arch Linux package maintainer Jonathan Grotelüschen urged users to report any malicious package they find.
As a general rule, it’s recommended to only trust projects with frequent updates and an active community around them.
Arch users are advised to review the list of affected packages and look for the indicators of compromise provided in the report from Whanos.
If compromised packages are found, users should rotate all credentials and consider reinstalling Arch from scratch, since a rootkit may survive normal cleaning efforts.
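As a rough illustration of the kind of check described above, the following Python sketch scans locally cached build files for npm invocations or for the reported package name. It is a heuristic under stated assumptions, not a tool from IFIN, Sonatype, or Whanos: the function names and the regular expression are hypothetical, and only the atomic-lockfile name comes from the reports.

```python
import re
from pathlib import Path

# Package name taken from the IFIN and Sonatype reports.
MALICIOUS_NPM_PACKAGE = "atomic-lockfile"
# Illustrative heuristic (an assumption): flag npm install/exec and npx calls.
NPM_CALL = re.compile(r"\b(?:npm\s+(?:install|i|exec)|npx)\b")


def suspicious_lines(text):
    """Return (line_number, line) pairs that call npm or name the reported package."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # skip shell comments
        if MALICIOUS_NPM_PACKAGE in stripped or NPM_CALL.search(stripped):
            hits.append((number, stripped))
    return hits


def scan_build_dir(directory):
    """Scan PKGBUILD and *.install files under a directory, e.g. an AUR helper cache."""
    results = {}
    for path in sorted(Path(directory).rglob("*")):
        if path.is_file() and (path.name == "PKGBUILD" or path.suffix == ".install"):
            hits = suspicious_lines(path.read_text(errors="replace"))
            if hits:
                results[str(path)] = hits
    return results
```

A hit is only a prompt for manual review: legitimate packages may also call npm during a build, and a clean scan does not rule out compromise, so the published indicators of compromise remain the authoritative reference.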
Welcome back to TechCrunch Mobility, your hub for the future of transportation and now, more than ever, how AI is playing a part.
I won’t spend too much time rehashing the SpaceX IPO — every media outlet, including TechCrunch, has spilled enormous amounts of digital ink on the company’s first day of trading. But there are two important data points to note for anyone who closely watches the “future of transportation” industry.
As of market close Friday, SpaceX has a market cap of $2.1 trillion, rocketing past Musk’s other publicly traded company, Tesla. SpaceX is currently the sixth most valuable U.S.-listed company, behind Nvidia, Apple, Alphabet, Microsoft, and Amazon. Tesla’s market cap was $1.52 trillion as of market close.
These two companies could soon become one. There have been plenty of hints and speculation. Last week, senior reporter Sean O’Kane spotted new language in SpaceX’s S-1 document that warns investors of future dilution. The additional sentence reads, “We may issue a significant amount of equity in connection with future transactions.” This isn’t a forecast of some small-scale deal; it likely means Tesla.
On opening day, SpaceX president and COO Gwynne Shotwell added fuel to the speculative fire. During an interview with CNBC, Shotwell seemed open to the idea and said a merger “might make Elon’s life a little easier.”
And if you do want to read more, we have conveniently packed everything together in a single spot, including stories on who wins (Elon Musk) and who might not (lower-tier SPV investors).
A little bird
Senior reporter Tim De Chant heard from a little bird who is familiar with GM and its inner workings that a “foreign supplier” is providing lithium-iron-phosphate (LFP) cells for the 2027 Chevrolet Bolt — and that the automaker currently has no plans to make LFPs for its EVs.
Previously, a Wall Street Journal report said the arrangement with the foreign supplier — identified as Chinese battery manufacturer CATL — was a temporary stopgap. De Chant heard that GM is starting production of LFP at an Ultium plant in the coming weeks, but those cells are destined for energy-storage systems made by LG Energy Solution. The automaker hasn’t yet decided whether LFP has a future in an EV beyond the Bolt.
Meanwhile, EV maker Lucid Motors is going through a bit of executive-level disruption. Emad Dlala, a top executive at Lucid, has left the company just months after being promoted to a leading role, TechCrunch has learned. Dlala’s exit is the first major executive departure since Lucid Motors named Silvio Napoli as its new CEO in April. And we hear there may be more coming.
We can officially say goodbye to the Apple car. Yeah, I know that special project was shut down in 2024. But now there is further proof that Apple has moved well beyond autonomous cars.
After a tip and some document scouring, we found that Waymo acquired a massive 5,500-acre proving ground in Arizona owned by Route 14 Investment Partners LLC, a Delaware shell company associated with Apple. Waymo acquired the property for $220 million, according to the filing.
The acquisition is the latest evidence that Waymo is trying to scale up its operations.
Other deals that got more attention …
CameraMatics, an Irish company that uses AI-powered video telematics to help make fleets safer, raised €49 million from a consortium led by U.K. investment firm Blume Equity, the Ireland Strategic Investment Fund, and Goodbody Capital Partners.
Clear Robotics, an Indian tech company developing autonomous ships, raised a $1.75 million pre-Series A funding round led by maritime-focused Shipsfocus Ventures. Katapult Ocean, SGInnovate, M7 Holdings MGS Ventures, and other strategic partners also joined the round.
Evotrex, a startup developing hybrid RV travel trailers, raised $30 million in a Series A funding round. Funding came from a consortium of Chinese and Hong Kong-based investment firms, like GSR United Capital, Forebright Concerto Capital, TTGG Ventures, and Pegasus Capital, among others. Anker, the consumer electronics company, is among its seed investors.
Volteum, a startup that developed fleet management software for electric and mixed fleets, raised €2.5 million in a round led by Movens Capital. WakeUp Capital and Aidiom, as well as existing backers DayOne Capital, Techstars, and Nesprit also participated.
Zepto, the Indian quick-commerce delivery startup, unveiled plans for an initial public offering that could be valued at about $1 billion.
Zūm, a startup that provides transportation services (typically in electric buses) for school-age children, is interviewing banks about a possible IPO, The Information reported.
Notable reads and other tidbits
Decart, an AI startup, unveiled an interactive world model called Oasis 3 that can generate photorealistic driving environments in real time. The startup is initially targeting autonomous vehicle companies that need to simulate rare driving scenarios at scale and plans to expand into robotics and other physical AI applications, senior reporter Rebecca Bellan reported.
General Motors is pushing deep into batteries — and not for EVs. We covered some of GM’s battery plans last week, but there is more to share. GM announced plans to sell a commercial energy-storage system for AI data centers and the grid. It is partnering with energy-storage startup Peak Energy and will be developing an entirely new sodium-ion battery chemistry tailored for grid-scale deployments. With GM and Ford chasing energy storage — plus a number of startups like Redwood Energy piling in — it seems like everyone wants a piece of Tesla’s battery business.
Uber, U.K. startup Wayve, and Waymo are headed toward a robotaxi showdown in London. Here’s why.
Waymo launched a loyalty program called Waymo Premier, which will offer frequent robotaxi riders a number of perks in exchange for $29.99 per month. The company also released details on a new computer model it created that is designed to more accurately answer a fundamental question: How does its autonomous driving software stack up against humans?
Wing, the Alphabet-owned autonomous drones company, is pushing into seven more U.S. cities through its partnership with Walmart. Wing isn’t the only company using drones to autonomously deliver groceries, and while it’s certainly not mainstream yet, it isn’t a novelty anymore in certain markets.
One more thing …
Since the SpaceX IPO has just wrapped, I thought I would share some initial reactions from our TechCrunch staff. Senior reporter Sean O’Kane and AI editor Russell Brandom recorded a special episode of the Equity podcast Friday to give their first impressions. I suggest a listen!
AI watermarking — “AI-generated or edited images can now carry a visible Copilot watermark. You choose Never, Always, or Ask Every Time in Settings, with a confirmation when saving. The watermarking is off by default in settings.”
More accurate square-root results. “Fixed rare cases where a calculation that should equal zero (like sqrt(2.25) - 1.5) returned a tiny leftover value instead….”
Reliable launch after upgrading. “Fixed an issue where upgrading from much older versions could leave outdated settings that stopped the app from opening…”
“Timers keep counting after they hit zero — When a timer runs out, it now keeps counting up (for example, -00:27:31) so you can see how far past the time you’ve gone…”
“Correct sun and moon icons during midnight sun — Fixed an icon that wrongly showed a moon during all-day daylight in polar regions…”
“No more double announcements — Screen readers no longer read the timer value twice.”