
Nvidia’s DGX Station is a desktop supercomputer that runs trillion-parameter AI models without the cloud


Nvidia on Monday unveiled a deskside supercomputer powerful enough to run AI models with up to one trillion parameters — roughly the scale of GPT-4 — without touching the cloud. The machine, called the DGX Station, packs 748 gigabytes of coherent memory and 20 petaflops of compute into a box that sits next to a monitor, and it may be the most significant personal computing product since the original Mac Pro convinced creative professionals to abandon workstations.

The announcement, made at the company’s annual GTC conference in San Jose, lands at a moment when the AI industry is grappling with a fundamental tension: the most powerful models in the world require enormous data center infrastructure, but the developers and enterprises building on those models increasingly want to keep their data, their agents, and their intellectual property local. The DGX Station is Nvidia’s answer — a six-figure machine that collapses the distance between AI’s frontier and a single engineer’s desk.

What 20 petaflops on your desktop actually means

The DGX Station is built around the new GB300 Grace Blackwell Ultra Desktop Superchip, which fuses a 72-core Grace CPU and a Blackwell Ultra GPU through Nvidia’s NVLink-C2C interconnect. That link provides 1.8 terabytes per second of coherent bandwidth between the two processors — seven times the speed of PCIe Gen 6 — which means the CPU and GPU share a single, seamless pool of memory without the bottlenecks that typically cripple desktop AI work.
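To make the bandwidth gap concrete, here is a quick back-of-envelope sketch using only the figures quoted above (the PCIe number is derived from the article's own 7x claim, not a measured spec):

```python
# Rough time to sweep the full 748 GB coherent pool over each link,
# using the figures quoted in the article.
MEMORY_GB = 748
NVLINK_GBPS = 1800            # 1.8 TB/s NVLink-C2C coherent bandwidth
PCIE_GBPS = NVLINK_GBPS / 7   # article: NVLink-C2C is 7x PCIe Gen 6

nvlink_ms = MEMORY_GB / NVLINK_GBPS * 1000
pcie_ms = MEMORY_GB / PCIE_GBPS * 1000
print(f"NVLink-C2C: {nvlink_ms:.0f} ms, PCIe-class link: {pcie_ms:.0f} ms")
```

At these rates, touching every byte of the pool takes well under half a second over NVLink-C2C versus nearly three seconds over the slower link, a difference that compounds on every inference or training step.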

Twenty petaflops — 20 quadrillion operations per second — would have ranked this machine among the world’s top supercomputers less than a decade ago. The Summit system at Oak Ridge National Laboratory, which held the global No. 1 spot in 2018, delivered roughly ten times that performance but occupied a room the size of two basketball courts. Nvidia is packaging a meaningful fraction of that capability into something that plugs into a wall outlet.
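For a sense of scale, a deliberately idealized calculation of what 20 petaflops could mean for a trillion-parameter model, using the common rule of thumb that a transformer forward pass costs about 2 FLOPs per parameter per token (and assuming full peak utilization, which real workloads never achieve):

```python
# Idealized throughput ceiling: tokens per second at full peak compute.
# Rule of thumb: ~2 * parameters FLOPs per generated token.
PEAK_FLOPS = 20e15   # 20 petaflops
PARAMS = 1e12        # one trillion parameters
tokens_per_sec = PEAK_FLOPS / (2 * PARAMS)
print(f"theoretical ceiling: {tokens_per_sec:,.0f} tokens/s")
```

Real throughput lands far below that ceiling once memory bandwidth and batching enter the picture, but the headroom is the point.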


The 748 GB of unified memory is arguably the more important number. Trillion-parameter models are enormous neural networks that must be loaded entirely into memory to run. Without sufficient memory, no amount of processing speed matters — the model simply won’t fit. The DGX Station clears that bar, and it does so with a coherent architecture that eliminates the latency penalties of shuttling data between CPU and GPU memory pools.
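Simple arithmetic shows why the memory number gates everything. The precisions below are illustrative assumptions about quantization, not disclosed specs of any particular model:

```python
# Weight memory for a 1-trillion-parameter model at common precisions
# (weights only; KV cache and activations need additional headroom).
PARAMS = 1e12
POOL_GB = 748
for name, bytes_per_param in [("FP16", 2), ("FP8", 1), ("FP4", 0.5)]:
    gb = PARAMS * bytes_per_param / 1e9
    verdict = "fits" if gb <= POOL_GB else "does not fit"
    print(f"{name}: {gb:,.0f} GB -> {verdict} in {POOL_GB} GB")
```

Only an aggressively quantized trillion-parameter model squeezes into the pool, which is exactly the workload Nvidia is advertising.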

Always-on agents need always-on hardware

Nvidia designed the DGX Station explicitly for what it sees as the next phase of AI: autonomous agents that reason, plan, write code, and execute tasks continuously — not just systems that respond to prompts. Every major announcement at GTC 2026 reinforced this “agentic AI” thesis, and the DGX Station is where those agents are meant to be built and run.

The key pairing is NemoClaw, a new open-source stack that Nvidia also announced Monday. NemoClaw bundles Nvidia’s Nemotron open models with OpenShell, a secure runtime that enforces policy-based security, network, and privacy guardrails for autonomous agents. A single command installs the entire stack. Jensen Huang, Nvidia’s founder and CEO, framed the combination in unmistakable terms, calling OpenClaw — the broader agent platform NemoClaw supports — “the operating system for personal AI” and comparing it directly to Mac and Windows.

The argument is straightforward: cloud instances spin up and down on demand, but always-on agents need persistent compute, persistent memory, and persistent state. A machine under your desk, running 24/7 with local data and local models inside a security sandbox, is architecturally better suited to that workload than a rented GPU in someone else’s data center. The DGX Station can operate as a personal supercomputer for a solo developer or as a shared compute node for teams, and it supports air-gapped configurations for classified or regulated environments where data can never leave the building.


From desk prototype to data center production in zero rewrites

One of the cleverest aspects of the DGX Station’s design is what Nvidia calls architectural continuity. Applications built on the machine migrate seamlessly to the company’s GB300 NVL72 data center systems — 72-GPU racks designed for hyperscale AI factories — without rearchitecting a single line of code. Nvidia is selling a vertically integrated pipeline: prototype at your desk, then scale to the cloud when you’re ready.

This matters because the biggest hidden cost in AI development today isn’t compute — it’s the engineering time lost to rewriting code for different hardware configurations. A model fine-tuned on a local GPU cluster often requires substantial rework to deploy on cloud infrastructure with different memory architectures, networking stacks, and software dependencies. The DGX Station eliminates that friction by running the same NVIDIA AI software stack that powers every tier of Nvidia’s infrastructure, from the DGX Spark to the Vera Rubin NVL72.

Nvidia also expanded the DGX Spark, the Station’s smaller sibling, with new clustering support. Up to four Spark units can now operate as a unified system with near-linear performance scaling — a “desktop data center” that fits on a conference table without rack infrastructure or an IT ticket. For teams that need to fine-tune mid-size models or develop smaller-scale agents, clustered Sparks offer a credible departmental AI platform at a fraction of the Station’s cost.
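Nvidia has not published scaling figures, so purely as a sketch: "near-linear" scaling is often modeled with a per-unit efficiency factor. The 0.95 value below is a hypothetical assumption for illustration, not an Nvidia benchmark:

```python
# Hypothetical near-linear scaling of clustered units: each added unit
# contributes slightly less than its full standalone throughput.
def effective_speedup(n_units: int, efficiency: float = 0.95) -> float:
    return n_units * efficiency ** (n_units - 1)

for n in range(1, 5):
    print(f"{n} unit(s): {effective_speedup(n):.2f}x")
```

Under that assumption, four clustered units deliver roughly 3.4x the throughput of one, close enough to linear that adding boxes stays worthwhile.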

The early buyers reveal where the market is heading

The initial customer roster for DGX Station maps the industries where AI is transitioning fastest from experiment to daily operating tool. Snowflake is using the system to locally test its open-source Arctic training framework. EPRI, the Electric Power Research Institute, is advancing AI-powered weather forecasting to strengthen electrical grid reliability. Medivis is integrating vision language models into surgical workflows. Microsoft Research and Cornell have deployed the systems for hands-on AI training at scale.


Systems are available to order now and will ship in the coming months from ASUS, Dell Technologies, GIGABYTE, MSI, and Supermicro, with HP joining later in the year. Nvidia hasn’t disclosed pricing, but the GB300 components and the company’s historical DGX pricing suggest a six-figure investment — expensive by workstation standards, but remarkably cheap compared to the cloud GPU costs of running trillion-parameter inference at scale.

The list of supported models underscores how open the AI ecosystem has become: developers can run and fine-tune OpenAI’s gpt-oss-120b, Google Gemma 3, Qwen3, Mistral Large 3, DeepSeek V3.2, and Nvidia’s own Nemotron models, among others. The DGX Station is model-agnostic by design — a hardware Switzerland in an industry where model allegiances shift quarterly.

Nvidia’s real strategy: own every layer of the AI stack, from orbit to office

The DGX Station didn’t arrive in a vacuum. It was one piece of a sweeping set of GTC 2026 announcements that collectively map Nvidia’s ambition to supply AI compute at literally every physical scale.

At the top, Nvidia unveiled the Vera Rubin platform — seven new chips in full production — anchored by the Vera Rubin NVL72 rack, which integrates 72 next-generation Rubin GPUs and claims up to 10x higher inference throughput per watt compared to the current Blackwell generation. The Vera CPU, with 88 custom Olympus cores, targets the orchestration layer that agentic workloads increasingly demand. At the far frontier, Nvidia announced the Vera Rubin Space Module for orbital data centers, delivering 25x more AI compute for space-based inference than the H100.


Between orbit and office, Nvidia revealed partnerships spanning Adobe for creative AI, automakers like BYD and Nissan for Level 4 autonomous vehicles, a coalition with Mistral AI and seven other labs to build open frontier models, and Dynamo 1.0, an open-source inference operating system already adopted by AWS, Azure, Google Cloud, and a roster of AI-native companies including Cursor and Perplexity.

The pattern is unmistakable: Nvidia wants to be the computing platform — hardware, software, and models — for every AI workload, everywhere. The DGX Station is the piece that fills the gap between the cloud and the individual.

The cloud isn’t dead, but its monopoly on serious AI work is ending

For the past several years, the default assumption in AI has been that serious work requires cloud GPU instances — renting Nvidia hardware from AWS, Azure, or Google Cloud. That model works, but it carries real costs: data egress fees, latency, security exposure from sending proprietary data to third-party infrastructure, and the fundamental loss of control inherent in renting someone else’s computer.

The DGX Station doesn’t kill the cloud — Nvidia’s data center business dwarfs its desktop revenue and is accelerating. But it creates a credible local alternative for an important and growing category of workloads. Training a frontier model from scratch still demands thousands of GPUs in a warehouse. Fine-tuning a trillion-parameter open model on proprietary data? Running inference for an internal agent that processes sensitive documents? Prototyping before committing to cloud spend? A machine under your desk starts to look like the rational choice.


This is the strategic elegance of the product: it expands Nvidia’s addressable market into personal AI infrastructure while reinforcing the cloud business, because everything built locally is designed to scale up to Nvidia’s data center platforms. It’s not cloud versus desk. It’s cloud and desk, and Nvidia supplies both.

A supercomputer on every desk — and an agent that never sleeps on top of it

The PC revolution’s defining slogan was “a computer on every desk and in every home.” Four decades later, Nvidia is updating the premise with an uncomfortable escalation. The DGX Station puts genuine supercomputing power — the kind that ran national laboratories — beside a keyboard, and NemoClaw puts an autonomous AI agent on top of it that runs around the clock, writing code, calling tools, and completing tasks while its owner sleeps.

Whether that future is exhilarating or unsettling depends on your vantage point. But one thing is no longer debatable: the infrastructure required to build, run, and own frontier AI just moved from the server room to the desk drawer. And the company that sells nearly every serious AI chip on the planet just made sure it sells the desk drawer, too.

Meta weighing up 20pc global layoffs


‘This is a speculative report about theoretical approaches,’ Meta responds.

Meta is planning a fresh round of layoffs that could affect 20pc or more of the company’s global workforce, Reuters has reported.

The layoffs, which could affect around 15,800 jobs, are meant to offset Meta’s massive AI spend and prepare the company for AI-assisted work instead, sources told the publication. Meta employs nearly 79,000 globally, with around 1,700 in Ireland.

Meta did not verify the contents of the report. A company spokesperson told SiliconRepublic.com: “This is a speculative report about theoretical approaches.”


The Facebook parent, much like many in the Big Tech league, has cut thousands of jobs in recent years in favour of spending billions for its AI build-out.

Meta laid off 5pc of global staff, targeting its “lowest performers”, in early 2025, amounting to around 3,600 people at the time. In October, it cut 600 jobs from its AI division, Superintelligence Labs. In 2022, it cut 11,000 jobs globally, with Irish workers affected, and in 2023, it laid off 10,000 worldwide.

Headcount in Ireland was cut by 20pc in 2024, which followed an 18pc decline during 2023. In early 2025, Meta employed around 2,000 in the country. That number is now down by around 300.

Reports from January 2026 suggested that Meta could cut 10pc of its Reality Labs division, which employs roughly 15,000. In December, it was speculated that the company would be reducing the budget and cutting staff for its ‘metaverse’, to include the ‘Horizon Worlds’ project and its Quest virtual reality unit.


Meta expects its total expenses for the year to be as high as $135bn, driven by an increased investment to support its Superintelligence Labs efforts as well as its core business. The company has been building its own in-house hardware, and poaching key talent from its rivals to boost its AI efforts.

In February, Meta announced a multi-year chip deal with Nvidia that would reportedly cost the company billions of dollars. It struck a $14.2bn deal with CoreWeave for its cloud compute power four months earlier. Meanwhile, a $10bn deal with Google for its cloud services is also speculated.

Recently, Meta spent as much as $3bn to acquire the Chinese-founded AI start-up Manus. Earlier this month, it acquired the viral social platform for AI bots, Moltbook, for an undisclosed amount.

Meta’s stock has fallen by as much as 23pc from its August peak, sliding further since last Friday (13 March). Investors are also concerned after the company delayed its much-anticipated AI model Avocado.


SpaceX’s Starship rocket test scores several firsts ahead of flight 12


SpaceX chief Elon Musk said in February that the mighty Starship rocket would embark on its 12th test flight this month, although several more recent reports have suggested that it might not leave the launchpad until early April.

Preflight tests on the Starship rocket have been underway at SpaceX’s Starbase facility in southern Texas as the team works to ready the rocket for showtime.

In an important step toward launch day, the first-stage Super Heavy — the most powerful booster ever to fly — underwent a ground-based engine test, known as a static fire, on Monday.

NASASpaceflight, which has a number of cameras trained on the Starbase site, shared some footage of the test.


Monday’s test scored three firsts for SpaceX. It was the first time the new Pad 2 has been put to use in this way, the first such test for the new Starship Version 3, and the first time for the rocket’s new Raptor 3 engines to be fired up for this kind of procedure.

But the static fire ended after only a few seconds — much shorter than usual — suggesting there may have been an issue as the engines roared into life. SpaceX has yet to make any official comment on the outcome of the important preflight test.

The Starship flew most recently in October 2025. That means it’s been a long wait for the 12th flight of the massive rocket, especially considering that the more recent Starship launches have taken place within two or three months of each other.

The delay has been put down to the extra preparations needed for the new version of Starship.


Once fully ready, SpaceX’s newest and largest rocket will be used for crew and cargo flights to the moon as part of NASA’s Artemis program, and could even take the first humans to Mars.

First up, a modified version of the upper-stage Ship spacecraft will be used to return humans to the lunar surface as part of the Artemis IV mission, which is currently set for 2028.


What is the release date for The Pitt season 2 episode 11 on HBO Max?


It feels like a long old while since Baby Jane Doe (who’s looking good and taking formula well) was the talking point of The Pitt season 2. The shift has gotten more difficult over the last few hours, but the worst is yet to come.

The ER is currently under digital lockdown to prevent a cyber attack, meaning no computer records can be accessed, the number of patients practically doubles every five seconds, and replacement Dr. Al-Hashimi (Sepideh Moafi) isn’t making life easier for anyone.


Nvidia DLSS 5 first look: generative AI lighting radically transforms game visuals



Instead of upscaling games from lower resolutions or interpolating frames using AI, DLSS 5 applies machine learning to a game’s lighting model. Nvidia calls it the next stage of rendering after upscaling and ray tracing. Digital Foundry got an early hands-on look at the technology (video below), which sparked controversy…

The Galaxy Watch Ultra 2 set for an Apple Watch-rivalling upgrade


Samsung’s next rugged smartwatch could deliver a major connectivity upgrade, with new reports suggesting the upcoming Galaxy Watch Ultra 2 may become the first Galaxy smartwatch capable of connecting directly to 5G networks.

If the rumours prove accurate, the update would represent a notable shift for Samsung’s wearable lineup, which has historically relied on LTE connectivity rather than full 5G network support.

The original Galaxy Watch Ultra debuted in 2024 as Samsung’s most durable smartwatch, designed for outdoor enthusiasts and extreme activities where reliable connectivity and rugged hardware are especially important.

Although a refreshed model appeared in 2025, many observers believe the lineup is now due for a more substantial hardware upgrade rather than another incremental revision.


New chip could unlock faster connectivity

According to recent reports, the Galaxy Watch Ultra 2 may introduce a significant internal change by moving away from Samsung’s in-house Exynos smartwatch processors.


Instead, the device is expected to use Qualcomm’s Snapdragon Wear Elite platform, which includes an integrated 5G modem capable of connecting to faster mobile networks.

That upgrade could allow the smartwatch to deliver quicker data speeds, improved streaming performance and more reliable connectivity when used independently from a paired smartphone.


The Snapdragon Wear Elite chip is built on a 3nm manufacturing process and includes a five-core CPU, faster graphics capabilities and an integrated neural processing unit designed to accelerate AI-driven tasks.

The processor also supports satellite connectivity, which could allow the watch to send emergency signals or location data in remote areas without cellular coverage, though it remains unclear whether Samsung will enable the feature on the Galaxy Watch Ultra 2.

If Samsung follows the same strategy used with the previous Ultra model, the Galaxy Watch Ultra 2 may only be sold in a cellular configuration rather than offering a Bluetooth-only version.


That approach would align with the device’s positioning as a smartwatch designed for outdoor adventures, where standalone connectivity can be particularly useful.

Samsung has not officially confirmed the Galaxy Watch Ultra 2 yet, but rumours suggest the next generation of Galaxy Watch devices could arrive alongside the company’s upcoming foldable phones later this year.


NVIDIA’s DLSS 5 Blurs the Line Between Game Worlds and Reality


NVIDIA unveiled its next-generation graphics technology today at its annual GTC in San Jose, California, and DLSS 5 appears to be the company’s most ambitious jump yet. It will be available to players in the fall, with owners of the latest graphics hardware among the first to try it.



The way it works is more involved than you might expect. Every frame the game renders is fed into an AI model along with two crucial pieces of information: the raw colors you see on screen and motion data for everything in the scene. The model uses that data to add lighting and material detail consistent with the original image. Light passing through skin scatters softly, garments reflect light as they would in real life, and hair responds to light sources whether the character is standing in full sun or buried in a dark corner. None of it feels like an afterthought, and, more importantly, it stays consistent with the game’s geometry, so motion remains fluid rather than jolting or glitching as it would if everything were simulated on the fly.



Where previous versions of DLSS filled in gaps to make games run faster and look sharper, DLSS 5 aims to create richer, more realistic scenes from the ground up. Developers get granular controls to dial in how strong the effect is, change its tone, or lock down specific regions so those areas keep exactly the colors they choose while the rest of the game still appears the way they intend.


The supported titles list is already looking strong, with Starfield, Hogwarts Legacy, Assassin’s Creed Shadows, and the freshly revived Elder Scrolls IV: Oblivion among the early standouts, each looking noticeably richer and more convincing with DLSS 5 active. Resident Evil Requiem, Delta Force, and a slew of more titles are already preparing to join in the fall, with even more developers completing their integrations ahead of the big launch.

The demos at the event were run on two top-of-the-line cards working together to handle the large memory demands, but by the time DLSS 5 reaches players, it will run on a single card without issue. Setup should also be straightforward: the integration is built into NVIDIA’s existing developer framework, which many studios already know and like.

Visually, the effects are subtle at first, but once you know what to look for, they’re impossible to miss: scenes that were previously flat take on new depth, and details emerge at the edges of the screen that used to blend into the background. And it all runs smoothly enough to keep the game at maximum resolution, so you won’t have to worry about your frame rate dropping.

Nvidia introduces Vera Rubin, a seven-chip AI platform with OpenAI, Anthropic and Meta on board


Nvidia on Monday took the wraps off Vera Rubin, a sweeping new computing platform built from seven chips now in full production — and backed by an extraordinary lineup of customers that includes Anthropic, OpenAI, Meta and Mistral AI, along with every major cloud provider.

The message to the AI industry, and to investors, was unmistakable: Nvidia is not slowing down. The Vera Rubin platform claims up to 10x more inference throughput per watt and one-tenth the cost per token compared with the Blackwell systems that only recently began shipping. CEO Jensen Huang, speaking at the company’s annual GTC conference, called it “a generational leap” that would kick off “the greatest infrastructure buildout in history.” Amazon Web Services, Google Cloud, Microsoft Azure and Oracle Cloud Infrastructure will all offer the platform, and more than 80 manufacturing partners are building systems around it.

“Vera Rubin is a generational leap — seven breakthrough chips, five racks, one giant supercomputer — built to power every phase of AI,” Huang declared. “The agentic AI inflection point has arrived with Vera Rubin kicking off the greatest infrastructure buildout in history.”

In any other industry, such rhetoric might be dismissed as keynote theater. But Nvidia occupies a singular position in the global economy — a company whose products have become so essential to the AI boom that its market capitalization now rivals the GDP of mid-sized nations. When Huang says the infrastructure buildout is historic, the CEOs of the companies actually writing the checks are standing behind him, nodding.


Dario Amodei, the chief executive of Anthropic, said Nvidia’s platform “gives us the compute, networking and system design to keep delivering while advancing the safety and reliability our customers depend on.” Sam Altman, the chief executive of OpenAI, said that “with Nvidia Vera Rubin, we’ll run more powerful models and agents at massive scale and deliver faster, more reliable systems to hundreds of millions of people.”

Inside the seven-chip architecture designed to power the age of AI agents

The Vera Rubin platform brings together the Nvidia Vera CPU, Rubin GPU, NVLink 6 Switch, ConnectX-9 SuperNIC, BlueField-4 DPU, Spectrum-6 Ethernet switch and the newly integrated Groq 3 LPU — a purpose-built inference accelerator. Nvidia organized these into five interlocking rack-scale systems that function as a unified supercomputer.

The flagship NVL72 rack integrates 72 Rubin GPUs and 36 Vera CPUs connected by NVLink 6. Nvidia says it can train large mixture-of-experts models using one-quarter the GPUs required on Blackwell, a claim that, if validated in production, would fundamentally alter the economics of building frontier AI systems.

The Vera CPU rack packs 256 liquid-cooled processors into a single rack, sustaining more than 22,500 concurrent CPU environments — the sandboxes where AI agents execute code, validate results and iterate. Nvidia describes the Vera CPU as the first processor purpose-built for agentic AI and reinforcement learning, featuring 88 custom-designed Olympus cores and LPDDR5X memory delivering 1.2 terabytes per second of bandwidth at half the power of conventional server CPUs.
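The rack figures are internally consistent, which is worth a quick sanity check: 22,500 environments across 256 CPUs works out to roughly one sandbox per Olympus core:

```python
# Sanity check on the quoted Vera CPU rack figures.
ENVIRONMENTS = 22_500
CPUS_PER_RACK = 256
CORES_PER_CPU = 88
per_cpu = ENVIRONMENTS / CPUS_PER_RACK
print(f"{per_cpu:.1f} environments per CPU, vs {CORES_PER_CPU} cores each")
```

In other words, the rack is sized so that each agent sandbox can map onto a dedicated core.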


The Groq 3 LPX rack, housing 256 inference processors with 128 gigabytes of on-chip SRAM, targets the low-latency demands of trillion-parameter models with million-token contexts. The BlueField-4 STX storage rack provides what Nvidia calls “context memory” — high-speed storage for the massive key-value caches that agentic systems generate as they reason across long, multi-step tasks. And the Spectrum-6 SPX Ethernet rack ties it all together with co-packaged optics delivering 5x greater optical power efficiency than traditional transceivers.
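To see why key-value caches justify a dedicated storage tier, here is a rough sizing formula for a long-context transformer. The layer, head, and dimension values are hypothetical, chosen only to illustrate the order of magnitude, and are not specs of any model named above:

```python
# Rough KV-cache size for a decoder-only transformer: keys and values
# are stored per layer, per KV head, per token.
def kv_cache_gb(tokens: int, layers: int = 96, kv_heads: int = 8,
                head_dim: int = 128, bytes_per_value: int = 2) -> float:
    # Factor of 2 covers the separate key and value tensors.
    return 2 * tokens * layers * kv_heads * head_dim * bytes_per_value / 1e9

print(f"1M-token context: {kv_cache_gb(1_000_000):.0f} GB of KV cache")
```

At a million tokens, a single session’s cache runs to hundreds of gigabytes, and an agent juggling many long-running tasks multiplies that several times over.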

Why Nvidia is betting the future on autonomous AI agents — and rebuilding its stack around them

The strategic logic binding every announcement Monday into a single narrative is Nvidia’s conviction that the AI industry is crossing a threshold. The era of chatbots — AI that responds to a prompt and stops — is giving way to what Huang calls “agentic AI”: systems that reason autonomously for hours or days, write and execute software, call external tools, and continuously improve.

This isn’t just a branding exercise. It represents a genuine architectural shift in how computing infrastructure must be designed. A chatbot query might consume milliseconds of GPU time. An agentic system orchestrating a drug discovery pipeline or debugging a complex codebase might run continuously, consuming CPU cycles to execute code, GPU cycles to reason, and massive storage to maintain context across thousands of intermediate steps. That demands not just faster chips, but a fundamentally different balance of compute, memory, storage and networking.


Nvidia addressed this with the launch of its Agent Toolkit, which includes OpenShell, a new open-source runtime that enforces security and privacy guardrails for autonomous agents. The enterprise adoption list is remarkable: Adobe, Atlassian, Box, Cadence, Cisco, CrowdStrike, Dassault Systèmes, IQVIA, Red Hat, Salesforce, SAP, ServiceNow, Siemens and Synopsys are all integrating the toolkit into their platforms. Nvidia also launched NemoClaw, an open-source stack that lets users install its Nemotron models and OpenShell runtime in a single command to run secure, always-on AI assistants on everything from RTX laptops to DGX Station supercomputers.

The company separately announced Dynamo 1.0, open-source software it describes as the first “operating system” for AI inference at factory scale. Dynamo orchestrates GPU and memory resources across clusters and has already been adopted by AWS, Azure, Google Cloud, Oracle, Cursor, Perplexity, PayPal and Pinterest. Nvidia says it boosted Blackwell inference performance by up to 7x in recent benchmarks.

The Nemotron coalition and Nvidia’s play to shape the open-source AI landscape

If Vera Rubin represents Nvidia’s hardware ambition, the Nemotron Coalition represents its software ambition. Announced Monday, the coalition is a global collaboration of AI labs that will jointly develop open frontier models trained on Nvidia’s DGX Cloud. The inaugural members — Black Forest Labs, Cursor, LangChain, Mistral AI, Perplexity, Reflection AI, Sarvam and Thinking Machines Lab, the startup led by former OpenAI executive Mira Murati — will contribute data, evaluation frameworks and domain expertise.


The first model will be co-developed by Mistral AI and Nvidia and will underpin the upcoming Nemotron 4 family. “Open models are the lifeblood of innovation and the engine of global participation in the AI revolution,” Huang said.

Nvidia also expanded its own open model portfolio significantly. Nemotron 3 Ultra delivers what the company calls frontier-level intelligence with 5x throughput efficiency on Blackwell. Nemotron 3 Omni integrates audio, vision and language understanding. Nemotron 3 VoiceChat supports real-time, simultaneous conversations. And the company previewed GR00T N2, a next-generation robot foundation model that it says helps robots succeed at new tasks in new environments more than twice as often as leading alternatives, currently ranking first on the MolmoSpaces and RoboArena benchmarks.

The open-model push serves a dual purpose. It cultivates the developer ecosystem that drives demand for Nvidia hardware, and it positions Nvidia as a neutral platform provider rather than a competitor to the AI labs building on its chips — a delicate balancing act that grows more complex as Nvidia’s own models grow more capable.

From operating rooms to orbit: how Vera Rubin’s reach extends far beyond the data center


The vertical breadth of Monday’s announcements was almost disorienting. Roche revealed it is deploying more than 3,500 Blackwell GPUs across hybrid cloud and on-premises environments in the U.S. and Europe — the largest announced GPU footprint in the pharmaceutical industry. The company is using the infrastructure for biological foundation models, drug discovery and digital twins of manufacturing facilities, including its new GLP-1 facility in North Carolina. Nearly 90 percent of Genentech’s eligible small-molecule programs now integrate AI, Roche said, with one oncology molecule designed 25 percent faster and a backup candidate delivered in seven months instead of more than two years.

In autonomous vehicles, BYD, Geely, Isuzu and Nissan are building Level 4-ready vehicles on Nvidia’s Drive Hyperion platform. Nvidia and Uber expanded their partnership to launch autonomous vehicles across 28 cities on four continents by 2028, starting with Los Angeles and San Francisco in the first half of 2027. The company introduced Alpamayo 1.5, a reasoning model for autonomous driving already downloaded by more than 100,000 automotive developers, and Nvidia Halos OS, a safety architecture built on ASIL D-certified foundations for production-grade autonomy.

Nvidia also released the first domain-specific physical AI platform for healthcare robotics, anchored by Open-H — the world’s largest healthcare robotics dataset, with over 700 hours of surgical video. CMR Surgical, Johnson & Johnson MedTech and Medtronic are among the adopters.

And then there was space. The Vera Rubin Space Module delivers up to 25x more AI compute for orbital inferencing compared with the H100 GPU. Aetherflux, Axiom Space, Kepler Communications, Planet Labs and Starcloud are building on it. “Space computing, the final frontier, has arrived,” Huang said, deploying the kind of line that, from another executive, might draw eye-rolls — but from the CEO of a company whose chips already power the majority of the world’s AI workloads, lands differently.

The deskside supercomputer and Nvidia’s quiet push into enterprise hardware

Amid the spectacle of trillion-parameter models and orbital data centers, Nvidia made a quieter but potentially consequential move: it launched the DGX Station, a deskside system powered by the GB300 Grace Blackwell Ultra Desktop Superchip that delivers 748 gigabytes of coherent memory and up to 20 petaflops of AI compute performance. The system can run open models of up to one trillion parameters from a desk.
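The claim that a trillion-parameter model fits in 748 gigabytes is easy to sanity-check with back-of-envelope arithmetic. The sketch below is illustrative, not an Nvidia specification: it counts weight memory only at several quantization levels, and the one-trillion figure works out only at low precision, with the remaining memory left for KV cache and activations.

```python
def model_footprint_gb(n_params: float, bits_per_param: int) -> float:
    """Approximate memory for model weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9

MEMORY_GB = 748  # DGX Station's stated coherent memory pool

# A 1-trillion-parameter model at common precisions:
for bits in (16, 8, 4):
    needed = model_footprint_gb(1e12, bits)
    verdict = "fits" if needed <= MEMORY_GB else "does not fit"
    print(f"{bits}-bit weights: {needed:,.0f} GB -> {verdict}")
```

At 16-bit precision the weights alone need about 2 terabytes; only at 4-bit quantization (roughly 500 GB) does a trillion-parameter model leave room in 748 GB for the KV cache and working memory that inference also requires.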

Snowflake, Microsoft Research, Cornell, EPRI and Sungkyunkwan University are among the early users. DGX Station supports air-gapped configurations for regulated industries, and applications built on it move seamlessly to Nvidia’s data center systems without rearchitecting — a design choice that creates a natural on-ramp from local experimentation to large-scale deployment.

Nvidia also updated DGX Spark, its more compact system, with support for clustering up to four units into a “desktop data center” with linear performance scaling. Both systems ship preconfigured with NemoClaw and the Nvidia AI software stack, and support models including Nemotron 3, Google Gemma 3, Qwen3, DeepSeek V3.2, Mistral Large 3 and others.

Adobe and Nvidia separately announced a strategic partnership to develop the next generation of Firefly models using Nvidia’s computing technology and libraries. Adobe will also build a cloud-native 3D digital twin solution for marketing on Nvidia Omniverse and integrate Nemotron capabilities into Adobe Acrobat. The partnership spans creative tools including Photoshop, Premiere Pro, Frame.io and Adobe Experience Platform.

Building the factories that build intelligence: Nvidia’s AI infrastructure blueprint

Perhaps the most telling indicator of where Nvidia sees the industry heading is the Vera Rubin DSX AI Factory reference design — essentially a blueprint for constructing entire buildings optimized to produce AI. The reference design outlines how to integrate compute, networking, storage, power and cooling into a system that maximizes what Nvidia calls “tokens per watt,” along with an Omniverse DSX Blueprint for creating digital twins of these facilities before they are built.

The software stack includes DSX Max-Q for dynamic power provisioning — which Nvidia says enables 30 percent more AI infrastructure within a fixed-power data center — and DSX Flex, which connects AI factories to power-grid services to unlock what the company estimates is 100 gigawatts of stranded grid capacity. Energy leaders Emerald AI, GE Vernova, Hitachi and Siemens Energy are using the architecture. Nscale and Caterpillar are building one of the world’s largest AI factories in West Virginia using the Vera Rubin reference design.
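The logic behind a claim like "30 percent more AI infrastructure within a fixed-power data center" is worth unpacking. A toy model, using made-up numbers rather than DSX figures: static provisioning must reserve worst-case peak power for every rack, while dynamic provisioning caps aggregate draw and throttles rare simultaneous peaks, so racks can be budgeted closer to their typical draw.

```python
# Illustrative only: all figures below are assumptions, not Nvidia's.
SITE_BUDGET_MW = 100.0
RACK_PEAK_MW = 0.10      # assumed worst-case draw per rack
RACK_TYPICAL_MW = 0.077  # assumed average draw under real workloads

# Static provisioning reserves peak power for every rack:
static_racks = int(SITE_BUDGET_MW / RACK_PEAK_MW)

# Dynamic provisioning budgets near typical draw, enforcing the site
# cap by throttling the rare moments when every rack peaks at once:
dynamic_racks = int(SITE_BUDGET_MW / RACK_TYPICAL_MW)

gain = dynamic_racks / static_racks - 1
print(f"static: {static_racks} racks, dynamic: {dynamic_racks} racks (+{gain:.0%})")
```

With these assumed numbers the same 100 MW budget supports roughly 30 percent more racks; the real gain depends entirely on how far typical draw sits below peak for a given workload mix.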

Industry partners Cadence, Dassault Systèmes, Eaton, Jacobs, Schneider Electric, Siemens, PTC, Switch, Trane Technologies and Vertiv are contributing simulation-ready assets and integrating their platforms. CoreWeave is using Nvidia’s DSX Air to run operational rehearsals of AI factories in the cloud before physical delivery.

“In the age of AI, intelligence tokens are the new currency, and AI factories are the infrastructure that generates them,” Huang said. It is the kind of formulation — tokens as currency, factories as mints — that reveals how Nvidia thinks about its place in the emerging economic order.

What Nvidia’s grand vision gets right — and what remains unproven

The scale and coherence of Monday’s announcements are genuinely impressive. No other company in the semiconductor industry — and arguably no other technology company, period — can present an integrated stack spanning custom silicon, systems architecture, networking, storage, inference software, open models, agent frameworks, safety runtimes, simulation platforms, digital twin infrastructure and vertical applications from drug discovery to autonomous driving to orbital computing.

But scale and coherence are not the same as inevitability. The performance claims for Vera Rubin, while dramatic, remain largely unverified by independent benchmarks. The agentic AI thesis that underpins the entire platform — the idea that autonomous, long-running AI agents will become the dominant computing workload — is a bet on a future that has not yet fully materialized. And Nvidia’s expanding role as a provider of models, software, and reference architectures raises questions about how long its hardware customers will remain comfortable depending so heavily on a single supplier for so many layers of their stack.

Competitors are not standing still. AMD continues to close the gap on data center GPU performance. Google’s TPUs power some of the world’s largest AI training runs. Amazon’s Trainium chips are gaining traction inside AWS. And a growing cohort of startups is attacking various pieces of the AI infrastructure puzzle.

Yet none of them showed up at GTC on Monday with endorsements from the CEOs of Anthropic and OpenAI. None of them announced seven new chips in full production simultaneously. And none of them presented a vision this comprehensive for what comes next.

There is a scene that repeats at every GTC: Huang, in his trademark leather jacket, holds up a chip the way a jeweler holds up a diamond, rotating it slowly under the stage lights. It is part showmanship, part sermon. But the congregation keeps growing, the chips keep getting faster, and the checks keep getting larger. Whether Nvidia is building the greatest infrastructure in history or simply the most profitable one may, in the end, be a distinction without a difference.


Meet the 102-year-old teaching seniors how to use smartphones and Windows

Simes presides over Computer Pals, a volunteer-run group devoted to helping older adults develop digital literacy. Under his guidance, the club’s lessons range from navigating Windows 11 to distinguishing between legitimate and malicious links online. His authority doesn’t stem from age or nostalgia but from curiosity – an instinct that led…

Death by Tariffs: Volvo Discontinuing Entry-Level EX30 EV in the US

Volvo is pulling the plug on its smallest and least expensive EV this week. The automaker is winding down US-bound production and imports of the EX30 and EX30 Cross Country over the coming weeks, with the last examples wrapping up the 2026 model year at the end of this summer due to financial and market considerations. In other words, tariffs are up, and sales are down.

It’s a tough time to be selling EVs in the US right now. Volvo joins a growing list of automakers reassessing or outright canceling their electric car ambitions in the US due to market and political conditions over the last year. Earlier this year, Chevrolet announced that it would be ending production of its highly anticipated Chevrolet Bolt revival after just one model year. Last week, Honda announced the cancellation of its upcoming 0-Series of US-built electric cars before even reaching production, and that’s just the tip of the iceberg.

The EX30’s arrival and short stay in the US have been fraught with challenges. The small SUV was first announced in 2023, billed as an affordable electric option starting below $35,000. I was impressed by the EX30 during my first drive review, calling it my most anticipated affordable EV of 2024. Volvo initially planned to keep manufacturing costs down by building the EX30 in China, but Biden administration tariffs forced the automaker to move production of US-bound examples to its plant in Ghent, Belgium.

Preproduction software issues further delayed the EV’s limited arrival to late 2024, with sales ramping up in early 2025 — just in time to get hit by the Trump administration’s unpredictable new tariffs. Today, the EX30 starts at $40,344 in the US, then climbs to just shy of $50,000 for the dual-motor model with the best tech — a tough sell for a subcompact SUV even at the best of times. In 2025, Volvo reported only 5,409 EX30s sold in the US and a 60.5% decrease in overall electrified vehicle sales versus 2024.

When reached for comment, a representative from Volvo confirmed that “Volvo Car USA has decided to end sales of the EX30 and EX30 Cross Country in the US market after the 2026 model year.”

The automaker tells me that the EX30 will remain available in global markets and will continue to be imported and sold in Mexico and Canada. Recently, Volvo’s flagship EX90 — which is built at Volvo’s South Carolina factory — ceased 2026 model year exports to Canada, a victim of retaliatory tariffs aimed at the US. When asked how this shakeup will affect its roadmap, a Volvo representative told CNET that the company’s goal of a fully electrified global lineup by 2030 remains unchanged.

“Volvo Cars’ commitment to electrification and our customers remains unchanged,” the representative told CNET, “and we look forward to continuing to bring exciting new electrified options to our customers in the US, including the all-new EX60 and upgraded EX90.”

In January, at the debut of the upcoming Volvo EX60, I argued that the new mid-range model would be a make-or-break point for the brand’s US ambitions after the tumultuous rollouts of its first two dedicated EV models. With the EX30 soon gone, and in an increasingly unforgiving market where only the strongest models survive, Volvo finds itself in an even more perilous position.


The most radical act in an age of outrage is to play

We are not divided by accident; we are distracted on purpose. The antidote to that manipulation is to reconnect with what makes us human, often through something as simple as play. Spend five minutes scrolling, and you can feel the machinery of social media at work: the pulse of outrage, the invitation to pick a side, the subtle suggestion that if you are not angry, you are not paying attention. Families fracture over headlines, friendships dissolve over algorithms, and disagreement begins to feel like disownment. All the while, the crises never seem to stop.

This emotional volatility is conditioned. News cycles are engineered to provoke because fear keeps us engaged, and engagement keeps us predictable. According to an analysis from the Pew Research Center, nearly 60% of Americans express low confidence in journalists to act in the public’s best interests. Yet even with that distrust, most of us remain immersed in the stream. We doubt it, but we cannot seem to look away because the system is designed to make disengagement feel unsafe.

A society kept in a perpetual state of alarm is easier to manage than one that thinks for itself. Benjamin Franklin warned that those who would give up essential liberty to purchase temporary safety deserve neither. His words echo loudly today. Fear narrows our thinking. It contracts our field of vision. When we are anxious, we trade autonomy for the illusion of protection.

Technology intensifies this pattern. Artificial intelligence drafts our emails, GPS replaces our internal maps, and our phones remember every number we no longer bother to memorize. The issue is not the tools themselves. After all, tools can be magnificent. The danger lies in dependence. When a tool meant to sharpen our minds begins to substitute for them, something subtle shifts. I notice it in myself. I can still recite phone numbers from childhood, numbers I dialed repeatedly. Today, if I lose my phone, I lose access not just to contacts but to competence.

That small panic reveals a deeper truth: unused faculties atrophy. And when faculties atrophy, systems built on compliance thrive. They reward predictability. Anger and fear make us predictable. Creativity, curiosity, and divergent thinking make us harder to steer. Emotional manipulation becomes simpler when imagination shrinks.

So where does sovereignty begin? Not in Washington or Silicon Valley. It begins with self-regulation. I cannot control the global news cycle, but I can control my nervous system. I can decide whether I will outsource my emotional state to the latest headline or cultivate internal stability. For me, that cultivation happens through play.

Play is autotelic, an activity whose reward is the activity itself. When I juggle, the act is the payoff. There is no external validation required. The rhythm draws my attention into the present. What does that mean? My breathing steadies, my body settles, and my mind clears. That shift is neurological, and research supports this.

One study examined the neurobiology of stress resilience and found that positive affect, novelty, and exploratory behaviors, the core elements of play, strengthen neural circuits that protect against chronic stress. In other words, play expands our adaptive capacity. Fear contracts it. Through that lens, play becomes a neurological rebellion against conditioning that thrives on anxiety.

Children demonstrate this instinctively. When they meet on a playground, they don’t need shared beliefs or background; they simply ask, “Do you want to play?” Ideology is irrelevant at that moment. The invitation to move together dissolves barriers that words often inflame. I have watched this happen in real time, tension softening in a park the moment a small footbag (also known as Hacky Sack™) circle forms. Strangers who arrived as bystanders became collaborators within seconds, drawn in by the shared rhythm of the activity. Laughter shifts the emotional frequency of the space, and as the atmosphere lifts, connection becomes easier. Elevated states foster openness, and that openness makes division harder to sustain.

Yet many children today are being funneled into narrow reward loops, where stimulation is constant but growth is limited. Screens deliver rapid bursts of dopamine that feel exciting in the moment but limit genuine exploration. At the same time, helicopter oversight reduces opportunities for real-world challenges, and when those challenges diminish, resilience inevitably weakens. A brain conditioned to expect only curated digital rewards can struggle with ambiguity, frustration, and disagreement, capacities that develop only through challenging lived experiences.

We must reclaim our agency. Real-world play, such as tossing a ball, learning to juggle, or building something with friends, reintroduces novelty, problem-solving, and collaboration. It broadens capacity in ways no algorithm can replicate. In the process, it trains adaptability, the very trait children need to navigate a world that will never be perfectly curated for them.

Some may argue that play is trivial in the face of serious global problems. I understand the impulse. Wars, economic uncertainty, and technological disruption are not games. However, a population locked in chronic stress does not solve complex problems well. Chronic fear impairs executive function and creativity. If we want wiser civic engagement, we need citizens who can regulate their own nervous systems.

Play does that. It builds resilience, flexibility, and social connection. It restores a sense of agency because the reward is internal. You are not waiting for a notification to feel validated. You are generating joy through participation. Every time I juggle in public, I signal possibility. Adulthood does not require the abandonment of joy. A playful mind is less susceptible to manipulation because it is not starved for stimulation. It does not need outrage to feel alive.

Once you understand that play is foundational, the next step becomes surprisingly simple: weave small acts of playfulness back into daily life. Laugh daily, move your body, and learn a skill that engages both hands and mind. Turn off the noise long enough to hear your own thoughts. Invite someone to play, even if it feels awkward at first. Protect your autonomy the way previous generations protected their liberties.

We may not control the macro forces swirling around us, but we can control our state. In a culture addicted to outrage, choosing play is an act of defiance. It is how we reclaim clarity, how we reconnect, and how we remember that beneath the noise, we are still human. In today’s climate, the most radical thing you can do is play.

About the Author

Alexander “Zander” Phelps, also known as zPlayCoach, is a play advocate, speaker, and the founder of HACKiDO, the Path of Play. For more than three decades, he has explored movement-based play as a pathway to cognitive vitality, emotional resilience, and human connection. Drawing from lived experience, neuroscience research, and work with schools, rehabilitation programs, corporations, and community groups, Zander teaches accessible practices such as juggling and laughter exercises to help individuals enter the PlayFlowState to regulate stress and rediscover intrinsic joy. Through workshops, public speaking, and community engagement, he continues to champion play as a lifelong practice that strengthens both brain and community.
