Indeed, the idea of data centers in orbit has gone from science fiction to a serious spending category. Elon Musk’s SpaceX has acquiredxAI (also Musk’s) and is planning a constellation of space-based data centers. Google, not to be outdone, announced Project Suncatcher in partnership with Planet, planning to launch two satellites equipped with GoogleTensor Processing Unit (TPU) AI chips by early 2027. Startup Starcloud has already filed a proposal with the Federal Communications Commission for an 88,000-satellite constellation for orbital data centers. As Starcloud’s filing suggests, these companies are all proposing fleets of satellites numbering in the thousands, each housing a rack or multiple racks of AI-grade GPUs, interconnected with each other through free-space optical links and communicating back to Earth via microwave links, either directly or through other satellites.
Proponents tout the many wonders of computing in space: abundant solar energy, free cooling, and freedom from Earth-based disturbances like earthquakes, floods, and protesters. But a sober look at the physics of space-based computing paints a much more nuanced picture.
Free cooling is perhaps the biggest misconception. Space is cold, but it also has no atmosphere. That means the best heat-removal mechanisms, conduction and convection, are off the table. The only option is radiation. To prevent a chip from overheating in space, a large, costly surface area is required to dissipate the energy and then radiate it.
Advertisement
Solar energy is abundant, but collecting it with functional solar panels that maintain perfect alignment toward the sun is a complex task requiring extensive attitude control systems. On top of that, ionizing radiation in space from cosmic rays and other sources poses a unique challenge, degrading the solar panels, the radiative coolers, and the chips themselves. Because regular maintenance in space is difficult, redundancy has to be built in at launch, and cost estimates have to account for efficiency degradation over time.
At ABI Research, where I work as an aerospace analyst, we did a rough total-cost-of-ownership comparison between a data center on Earth and one in space. It showed that the cost to launch and run a GPU in space for a year is at least an order of magnitude higher than the same feat in a terrestrial data center. Our model was simple, assuming an Nvidia H100 server rack launched with the requisite-size solar panel and radiator on a spacecraft akin to Starcloud’s pilot launch. We assumed SpaceX’s Starship was used at a highly optimistic launch cost per kilogram of US $44, and a terrestrial energy cost of $0.20 per kilowatt hour. This is a simple back-of-the-envelope calculation, but it does signal something real.
From our perspective, the cost of delivery and space hardening of the payload makes general-purpose space-based data centers difficult to justify economically today, despite the fact that data-center builders in many regions are scrambling for electric power. However, there are niche applications where the much higher costs of computing in space could be justified. Examples include preprocessing data from Earth-observation satellites, real-time detection and tracking of hypersonic missiles, and active collision avoidance in the increasingly crowded low Earth orbit. Even for these, though, contending with fundamental physics will still be a demanding challenge. And a technologically compelling one, too.
The Cooling Challenge in Space
Cooling is where physics separates the science from the fiction. The governing equation for radiative cooling, the only type of cooling available in space, is known as the Stefan-Boltzmann Law. It states that the amount of power you can radiate is proportional to the area of the radiator times its temperature to the fourth power. For a space systems architect, the implications of this law are brutal. In orbit, the only variable we can control is area. This restriction creates a geometric penalty, or a “physics tax,” for cooling in space: The more power you need to reject, the bigger the area of the radiator you need to bring along from Earth.
Advertisement
The only cooling method available in space is radiation, and the radiator area required is derived using the Stephan-Boltzmann law. For a single chip drawing 700 watts, like Nvidia’s popular H100 GPU, the area required to keep it at 20 °C is just under 3 square meters, and it goes down to 1 square meter for an operating temperature of 85 °C. However, as the radiator surface is exposed to ionizing radiation, its emissivity decreases, and after 5 years in space the required area increases by about 40 percent.
To understand how big this baseline area is in practice, I used the Stefan-Boltzmann law to model the heat-rejection area needed to keep a single chip that draws 700 watts of power—such as the H100 GPU chip, an AI stalwart—at a constant 60 °C, usually considered the sweet spot for GPU longevity and stability. I further assumed that the radiator is perfectly facing deep space, at a chilly background temperature of 3 kelvins. By this calculation, a single chip would require 1.4 square meters of radiator surface.
To put this into perspective, consider that a common AI rack can hold approximately 32 GPUs (four H100 server boards). With CPUs, memory, and networking equipment, this rack would draw around 40 kilowatts of power. This single rack includes 2.5 terabytes of memory—enough capacity to serve over 20,000 concurrent users or run 16 simultaneous instances of Llama 3, an open-source AI model. But to cool this thermal load in a vacuum, that single rack would require an 80-square-meter radiator, roughly the size of a pickleball court. For an aggregate 100-megawatt data center, you’d need at least 2,500 of those radiators.
And that’s the best-case scenario. Additional problems are hidden in the low Earth orbit environment itself. Space exposes radiators and their coatings to a chemically hostile brew of ultraviolet light and atomic oxygen, quite the opposite of a clean-room environment. Over a LEO satellite’s typical 5-year lifespan, these elements degrade the radiator’s surface properties and lower its ability to shed heat.
Including this degradation in the model reveals that as the radiator degrades from a “fresh” state to an “end-of-life” state, the physics demands a further penalty. To maintain that same 60 °C operating temperature for the GPU chips, the required surface area jumps from about 1.4 square meters per chip to nearly 2.0 square meters. In other words, the physics tax rises by 40 percent. Therefore, you must launch at least 40 percent more radiator mass, endure higher atmospheric drag, and sacrifice valuable launch volume just to survive the degradation of the thermal coating. This increase adds significantly to the launch cost and further erodes the economics of a space-based data center.
Advertisement
The Silicon Challenge in Space
Solving the heat problem is only part of the battle. The other significant challenge in low Earth orbit is ionizing radiation, which affects the computing hardware itself. Today’s satellites typically use radiation-hardenedprocessors, which are very reliable but also much more expensive, and they perform poorly compared to commercial off-the-shelf processors.
A standard rad-hard chip doesn’t have the processing power to run a modern large language model (LLM). As a result, satellite operators aspiring to launch a data center have no choice but to make a risky compromise: to use hardware meant for terrestrial use. In order to achieve the necessary compute density, orbital data centers must use the same Nvidia H100s or Google TPUs found in terrestrial server farms. The problem is that these chips are “soft” targets in space. High-energy particles can flip bits in memory or cause “latch-ups” in logic that fry the circuit.
One possible option is to shield the computers from radiation with thick, absorbent panels. However, the shielding would add significantly to the already heavy satellites. The other option is to compensate for the radiation damage with redundancy. Indeed, edge computing architects are moving toward software-defined resilience, where instead of one perfectly hardened computer, operators fly a cluster of imperfect, commercial ones whose total cost could be as low as one-tenth to one-hundredth that of the rad-hard model.
This redundant approach is used in many spacecraft, including Artemis II, which recently carried astronauts around the moon, as well as SpaceX’s flight computers and the Hewlett Packard Enterprise edge servers for the International Space Station. By running three (or more) instances of the same calculation on three different nodes and comparing the answers, the system can detect a corrupted processor. If a node fails, the “orchestrator” reboots it while the others continue the mission. While this ensures resiliency, it also means that some fraction of the compute capacity is dedicated to redundancy, further increasing the costs.
Advertisement
The Energy Challenge in Space
An often-touted advantage of space-based data centers is the seemingly unlimited supply of free, clean energy from the sun. Solar energy in orbit is indeed abundant, at 1,361 watts per square meter. Of course, capturing that free energy is made possible only by the very costly launching of large solar panels into orbit. And those solar panels also degrade over time due to radiation exposure, typically losing 1 to 3 percent efficiency per year.
Let’s say a solar array collects 1 MW of power to run an AI cluster. The laws of physics demand that the satellite must eventually radiate 1 MW of waste heat. Because the square area needed to generate the solar power—around 400 W/m2—and to reject the heat—around 450 W/m2—are nearly equivalent, every square meter of power generation now demands approximately another square meter of cooling. The radiator needs to be a structural equal, not merely a passive coating on a surface used for something else.
As Elon Musk recently noted in Davos, the most efficient radiator is one that never sees the sun. By orienting the spacecraft so the solar panels face the sun and the radiators face the deep vacuum of space, efficiency skyrockets for both. But there’s a catch: Maintaining this perfect three-way alignment—panels to sun, radiator to the void, antennas to Earth—requires complex, high-torque attitude control systems. So this configuration means more payload and more computing power. Plus, these control systems are complex components with many failure modes, which is not optimal in a situation where maintenance is difficult.
The Killer Apps for Computing in Space
Given all these challenges of deploying massive radiators for satellites in the hostile environment of space, why build data centers in space at all?
Advertisement
While training or inference on LLMs in space doesn’t seem economical today, there are other, very compelling applications for computing in space. Here are two: solving the downlink bottleneck from Earth-observation satellites and enabling collision-preventing maneuvers in the increasingly crowded low Earth orbit.
The latest Earth-observation satellites, equipped with hyperspectral and synthetic aperture radar sensors, are used for a range of important reconnaissance missions, such as battlefield intelligence, tracking the global shadow fleet of ships carrying contraband, and assessing earthquakes or infrastructure failures down to the millimeter. These systems can generate hundreds of terabytes of raw data per day that must be transmitted to Earth. However, the radio-frequency “pipes” used to downlink the data are congested, and the ground infrastructure cannot absorb the sheer volume of raw data.
Another immediate, mission-critical application for in-space computation is protecting the orbital environment. With over 17,000 satellites in orbit, the overwhelming majority of which are in low Earth orbit, avoiding collisions between these satellites is crucial. As NASA astrophysicist Donald Kessler pointed out back in 1978, a single space collision could cause a cascading effect that renders the entirety of LEO unusable.
According to SpaceX’s recent annual report, the Starlink constellation executes a collision avoidance maneuver every 2 minutes on average. Each maneuver already relies on onboard AI systems but still requires most of the processing to happen on the ground.
SpaceX’s Starlink system currently has over 10,000 satellites in low Earth orbit, each depicted here as a colored dot.
Satellitemap.space
Advertisement
As low Earth orbit gets increasingly populated, collision avoidance will have to break the traditional ground-loop model. In the megaconstellation era of space, the OODA (observe, orient, decide, act) loop must happen onboard, thereby reducing the analysis turnaround from minutes to milliseconds.
The problem is that the flight computers standard on satellites are not built for this level of processing. The complex probability models required for maneuvering cannot currently be implemented by onboard computers in conjunction with their navigation systems. Clearly, more powerful computers are needed.
This is the true economic justification for moving compute to space: to move insight generation there. By placing high-performance computing adjacent to the sensors, we can process terabytes of data in orbit and downlink only the relevant data in real time, and we can do the computations necessary to avoid satellite collisions in real time.
The Future of Computing in Space
So, assuming that some form of computing is inevitable in low Earth orbit in the foreseeable future, how will the heat be handled? The industry is currently experimenting with two main classes of solutions to cope with the Stefan-Boltzmann law.
Advertisement
One creative option is to use origami-inspired radiators, the kind used for the James Webb telescope. Companies are developing flexible, high-conductivity composite radiators that fold into a tight cube for launch and unfurl into enormous yet lightweight thermal wings in orbit.
Another possibility is to use liquid-droplet radiators. This concept proposes removing the rigid radiator structure completely and instead spraying a stream of coolant oil directly into the vacuum of space. The fluid travels through an open loop, exposed to the near-absolute zero of the void, maximizing radiative surface area before being caught by a collector and pumped back into the ship. It sounds like science fiction, but as the heat loads climb into the megawatts, liquid-droplet cooling may be the only way to cheat the mass limits of this exponential reality.
Our rough total-cost-of-ownership model uses optimistic versions of current numbers, such as launch cost, chip cost, and power use. A critic might point out that future technology will improve, both in efficiency, purpose-built designs, and costs.
Sure, the technology is bound to improve. But the critical factor isn’t just launch cost; it’s the computing power per unit mass and electric-power economics. Radiators and solar arrays can consume 65 to 70 percent of total satellite mass, and space-grade photovoltaics run orders of magnitude more expensive than terrestrial equivalents.
Current space-grade solar panels rely on germanium substrates, whose supply is concentrated in China. It will be extremely difficult to scale up availability of these substrates. A transition to radiation-tolerant perovskite solar panels or a similar alternative could change the economics significantly, but that possibility is five years away or more. The technology will get cheaper, but the bottlenecks of power and thermal architecture will remain.
Recognizing the thermal reality of cooling in space forces us to shift how we view satellite operations. We are moving away from the “launch and forget” era toward an era of “autonomous logistics.” As our thermal model demonstrated, the harsh environment of space steadily attacks the hardware. UV radiation degrades thermal coatings; cosmic rays degrade silicon. In a traditional satellite model, when the radiator degrades or the memory fails, the satellite becomes space junk. For a multimillion-dollar data center, that disposal model is potentially ruinous.
To make the economics of orbital computation work, the infrastructure must be serviceable and the rockets to launch them reusable. The orbital domain will require automated servicing vehicles capable of swapping out degraded radiator panels and upgrading fried servers. In these ways, the future of the orbital data centers is dependent on the innovations of an emergent in-space economy.
Advertisement
There’s a good argument to be made that the need for space-based computation is less of a hype cycle and more of an enabler for the new space economy. Look no further than SpaceX’s recent regulatory filings proposing a constellation of up to a million satellites in low Earth orbit. At such a scale, routing all raw data back to Earth is physically impossible; the network itself must become the data center.
However, the winners in this sector will be determined by the systems architects who most cleverly accommodate the thermodynamics and the companies with sufficient vertical integration to take on the massive costs of operating data centers in orbit. Ultimately, the physics tax is universal. Whether managing heat rejection in the vacuum of low Earth orbit or managing power density in a hyperscale facility in Northern Virginia, the constraint is never the silicon. It’s the thermodynamics.
Unlike on Mars where for decades we have had dozens of orbital and ground-based platforms zipping and scurrying about to prod at every bit of emitted radiation, rock type and twitch of dust devils in its thin atmosphere, for other planets and their moons we have to do a lot more speculative interpretation of data. Such was the case with the presumed existence of water plumes on Jupiter’s moon Europa. These now appear to have been a statistical fluke, per research by [L. Roth] et al. in Astronomy & Astrophysics.
As succinctly summarized in the article on this by [Javier Barbuzano] of Sky and Telescope, the original 2013 finding of said water plumes by the same team was based on faint UV emissions from Europa’s southern hemisphere as captured by the Hubble Space Telescope. However, in more recent captures these emissions were not detected again, leading them to reexamine their original analysis of the 2013 data.
One of the main flaws was in the assumption of where Europe was located on Hubble’s 1,000 x 1,000 resolution detector, with the re-analysis showing that they were off by a couple of pixels. A second flaw was quite understandable as since 2013 we have learned that Europa has a thin hydrogen exosphere which interacts with the Sun’s UV radiation. The resulting scattering induces a UV glow which could be mistaken for UV radiation emanating from the moon’s surface.
Advertisement
Even with this one intriguing feature turning out to be a mirage, it doesn’t make Europa any less interesting as it’s still assumed to have vast liquid water oceans. Along with Uranus’ moon Miranda this makes it very worth it to experience more of the sights and sounds of these alien worlds, whether in person or via our robotic friends.
Adam Selipsky is now CEO of Helix Digital Infrastructure. (GeekWire photo)
Former Amazon Web Services CEO Adam Selipsky is returning to the world of cloud infrastructure as co-founder and CEO of Helix Digital Infrastructure, a newly-launched company backed by more than $10 billion.
The company was unveiled Thursday by investment firm KKR, which is partnering with Nvidia, power producer Vistra, and the Kuwait Investment Authority to build infrastructure aimed at supporting the growing demand for artificial intelligence computing. Bloomberg first reported the news in April.
Selipsky brings deep experience in cloud computing and enterprise technology. He joined AWS in 2005, helped build the business during its early years, served as CEO of Seattle-based Tableau Software from 2016 to 2021, and then returned to lead AWS before stepping down in 2024. He joined KKR as a senior technology and AI strategy advisor last September.
Helix plans to develop and operate data centers and related infrastructure, including power generation, fiber connectivity and land development. The company is targeting large technology customers that are racing to expand AI capacity amid increasing constraints around electricity availability, grid access and data center construction.
“Large users of digital infrastructure have an urgent need to reduce complexity and unlock new capacity,” Selipsky said in a statement. “Helix combines significant long-term capital with the capabilities and expertise to deliver holistic AI infrastructure solutions with speed and scale.”
Advertisement
Appearing on CNBC on Thursday morning, Selipsky said the large hyperscalers that will become Helix’s customers need help and that it’s a “misnomer” that complex data center projects are easy to scale.
“It is hard, and it is becoming even harder with AI and the pace of the buildouts and the scale of the buildouts,” Selipsky said. “They absolutely have the capabilities to do a lot of these things in house, and they absolutely need reliable partners.”
He said that more than 25% of announced data center projects are not delivering, and that’s why he’s in the middle of bringing on seasoned operators who can deliver for customers.
KKR and the Kuwait Investment Authority are providing the initial capital backing for the venture. Nvidia is joining as a founding investor and strategic partner, while Vistra will serve as Helix’s preferred power provider.
Advertisement
Waldemar Szlezak, KKR’s global head of digital infrastructure, will serve as chief investment officer. The company said it plans to bring in additional institutional investors over time.
The launch comes as demand for AI computing continues to drive massive investment in data centers, power generation, and other infrastructure needed to support increasingly sophisticated AI systems. It also comes at a time of growing public concern over data center construction in many communities, including the City of Seattle which just instituted a one year emergency ban on major data centers.
We’ve reached out to Helix for additional comment, and we will update this post as we learn more.
Anthropic launched Claude Corps: $150M to place 1,000 AI fellows at 400+ nonprofits. $85K salary, no degree needed. First 100 start October. Apps close July 17.
Anthropic is donating $150 million to place 1,000 AI fellows inside nonprofit organisations across the United States. The programme, called Claude Corps, will pay early-career workers $85,000 plus benefits for a year-long placement where they help nonprofits use Claude more effectively. Applications opened Wednesday and close on July 17.
No college degree is required. Applicants must be 18 or older, hold US work authorisation, and have no more than two years of full-time work experience. The first cohort of 100 fellows starts in October 2026. Subsequent cohorts begin in January and August 2027.
Each of the 400+ host organisations will receive a $10,000 grant and free Claude credits. Anthropic partnered with CodePath, a San Francisco nonprofit that helps first-generation and low-income students enter the tech workforce, to manage recruitment and training.
Advertisement
“We hope this program will expand and become a pillar of our strategy to help humankind realize the benefits of AI while also managing its risks,” said Anthropic President Daniela Amodei.
The programme is modelled loosely on service corps like AmeriCorps and Teach For America, but with a corporate sponsor and a product at its centre. Fellows are trained specifically on Claude. The organisations they serve will build their workflows around Claude. When the fellowship ends, the nonprofits are left with AI infrastructure tied to Anthropic’s ecosystem.
That dual purpose has drawn criticism. Fortune noted the “fox guarding the henhouse” dynamic: a $965 billion AI company is training the nonprofit sector to depend on its own product, funded by a donation that represents less than 0.02% of its valuation. Anthropic frames it as philanthropy. Sceptics see distribution strategy wrapped in a public benefit narrative.
Regardless of the framing, the programme addresses a real gap. Most nonprofits lack the staff, budget, and technical knowledge to adopt AI tools, even when those tools could meaningfully improve operations. Anthropic’s $100M Claude Partner Network, launched earlier, targets enterprises. Claude Corps targets the organisations that cannot afford enterprise partnerships.
Advertisement
The timing is deliberate. Anthropic is preparing for an IPO and positioning itself as the responsible AI company in a field dominated by OpenAI’s commercial aggression and Google’s scale. A $150 million nonprofit fellowship is a narrative play as much as a product play. Whether 1,000 fellows can make a meaningful difference across 400 organisations depends on whether the programme outlasts its PR value. Anthropic’s policy framework, published this week, calls for AI’s benefits to be “broadly shared.” Claude Corps is its first concrete attempt to deliver on that promise.
IPOs can be volatile, especially for retail investors. SpaceX is no exception.
Sundry Photography/Adobe Stock
I just did a quick Google search for SpaceX IPO. How many hundreds of articles are we actually expected to read about this?
Given the buzz around Friday’s big IPO, there are a few misconceptions worth addressing upfront. While many people view SpaceX as a massive, dominant space enterprise, it’s more complicated than that.
Advertisement
“In reality, it’s a very successful but fairly small satellite launch company, bolted onto a stagnant money-losing social media company and a money-incinerating AI company, and then sprinkled with a lot of hype about humankind going interplanetary,” said Robin Wigglesworth, editor of the Financial Times’ finance blog, Alphaville.
In other words, perhaps it’s more akin to a vertically integrated space and communications company with ambitious, high-risk side bets. Sure, at its center, SpaceX is a launch company that designs rockets (like Falcon 9 and Starship) and sells access to space. But around that, it has those related businesses — most notably Starlink, its satellite internet network, and xAI, which SpaceX acquired in February 2026. And since xAI includes the social media platform X and X’s chatbot, Grok, they’re also under the SpaceX umbrella.
X hasn’t been durable in terms of revenue. And, like most cash-burning AI enterprises, xAI is expensive to run and is reporting very large losses.
One could say the SpaceX ecosystem revolves around a single goal: building the infrastructure needed for global connectivity and, eventually, space settlement. But a major concern is that SpaceX’s overall package is driven more by hype and momentum than by its proven profitability.
Advertisement
Wigglesworth said the biggest immediate risk is straightforward: The stock could drop soon after it begins trading. That outcome would affect both the company and investors, though it wouldn’t necessarily signal broader economic trouble. As he noted, IPOs “do badly all the time.”
In the first few weeks after the IPO, price movements may be misleading. The opening day can be volatile, with banks helping stabilize prices and strong retail demand potentially pushing shares higher. We’ll also see index funds start to buy in, which can help nudge the price up a bit.
However, as Wigglesworth pointed out, the more meaningful test will come after a month, when the market determines whether there is sustained demand “for a company trading at some of the juiciest valuation multiples we’ve seen in history.”
So here’s another misconception to address: If SpaceX is popular, it’s safe to buy, right?
Advertisement
I didn’t have to read too many articles to get an answer to that.
“Popularity and renown are bad indicators for what makes a successful investment,” Wigglesworth told me. “Even good companies can be bad investments at a dumb price.”
AI-generated code is growing faster than security oversight mechanisms
Manual reviews struggle to keep pace with machine-generated software
Security leaders fear insecure coding patterns spreading through development pipelines
Artificial intelligence coding assistants have spread across development teams faster than security frameworks can adapt to.
New Salt Security research has claimed 90% of security leaders now report active concerns about risks posed by AI-generated software.
However, organizations continue embracing AI tools because they accelerate coding tasks, reduce time spent on repetitive work, and increase software delivery speed.
Latest Videos From
Human review cannot handle AI speed
Security leaders believe that development practices designed before AI became mainstream may no longer provide sufficient oversight.
Advertisement
Nearly a third (29%) of respondents identified insecure coding patterns as the primary risk introduced by AI assistants.
These systems learn from massive training datasets that contain their own flaws and outdated practices.
An AI tool can generate code that appears fully functional while quietly reproducing vulnerabilities a human might have caught.
Advertisement
Sign up to the TechRadar Pro newsletter to get all the top news, opinion, features and guidance your business needs to succeed!
This problem resembles how antivirus software must constantly update its definitions because new threats emerge faster than signature databases can grow.
The difference here is that no central authority tracks every insecure pattern an AI might replicate – as despite the widespread anxiety that AI introduces, more than one-third of organisations still depend on manual code reviews before any launch.
Reliance on human checking becomes structurally problematic when AI produces code at volumes no team can inspect thoroughly.
Advertisement
That method worked when developers wrote software at human speed, but it fails when AI accelerates output dramatically.
Reviewer fatigue sets in quickly, teams apply standards inconsistently, and security requirements get interpreted differently across departments.
AI coding assistants are fundamentally changing how software is built, but governance has not kept pace,” said Roey Eliyahu, CEO and co-founder at Salt Security.
Advertisement
“Most organisations recognise the risks, but many are still trying to manage AI-generated code using security processes designed for a pre-AI world.”
This approach does not scale any better than using a single email inbox to handle millions of daily messages without filtering or automation.
Enterprise complexity makes enforcement harder
Larger organisations with more than 500 employees face governance challenges that smaller firms simply do not encounter.
Advertisement
Distributed teams use different tools, follow varied workflows, and apply security standards with inconsistent rigour across regions.
The risk of developer overreliance on AI assistants grows proportionally with team size and delivery pressure.
Security agencies, including government cybersecurity bodies, have previously warned that AI systems expand attack surfaces and complicate accountability structures significantly.
Without better visibility into where AI-generated code enters the pipeline, governance remains guesswork dressed up as process.
Advertisement
Treating AI coding assistants as components of the software supply chain — similar to vetting any third-party malware risk — offers a more realistic path forward than hoping manual review will somehow catch up.
Well there’s your problem. (Credit: Bits und Bolts, YouTube)
Recently [Bits und Bolts] stumbled over a pair of Dragon 3000 branded 3dfx Voodoo 2 cards in his unfixed cards pile, and decided that the best course of action was to not only fix them, but also run them in SLI for some sweet Unreal Tournament action. Naturally, these cards being in the broken cards pile meant that he first had to figure out why they were broken and fix all issues.
The advantage of having two identical Voodoo 2 cards is of course that any missing components, like some resistors on one card, could be referenced on the other card. Beyond that it was mostly a matter of reflowing clearly corroded pins on the ICs and replacing damaged resistors and resistor arrays before the first tests could be run.
Using the mojo utility it was easy enough to spot that there were still some lingering issues, with clear issues visible in 3D games as well. These were tracked down to a dodgy pin on one of the texture mapping units (TMUs) that needed some more reflowing, and a very sneaky resistor array that was cracked but not obviously so until prodded with a multimeter.
With both cards now making happy noises when individually tested, it was time to go full SLI, fire up the Pentium 2 system and enjoy the glory of 24 MB of VRAM at high resolutions in Unreal Tournament. Considering that the bloke who had sent in these cards had found them while cleaning up a shed, it’s quite amazing how little rework was needed to once again party like it’s 1999.
GenAI image generators like Stable Diffusion do not draw a picture pixel by pixel from left to right. They start with noise and iteratively refine the entire image in parallel until it converges, in a process known as diffusion. For years, applying that same principle to text generation had remained out of reach at scale.
Standard language models work like a typewriter: one token at a time, left to right, with no ability to revise a committed output. That pattern works in the cloud, where batch sizes keep GPUs saturated. For local inference or low-concurrency deployments, the GPU is idle most of the time.
Google’s DiffusionGemma, released this week, is an open source experimental model that applies diffusion to text generation at production scale. Built on the Gemma 4 backbone and released under the Apache 2.0 license, it is the first diffusion language model natively supported in the open source vLLM inference platform. It generates a 256-token block in parallel rather than sequentially, with every token position attending to every other. Google says DiffusionGemma generates text up to 4x faster than standard models on GPUs. At batch size 1 on a single Nvidia H100, the FP8 version reaches 1,008 tokens per second. On H200, it hits 1,288 — roughly six times a standard autoregressive baseline, according to vLLM benchmark results published today.
Despite the speed gains, Google did not oversell the release. The company’s launch post acknowledged directly that DiffusionGemma’s overall output quality is lower than standard Gemma 4, adding “For applications that demand maximum quality, we recommend deploying standard Gemma 4.”
Advertisement
What DiffusionGemma does
DiffusionGemma does not generate tokens in order. It starts with a block of 256 random placeholder tokens, effectively a blank canvas, and runs multiple refinement passes over the entire block at once. On each pass, it evaluates every position and locks in the ones it is most confident about. Uncertain positions get randomized and reconsidered on the next pass, with the model using what it resolved in the previous round to inform the next attempt. The block converges progressively until enough positions stabilize to anchor the rest.
Two things follow from that architecture.
Self-correction. An autoregressive model that commits to a wrong token is stuck with it, because subsequent tokens are already conditioned on the mistake. DiffusionGemma can identify low-confidence positions and re-evaluate them on the next pass.
Bidirectional context. Every position attends to every other position in the block simultaneously, including tokens that appear later in the sequence. That makes the model structurally better suited to constrained generation tasks where left-to-right generation fails.
Google demonstrated both properties with a fine-tuned Sudoku solver. The base model solved zero puzzles. After fine-tuning on a Sudoku dataset, it reached an 80% success rate and converged in 12 denoising steps rather than 48. The efficiency gain came directly from the model’s ability to self-correct and stop early.
How it was built
DiffusionGemma runs as a 26B Mixture of Experts model that activates only 3.8B parameters during inference. Quantized, it fits within 18GB VRAM on consumer hardware including the Nvidia RTX 4090 and 5090. Google and NVIDIA also optimized for enterprise Hopper and Blackwell servers using NVFP4 kernels.
Advertisement
The vLLM integration required new work because DiffusionGemma does not fit the standard serving model. A typical vLLM batch applies the same attention type to every request. DiffusionGemma requests alternate between causal and bidirectional attention as they cycle through prompt reading, canvas refinement and block commit. The team built per-request attention switching into both the Triton and FlashAttention 4 backends and reused the existing speculative decoding path for the refinement loop.
The new ModelState interface the team built for this integration is designed to support additional diffusion models in vLLM as they emerge.
Where the speed wins and where it does not
DiffusionGemma’s speed advantage is real but conditional. Where it applies depends entirely on deployment context.
The numbers. At batch size 1 on a single H100, vLLM’s published benchmarks put the FP8 model at roughly five times a standard autoregressive baseline. On H200, roughly six times. Those peak figures reflect optimal conditions: single user, dedicated hardware, FP8 quantization.
Advertisement
Where it wins. Local inference, single-user applications and low-concurrency serving. In those conditions the GPU has spare compute and memory bandwidth is the bottleneck. DiffusionGemma’s parallel block generation fills that gap.
Where it does not. High-throughput cloud serving. When a server is batching hundreds of concurrent requests, autoregressive models already saturate available compute and DiffusionGemma’s parallel decoding provides diminishing returns.
The quality ceiling. Guilherme O’Tina, an AI researcher, put a finer point on it on X. “Local artifacts vs hallucinations are different problems and that decides where this actually wins,” O’Tina wrote.
How it compares
Diffusion language models are not new. Researchers have built them at smaller scales for several years, and Inception Labs’ Mercury Coder applied the approach commercially to coding tasks in 2025. What DiffusionGemma adds is scale — a 26B MoE backbone, native vLLM serving and a general-purpose instruction-tuned model rather than a domain-specific one.
The more useful comparison for engineers evaluating this against existing inference tooling is speculative decoding, and the distinction matters. Speculative decoding keeps a standard autoregressive target model and uses a smaller draft model to guess several tokens ahead. The target model verifies them in one pass. If sampling is correct, the output distribution stays identical to the target. The architecture is unchanged.
Advertisement
Andrew Kuncevich, an ML and AI researcher focused on production AI systems, put it directly on X. “DiffusionGemma is different. It does not just guess future tokens. It creates a noisy 256-token canvas and repeatedly denoises the whole block in parallel. So it’s not just a decoding trick — it’s a different generation paradigm,” Kuncevich wrote.
Compared to standard Gemma 4, the trade is speed for quality. Google’s benchmark data shows DiffusionGemma below standard Gemma 4 on general output quality metrics, with the gap varying by task.
On structured constrained tasks, including code infilling, template generation and problems requiring bidirectional constraint propagation, the architecture has a structural advantage that fine-tuning can surface, as the Sudoku result demonstrates. On open-ended generation, standard Gemma 4 remains the stronger option.
What this means for enterprises
DiffusionGemma serves via a standard vLLM OpenAI-compatible endpoint with no diffusion-specific pipeline changes required.
This is not a general-purpose model upgrade.
Advertisement
For teams running local or low-concurrency inference, the architecture choice just expanded. Until now, cutting generation latency on dedicated GPU hardware meant using a smaller model and accepting the quality trade-off. DiffusionGemma offers a third path at the same parameter footprint, on consumer hardware, with same-day vLLM support.
For constrained generation workloads, bidirectional attention is worth evaluating. Code infilling, structured data generation and tasks where correct output depends on context not yet generated are where this architecture has a structural edge.
The ModelState interface built for this integration is designed to generalize as additional diffusion models emerge.
The quality trade-off is real and Google acknowledges it. For teams running local inference on dedicated GPU hardware, this is worth testing.
Xiaomi’s MiMo AI team has open-sourced MiMo Code V0.1.0, a terminal-native AI coding assistant that the Chinese electronics giant says outperforms Anthropic’s Claude Code on key agentic coding benchmarks, especially on long-horizon, multi-step tasks (200+ steps) — at least, according to its own internal beta release and survey of 576 developers.
It’s also bundling limited-time free access to MiMo-V2.5, its multimodal flagship model with a million-token context window, requiring no registration to get started.
The release was announced June 10, 2026 in a post on the social network X from the official @XiaomiMiMo account, which described the tool as “more than an AI coding assistant in your terminal — it’s the smartest coding partner you’ll ever work with.”
MiMo Code is available now on GitHub under an MIT license, and installs with a single terminal command (curl -fsSL https://mimo.xiaomi.com/install | bash) on macOS and Linux or via npm (npm install -g @mimo-ai/cli) on Windows.
Advertisement
The project is a fork of the open-source OpenCode agent, which Xiaomi has extended with its own memory architecture, workflow modes, and model harness.
The end of AI coding agents’ amnesia?
As any avid vibe coder would surely attest, AI coding agents degrade over long working sessions: as the context window fills, earlier decisions, conventions, and task state get compacted away or lost entirely, forcing developers to re-explain their projects.
Xiaomi argues this approach is doomed at scale. “What we need is not better compression, but an explicit storage-and-retrieval mechanism that decides what information should be written into persistent structures, and when it should be recalled,” the MiMo team noted in their launch blog.
MiMo Code attacks this with a cross-session memory system, powered under the hood by SQLite FTS5 full-text search, that spans four layers: project memory (a persistent MEMORY.md file), session checkpoints, scratch notes, and per-task progress logs.
Advertisement
The note-taking is key, here: Rather than forcing the primary coding agent to pause its work to take notes, the system deploys an independent “checkpoint-writer” subagent.
Think of it the primary coding agent as a construction contractor working to build a massive mansion alongside a dedicated architect, the checkpoint-writer subagent. While the main agent focuses on building out the physical structure, the subagent updates the blueprints in real time, noting decisions, issues, and the actual lay of the land as the construction project progresses.
When the context window approaches its limits — the contractor gets lost in the half-built mansion — it can consult the subagent and find its place again. In the case of MiMo Code, the system simply rebuilds the environment from structured checkpoints with the relevant context, ensuring no loss of operational momentum.
Two self-improvement mechanisms round out the system: a /dream command that periodically (roughly every seven days) reviews historical sessions, deduplicates them, and compresses them into long-term memory, and a “distill” function that mines past sessions for repeated workflows that can be automated, following a similar approach taken recently by OpenAI and Anthropic with their various models.
Advertisement
Impressive performance on software engineering (SWE) benchmarks
According to benchmark figures published in Xiaomi’s technical blog post, MiMo Code paired with MiMo-V2.5-Pro outperformed Claude Code paired with Claude Sonnet 4.6 on all three evaluations tested:
MiMo Code vs. Claude Code benchmark performance. Credit: Xiaomi
SWE-bench Verified: 82% vs. 79%
SWE-bench Pro: 62% vs. 55%
Terminal Bench 2: 73% vs. 69%
The harness itself accounts for a measurable share of the gain. Running the same MiMo-V2.5-Pro model in both harnesses, MiMo Code scored 62% on SWE-bench Pro versus 57% for Claude Code, and 73% on Terminal Bench 2 versus 68% — roughly five points each, attributable purely to the agent system rather than the model.
Xiaomi notably did not publish comparisons against OpenAI’s Codex or Google’s Gemini CLI — Claude Code is the sole named competitor throughout its materials, a telling choice of benchmark target.
Advertisement
Independent reference points suggest why. On the official Terminal-Bench 2.0 leaderboard maintained at tbench.ai, OpenAI’s Codex CLI running GPT-5.5 scores 82.2% — roughly nine points above MiMo Code’s self-reported 73% — and OpenAI’s own GPT-5.5 announcement claims 82.7% on the same benchmark.
On SWE-Bench Pro, however, the picture flips: OpenAI reports GPT-5.5 at 58.6%, below MiMo Code + MiMo-V2.5-Pro’s claimed 62%. (MiMo Code does not yet appear on either official leaderboard, and cross-comparing self-run numbers against leaderboard submissions carries the usual configuration caveats.)
Perhaps more interesting than the offline benchmarks: Xiaomi says it ran a human double-blind A/B evaluation during its internal beta, covering 576 developers working in 474 real private repositories, producing 1,213 judged head-to-head pairs against Claude Code using the same target model.
Under 200 execution steps, the two systems split roughly 50/50 — but past 200 steps, MiMo Code’s win rate rose above 65%, supporting the company’s thesis that its memory and state-management architecture pays off specifically on long-horizon work.
Advertisement
Xiaomi itself concedes the standard benchmarks “still measure one-shot problem-solving ability” and don’t capture the tool’s multi-session design goals.
As always, these are vendor self-reported numbers that haven’t been independently verified, and head-to-head harness comparisons are sensitive to configuration. But the claims are consistent with a broader industry pattern: scaffolding and harness engineering are becoming as important as raw model capability in agentic coding performance.
Easy integration with existing developer systems and voice control
From a user experience standpoint, MiMo Code is designed to live where developers already work. It operates directly in the terminal, reading and writing files, running commands, and managing Git.
Out of the box, the tool requires zero configuration, connecting automatically to “MiMo Auto”—a free-for-a-limited-time channel powered by Xiaomi’s multimodal MiMo V2.5 model, which boasts a massive million-token context window. For developers migrating from existing environments, the transition is frictionless: MiMo Code automatically imports MCP servers, custom skills, and API configurations from Claude Code.
Advertisement
Other noteworthy features include:
Compose mode: Pressing Tab switches the agent into a specification-driven workflow in which the developer describes a high-level goal and the system autonomously executes the full development cycle — design, planning, coding, testing, and review — following what Xiaomi describes as a “heavy planning upfront, stable verification later” strategy.
Voice control: Built on Xiaomi’s MiMo-ASR speech recognition with TenVAD voice activity detection, developers can dictate and modify instructions verbally and speak commands like “send” and “execute” for fully hands-free operation (available for logged-in users).
According to Xiaomi, the gains from the agent harness itself are measurable. Running the same underlying MiMo model in both harnesses, the company says MiMo Code scored 62% on SWE-Bench Pro versus 57% for Claude Code, and 73% on Terminal Bench 2 versus Claude Code’s 68% — roughly five percentage points better on each, attributable purely to the agent system rather than the model.
As always, these are vendor self-reported numbers that haven’t been independently verified, and head-to-head harness comparisons are sensitive to configuration. But the claim is consistent with a broader industry pattern: scaffolding and harness engineering are becoming as important as raw model capability in agentic coding performance.
Aggressively affordable
The bigger lure for many developers may be what’s bundled in.
Advertisement
MiMo Code ships with “MiMo Auto,” a zero-configuration channel offering free, limited-time access to MiMo-V2.5 — the natively multimodal model Xiaomi released in late April 2026, a sparse mixture-of-experts design with 310 billion total parameters (just 15 billion active per inference) and a 1 million token context window, which the company positions as matching Anthropic’s Claude Sonnet 4.6 in multimodal agentic work.
The larger MiMo-V2.5-Pro — a 1.02-trillion-parameter mixture-of-experts model with 42 billion active parameters and a hybrid-attention architecture — led the open-source field on Xiaomi’s ClawEval agentic benchmark with a 63.8% success rate while consuming only about 70,000 tokens per trajectory, roughly 40–60% fewer than Anthropic’s Claude Opus 4.6, Google’s Gemini 3.1 Pro, or OpenAI’s GPT-5.4 needed for comparable results.
Notably, the V2.5-Pro’s post-training was explicitly designed to instill “harness awareness” — training the model to manage its own memory and context within agent scaffolds like Claude Code or OpenCode — making a Xiaomi-built harness optimized around that capability a logical next step.
Advertisement
Pricing is similarly aggressive: MiMo-V2.5 starts at $0.40 per million input tokens and $2.00 per million output tokens, while V2.5-Pro runs $1.00/$3.00 per million (input/output) up to 256K context, doubling beyond that, with cache hits dropping input costs to as little as $0.20–$0.40 per million, making it among the cheapest frontier models available globally.
For developers who don’t want Xiaomi’s models at all, MiMo Code also supports third-party backends — including token plans from DeepSeek, Moonshot’s Kimi, and Zhipu’s GLM — along with any OpenAI-compatible API, mirroring the bring-your-own-model flexibility of its OpenCode parent.
Terminal AI coding agent wars go global
MiMo Code lands in an increasingly crowded field of terminal-based coding agents: Anthropic’s Claude Code, OpenAI’s Codex CLI, Google’s Gemini CLI, and open-source players like OpenCode and Aider.
What’s new is the entrant. Xiaomi — the world’s third-largest smartphone maker, with a fast-growing EV business — has been methodically building its MiMo AI division since the release of the MiMo-7B reasoning model in April 2025, following with the MiMo-VL vision-language series, MiMo-V2-Flash, the 1-trillion-parameter MiMo-V2-Pro in March 2026, and the V2.5 flagship family in April.
The effort is led by Fuli Luo, a veteran of DeepSeek’s disruptive R1 project, who has characterized Xiaomi’s frontier push as a “quiet ambush” — and backed it with a 100-trillion free token grant for builders announced alongside the V2.5 launch.
Advertisement
The playbook is familiar from DeepSeek, Alibaba’s Qwen, MiniMax, and Moonshot AI’s Kimi series: release genuinely capable models and tooling under permissive licenses at a fraction of U.S. lab pricing, and convert the resulting developer mindshare into a durable ecosystem.
By pairing an open-source agent harness with a free frontier-class model, Xiaomi is effectively eliminating both the licensing and the usage cost of entry — at least for now.
What it means for enterprises and technical decision-makers
For engineering leaders, MiMo Code is a low-risk, potentially high-value evaluation candidate: MIT-style licensing permits modification and commercial integration, the OpenCode lineage means the architecture is inspectable, and the bring-your-own-model support means it can be pointed at an internally approved endpoint rather than Xiaomi’s cloud.
The persistent memory system addresses a real and widely felt pain point in agentic development workflows — one that competitors are also racing to solve.
Advertisement
The countervailing considerations: the “free for a limited time” model access is by definition temporary and routes code context through Xiaomi’s servers, which will be a non-starter for organizations with strict data-residency or IP policies; the benchmark edge over Claude Code is self-reported; and a V0.1.0 release number signals exactly what it suggests about maturity.
Teams subject to U.S. government procurement restrictions on Chinese technology vendors should also weigh that context before adopting.
Many mid range phones stick to familiar shapes and modest power reserves, yet Tecno stepped forward with the Pova 8 5G carrying both a giant battery and an unexpected visual flourish on the rear. That flourish takes the form of a small dot matrix panel tucked into the camera module. What looks like a third lens from a distance actually serves as a compact LED grid capable of displaying simple animations and patterns. Tecno named it the Alive Matrix Display, and it activates for incoming calls, new notifications, charging progress, or even active gaming moments. Around 49 different animations come preloaded, with options to personalize the behavior and appearance.
Owners can watch the lights on the camera island pulse or evolve into shapes that correspond to the situation, transforming what would otherwise be a rather standard video setup into something considerably more dynamic. The rear panel that snaps on features a sequence of geometric lines that give it a semi-transparent appearance, and it comes in a range of colors, all of which help the lights show through when switched on. The front panel includes a 6.76-inch screen with a 144Hz refresh rate, allowing videos and games to run smoothly. That screen is also bright enough to be seen outside, and the built-in eye strain reduction is especially handy if you plan on using it for extended periods of time.
Google Pixel 10a is a durable, everyday phone with more[1]; snap brilliant photography on a simple, powerful camera, get 30+ hours out of a full…
Unlocked Android phone gives you the flexibility to change carriers and choose your own data plan; it works with Google Fi, Verizon, T-Mobile, AT&T…
Pixel 10a is sleek and durable, with a super smooth finish, scratch-resistant Corning Gorilla Glass 7i display, and IP68 water and dust protection[4]
Under the hood, a MediaTek Dimensity 7100 CPU handles all of the daily tasks and mild gaming demands, with some specialty chips helping to boost signal strength in areas where it is a little weak. A large graphite layer provides cooling, and the phone remains comfortable to handle even after hours of gaming. In terms of storage and memory, the launch models hit a good balance for most people: not too much, but enough to avoid feeling limited. The camera setup is quite standard, with a 50-megapixel Sony sensor that supports autofocus and zooming, as well as a second lens for group shots. The selfie camera is decent for video calls, but let’s be honest, the lights on the phone’s back are the main attraction.
Advertisement
Another important feature is the power delivery system, which incorporates an 8000 milliamp hour battery with a certified multi-day runtime in regular use, as well as 45 watt wired charging that can charge the battery to 50% in 35 minutes. If necessary, you can even use the phone to charge your wired earbuds or another phone. As an added benefit, it appears that the battery will still perform effectively after thousands of charge cycles.
The phone runs Android 16 with Tecno’s HiOS 16 on top, and the company promises to keep the software updated for an extended period of time. There are also some AI-powered extras, such as photo cleanup and video summaries, as well as noise reduction during calls, which will only be available in specific areas. If you buy in a supported region, you will also receive additional cloud storage. When it comes to making sure the phone survives the rigors of everyday life, Tecno has it covered. The phone is resistant to dust and water splashes, and it’s been built to withstand a few accidental drops and bumps. Even though it has a pretty healthy battery, it’s only 9 millimeters thick, though it’s a little heavier due to the power inside.
The starting price in India is approximately 30,000 rupees ($314) for the lower memory version, which will be available from all major online retailers within the next week or so. So, if you’re searching for a phone with a long battery life and a nice design, this one might be worth considering, even if it’s not the most powerful camera phone on the market or made of highest-quality materials. [Source]
If you’re a CrossOver user on Intel or use 32-bit gaming bottles, your time is up with version 27. 64-bit bottles and Apple Silicon are now required.
Gaming on Mac has always been a bit of a wasteland, but that doesn’t stop some folks from trying. The CrossOver app for Mac brings Windows games to the platform, and it gets better with each update.
However, the latest update, CrossOver 27, will have to make some sacrifices to make development a little more streamlined. It is getting ARM64 builds for both Mac and Linux, but CrossOver 27 will only work on macOS Sonoma or newer.
There’s also a final warning about those who still may be using 32-bit gaming bottles. Users are urged to move their 32-bit games to 64-bit bottles, or they will no longer function.
The developer did note that this should affect a small percentage of users overall. Around 97% of CrossOver users are running macOS Sonoma or newer.
Removing legacy support will allow the development team to focus on UI and optimization for one set of computers instead of maintaining Intel-compatible systems. It also means that a new user interface will debut at some point in a future release.
If you are on an Intel machine or running an older version of macOS, the good news is that CrossOver 26 won’t suddenly combust. Simply don’t pay for the new version or attempt to upgrade and everything will work as is, hopefully.
Advertisement
However, note that if you do keep CrossOver 26, your games could run into compatibility issues if they are updated. Also, newer operating systems may cause problems with the older software.
Eventually, your only choice might be to finally move to Apple Silicon.
You must be logged in to post a comment Login