Connect with us
DAPA Banner
DAPA Coin
DAPA
COIN PAYMENT ASSET
PRIVACY · BLOCKDAG · HOMOMORPHIC ENCRYPTION · RUST
ElGamal Encrypted MINE DAPA
🚫 GENESIS SOLD OUT
DAPAPAY COMING

Tech

Are your hybrid meetings doing more harm than good? New survey finds many of us ‘forget’ about remote colleagues

Published

on


  • Hybrid meetings can leave remote workers feeling excluded, Jabra study finds
  • Unsuitable and dated setups cause regular meeting delays and technical failures
  • Better meeting room kit and clear meeting purposes could improve engagement

Around half of remote participants say they’re forgotten, talked over or excluded during hybrid meetings, a new study from Jabra has revealed, indicating that hybrid in-person and remote meetings might not be as effective as we’d thought.

The issue is particularly evident when multiple participants are in a physical room, with others joining online. But more than that, women (16%) and junior workers (26%) are more likely to feel they’re being excluded.

Source link

Advertisement
Continue Reading
Click to comment

You must be logged in to post a comment Login

Leave a Reply

Tech

Amazon claims data centers are 7-times more water-efficient than rivals as Seattle pauses new builds

Published

on

Pipes carrying reclaimed water for cooling at an Amazon Web Services data center. (AWS Photo)

Amazon Web Services on Thursday announced that efforts to curb water use at its data centers have made it seven times more water-efficient than the industry average.

The company says it’s 75% of the way toward its goal of being water positive by 2030, meaning for each gallon consumed at a data center, it will return a greater volume to the same community where it was drawn.

Data center operators are trying to address concerns about water and energy usage as AI adoption drives massive expansion of the facilities.

Even in Amazon’s backyard, resistance is growing. Seattle’s city council this week unanimously approved a one-year emergency moratorium on new large data centers inside city limits.

AWS executives said the reality of these facilities can differ from public perception.

Advertisement

“As we’ve been engaging with our local communities, they’ve been very pleasantly surprised about how little water we are using,” Kerry Person, AWS vice president of Data Center Operations, told GeekWire. “We’re starting to share more and more of this information publicly to really just educate folks.”

Data centers use a variety of strategies to keep their electronics cool. Those include fans, air that’s cooled using evaporated water, air conditioning and direct liquid cooling. The approaches involve resource tradeoffs: air conditioning draws more electricity but saves water, while evaporative cooling is less energy-intensive but consumes more water.

AWS uses fans to cool its facilities about 90% of the time, drawing in outside air, blowing it past server racks and releasing it back outside. The company switches to evaporative cooling when outside temperatures exceed roughly 85 degrees. Another water savings was gained by researching the maximum temperatures its electronics can tolerate, and running machines under warmer conditions.

@media (max-width: 600px) {
aside.callout { float:none !important; max-width:100% !important; margin-left:0 !important; margin-right:0 !important; }
aside.callout .callout-img { display:none !important; }
}

That allows the company to use 0.12 liters of water per kilowatt-hour of operations, compared to an industry average of 0.84 liters. The rate applies to both Amazon-owned facilities and leased data center space internationally, and has been verified by outside auditors.

Advertisement

While it touts its own accomplishments, Amazon also notes that the global data center industry uses less water than many may realize, accounting for 0.5% of all industrial water use worldwide.

Other tech companies are likewise implementing water-saving strategies and policies. Earlier this year, Microsoft pledged a 40% improvement in water efficiency by 2030 and committed to replenishing more water than it uses in each district where it operates. It also started installing closed-loop systems where water flows past heat-generating processing chips, drawing off heat that it carries to chillers. Then the cooled water starts the journey all over again.

But public concerns persist, particularly in regions facing water shortages. In 2025, Bloomberg reported that nearly two-thirds of the U.S. data centers that were built or are under development in the past three years are located in water-stressed areas.

Simon Hans Edasi, a Seattle-area data scientist and geospatial researcher, has examined data center locations in Washington state relative to water availability, energy access and other factors. He raised concerns about Amazon’s planned $4.8 billion campus in Burbank, near the Columbia River. The industry overall is moving “deeper into arid eastern Washington,” Edasi said.

Advertisement

Without addressing that specific project, Will Hewes, Amazon’s water stewardship lead, said the company focuses on three things at each location: drawing as little water as possible, using recycled water sourced from treatment plants rather than drinking water supplies, and partnering with local organizations to replenish water back into the area.

“For any of those water-stressed basins where we’re operating, we’re making sure that in each of those we’re also putting more back,” Hewes said.

Replenishment efforts vary by location. They can include programs such as helping farmers use wastewater from data centers for irrigation, or working with building managers to fix water loss from running toilets and leaky faucets.

AWS consumed about 2.5 billion gallons of water for its data centers worldwide last year. Through replenishment efforts, the company reports returning 3 gallons for every 4 that it used.

Advertisement

Source link

Continue Reading

Tech

Mechanical Stability For Your Coils

Published

on

If you work with radio, the chances are that before too long you’ll be winding an inductor. At radio frequencies these won’t be big chunky transformer style chokes, but often air-cored affairs supported by their own rigidity. As grizzled old radio amateurs will tell you though, relying on such a coil for stability is a fool’s errand. It will shift inductance from the slightest movement, thermal expansion, or even sound. Luckily [SolderSmoke] is here to remind us of the trusty fix, in the form of Q-dope, or a polystyrene solution that dries to form a rigid low-dielectric coating.

Where this is being written it wasn’t on the market so it was more usual to use nail lacquer, but reading the piece it seems American hams swore by the stuff. That’s in the past tense because it seems it’s no longer on the market. Even there though help is at hand, because dissolving packaging polystyrene in solvent yields an acceptable substitute. There’s even an 11-year-old how-to video linked from the SolderSmoke post, should you fancy making some of your own. We suggest you proceed with caution though, polymers dissolved in solvents sounds a lot like home-made napalm, and probably puts out fumes you don’t want to breathe.

Meanwhile should you fancy experiments of your own with inductors, we’ve got you covered.

Advertisement

Source link

Continue Reading

Tech

Evidence For Water Vapor Plumes On Europa Vanishes In Re-Analysis

Published

on

Unlike on Mars where for decades we have had dozens of orbital and ground-based platforms zipping and scurrying about to prod at every bit of emitted radiation, rock type and twitch of dust devils in its thin atmosphere, for other planets and their moons we have to do a lot more speculative interpretation of data. Such was the case with the presumed existence of water plumes on Jupiter’s moon Europa. These now appear to have been a statistical fluke, per research by [L. Roth] et al. in Astronomy & Astrophysics.

As succinctly summarized in the article on this by [Javier Barbuzano] of Sky and Telescope, the original 2013 finding of said water plumes by the same team was based on faint UV emissions from Europa’s southern hemisphere as captured by the Hubble Space Telescope. However, in more recent captures these emissions were not detected again, leading them to reexamine their original analysis of the 2013 data.

One of the main flaws was in the assumption of where Europe was located on Hubble’s 1,000 x 1,000 resolution detector, with the re-analysis showing that they were off by a couple of pixels. A second flaw was quite understandable as since 2013 we have learned that Europa has a thin hydrogen exosphere which interacts with the Sun’s UV radiation. The resulting scattering induces a UV glow which could be mistaken for UV radiation emanating from the moon’s surface.

Advertisement

Even with this one intriguing feature turning out to be a mirage, it doesn’t make Europa any less interesting as it’s still assumed to have vast liquid water oceans. Along with Uranus’ moon Miranda this makes it very worth it to experience more of the sights and sounds of these alien worlds, whether in person or via our robotic friends.

Source link

Advertisement
Continue Reading

Tech

Former AWS CEO Adam Selipsky to lead new $10B AI data center venture

Published

on

Adam Selipsky is now CEO of Helix Digital Infrastructure. (GeekWire photo)

Former Amazon Web Services CEO Adam Selipsky is returning to the world of cloud infrastructure as co-founder and CEO of Helix Digital Infrastructure, a newly-launched company backed by more than $10 billion.

The company was unveiled Thursday by investment firm KKR, which is partnering with Nvidia, power producer Vistra, and the Kuwait Investment Authority to build infrastructure aimed at supporting the growing demand for artificial intelligence computing. Bloomberg first reported the news in April.

Selipsky brings deep experience in cloud computing and enterprise technology. He joined AWS in 2005, helped build the business during its early years, served as CEO of Seattle-based Tableau Software from 2016 to 2021, and then returned to lead AWS before stepping down in 2024. He joined KKR as a senior technology and AI strategy advisor last September.

Helix plans to develop and operate data centers and related infrastructure, including power generation, fiber connectivity and land development. The company is targeting large technology customers that are racing to expand AI capacity amid increasing constraints around electricity availability, grid access and data center construction.

“Large users of digital infrastructure have an urgent need to reduce complexity and unlock new capacity,” Selipsky said in a statement. “Helix combines significant long-term capital with the capabilities and expertise to deliver holistic AI infrastructure solutions with speed and scale.”

Advertisement

Appearing on CNBC on Thursday morning, Selipsky said the large hyperscalers that will become Helix’s customers need help and that it’s a “misnomer” that complex data center projects are easy to scale.

“It is hard, and it is becoming even harder with AI and the pace of the buildouts and the scale of the buildouts,” Selipsky said. “They absolutely have the capabilities to do a lot of these things in house, and they absolutely need reliable partners.”

He said that more than 25% of announced data center projects are not delivering, and that’s why he’s in the middle of bringing on seasoned operators who can deliver for customers.

KKR and the Kuwait Investment Authority are providing the initial capital backing for the venture. Nvidia is joining as a founding investor and strategic partner, while Vistra will serve as Helix’s preferred power provider.

Advertisement

Waldemar Szlezak, KKR’s global head of digital infrastructure, will serve as chief investment officer. The company said it plans to bring in additional institutional investors over time.

The launch comes as demand for AI computing continues to drive massive investment in data centers, power generation, and other infrastructure needed to support increasingly sophisticated AI systems. It also comes at a time of growing public concern over data center construction in many communities, including the City of Seattle which just instituted a one year emergency ban on major data centers.

We’ve reached out to Helix for additional comment, and we will update this post as we learn more.

Source link

Advertisement
Continue Reading

Tech

Anthropic is spending $150M to embed 1,000 AI fellows inside nonprofits. No degree required.

Published

on

TL;DR

Anthropic launched Claude Corps: $150M to place 1,000 AI fellows at 400+ nonprofits. $85K salary, no degree needed. First 100 start October. Apps close July 17.

Anthropic is donating $150 million to place 1,000 AI fellows inside nonprofit organisations across the United States. The programme, called Claude Corps, will pay early-career workers $85,000 plus benefits for a year-long placement where they help nonprofits use Claude more effectively. Applications opened Wednesday and close on July 17.

No college degree is required. Applicants must be 18 or older, hold US work authorisation, and have no more than two years of full-time work experience. The first cohort of 100 fellows starts in October 2026. Subsequent cohorts begin in January and August 2027.

Each of the 400+ host organisations will receive a $10,000 grant and free Claude credits. Anthropic partnered with CodePath, a San Francisco nonprofit that helps first-generation and low-income students enter the tech workforce, to manage recruitment and training.

Advertisement

We hope this program will expand and become a pillar of our strategy to help humankind realize the benefits of AI while also managing its risks,” said Anthropic President Daniela Amodei.

The programme is modelled loosely on service corps like AmeriCorps and Teach For America, but with a corporate sponsor and a product at its centre. Fellows are trained specifically on Claude. The organisations they serve will build their workflows around Claude. When the fellowship ends, the nonprofits are left with AI infrastructure tied to Anthropic’s ecosystem.

That dual purpose has drawn criticism. Fortune noted the “fox guarding the henhouse” dynamic: a $965 billion AI company is training the nonprofit sector to depend on its own product, funded by a donation that represents less than 0.02% of its valuation. Anthropic frames it as philanthropy. Sceptics see distribution strategy wrapped in a public benefit narrative.

Regardless of the framing, the programme addresses a real gap. Most nonprofits lack the staff, budget, and technical knowledge to adopt AI tools, even when those tools could meaningfully improve operations. Anthropic’s $100M Claude Partner Network, launched earlier, targets enterprises. Claude Corps targets the organisations that cannot afford enterprise partnerships.

Advertisement

The timing is deliberate. Anthropic is preparing for an IPO and positioning itself as the responsible AI company in a field dominated by OpenAI’s commercial aggression and Google’s scale. A $150 million nonprofit fellowship is a narrative play as much as a product play. Whether 1,000 fellows can make a meaningful difference across 400 organisations depends on whether the programme outlasts its PR value. Anthropic’s policy framework, published this week, calls for AI’s benefits to be “broadly shared.” Claude Corps is its first concrete attempt to deliver on that promise.

Source link

Advertisement
Continue Reading

Tech

SpaceX’s IPO Live: What Elon Musk’s Public Offering Means for Tech, the Market and You

Published

on

close up of SpaceX HQ building

IPOs can be volatile, especially for retail investors. SpaceX is no exception. 

Sundry Photography/Adobe Stock

I just did a quick Google search for SpaceX IPO. How many hundreds of articles are we actually expected to read about this? 

Given the buzz around Friday’s big IPO, there are a few misconceptions worth addressing upfront. While many people view SpaceX as a massive, dominant space enterprise, it’s more complicated than that. 

Advertisement

“In reality, it’s a very successful but fairly small satellite launch company, bolted onto a stagnant money-losing social media company and a money-incinerating AI company, and then sprinkled with a lot of hype about humankind going interplanetary,” said Robin Wigglesworth, editor of the Financial Times’ finance blog, Alphaville. 

In other words, perhaps it’s more akin to a vertically integrated space and communications company with ambitious, high-risk side bets. Sure, at its center, SpaceX is a launch company that designs rockets (like Falcon 9 and Starship) and sells access to space. But around that, it has those related businesses — most notably Starlink, its satellite internet network, and xAI, which SpaceX acquired in February 2026. And since xAI includes the social media platform X and X’s chatbot, Grok, they’re also under the SpaceX umbrella. 

X hasn’t been durable in terms of revenue. And, like most cash-burning AI enterprises, xAI is expensive to run and is reporting very large losses. 

One could say the SpaceX ecosystem revolves around a single goal: building the infrastructure needed for global connectivity and, eventually, space settlement. But a major concern is that SpaceX’s overall package is driven more by hype and momentum than by its proven profitability. 

Advertisement

Wigglesworth said the biggest immediate risk is straightforward: The stock could drop soon after it begins trading. That outcome would affect both the company and investors, though it wouldn’t necessarily signal broader economic trouble. As he noted, IPOs “do badly all the time.” 

In the first few weeks after the IPO, price movements may be misleading. The opening day can be volatile, with banks helping stabilize prices and strong retail demand potentially pushing shares higher. We’ll also see index funds start to buy in, which can help nudge the price up a bit. 

However, as Wigglesworth pointed out, the more meaningful test will come after a month, when the market determines whether there is sustained demand “for a company trading at some of the juiciest valuation multiples we’ve seen in history.”

So here’s another misconception to address: If SpaceX is popular, it’s safe to buy, right? 

Advertisement

I didn’t have to read too many articles to get an answer to that. 

“Popularity and renown are bad indicators for what makes a successful investment,” Wigglesworth told me. “Even good companies can be bad investments at a dumb price.” 

Source link

Advertisement
Continue Reading

Tech

Nearly every security chief fears AI-generated code as development teams race ahead of outdated oversight systems

Published

on


  • AI-generated code is growing faster than security oversight mechanisms
  • Manual reviews struggle to keep pace with machine-generated software
  • Security leaders fear insecure coding patterns spreading through development pipelines

Artificial intelligence coding assistants have spread across development teams faster than security frameworks can adapt to.

New Salt Security research has claimed 90% of security leaders now report active concerns about risks posed by AI-generated software.

Source link

Advertisement
Continue Reading

Tech

Repairing A Pair Of Voodoo 2 GPUs For Some SLI Action

Published

on

Well there's your problem. (Credit: Bits und Bolts, YouTube)
Well there’s your problem. (Credit: Bits und Bolts, YouTube)

Recently [Bits und Bolts] stumbled over a pair of Dragon 3000 branded 3dfx Voodoo 2 cards in his unfixed cards pile, and decided that the best course of action was to not only fix them, but also run them in SLI for some sweet Unreal Tournament action. Naturally, these cards being in the broken cards pile meant that he first had to figure out why they were broken and fix all issues.

The advantage of having two identical Voodoo 2 cards is of course that any missing components, like some resistors on one card, could be referenced on the other card. Beyond that it was mostly a matter of reflowing clearly corroded pins on the ICs and replacing damaged resistors and resistor arrays before the first tests could be run.

Using the mojo utility it was easy enough to spot that there were still some lingering issues, with clear issues visible in 3D games as well. These were tracked down to a dodgy pin on one of the texture mapping units (TMUs) that needed some more reflowing, and a very sneaky resistor array that was cracked but not obviously so until prodded with a multimeter.

With both cards now making happy noises when individually tested, it was time to go full SLI, fire up the Pentium 2 system and enjoy the glory of 24 MB of VRAM at high resolutions in Unreal Tournament. Considering that the bloke who had sent in these cards had found them while cleaning up a shed, it’s quite amazing how little rework was needed to once again party like it’s 1999.

Advertisement

Source link

Advertisement
Continue Reading

Tech

Google’s DiffusionGemma generates 256 tokens in parallel and self-corrects as it goes

Published

on

GenAI image generators like Stable Diffusion do not draw a picture pixel by pixel from left to right. They start with noise and iteratively refine the entire image in parallel until it converges, in a process known as diffusion. For years, applying that same principle to text generation had remained out of reach at scale.

Standard language models work like a typewriter: one token at a time, left to right, with no ability to revise a committed output. That pattern works in the cloud, where batch sizes keep GPUs saturated. For local inference or low-concurrency deployments, the GPU is idle most of the time.

Google’s DiffusionGemma, released this week, is an open source experimental model that applies diffusion to text generation at production scale. Built on the Gemma 4 backbone and released under the Apache 2.0 license, it is the first diffusion language model natively supported in the open source vLLM inference platform. It generates a 256-token block in parallel rather than sequentially, with every token position attending to every other. Google says DiffusionGemma generates text up to 4x faster than standard models on GPUs. At batch size 1 on a single Nvidia H100, the FP8 version reaches 1,008 tokens per second. On H200, it hits 1,288 — roughly six times a standard autoregressive baseline, according to vLLM benchmark results published today.

Despite the speed gains, Google did not oversell the release. The company’s launch post acknowledged directly that DiffusionGemma’s overall output quality is lower than standard Gemma 4, adding “For applications that demand maximum quality, we recommend deploying standard Gemma 4.”

Advertisement

What DiffusionGemma does

DiffusionGemma does not generate tokens in order. It starts with a block of 256 random placeholder tokens, effectively a blank canvas, and runs multiple refinement passes over the entire block at once. On each pass, it evaluates every position and locks in the ones it is most confident about. Uncertain positions get randomized and reconsidered on the next pass, with the model using what it resolved in the previous round to inform the next attempt. The block converges progressively until enough positions stabilize to anchor the rest.

Two things follow from that architecture.

  • Self-correction. An autoregressive model that commits to a wrong token is stuck with it, because subsequent tokens are already conditioned on the mistake. DiffusionGemma can identify low-confidence positions and re-evaluate them on the next pass.

  • Bidirectional context. Every position attends to every other position in the block simultaneously, including tokens that appear later in the sequence. That makes the model structurally better suited to constrained generation tasks where left-to-right generation fails.

Google demonstrated both properties with a fine-tuned Sudoku solver. The base model solved zero puzzles. After fine-tuning on a Sudoku dataset, it reached an 80% success rate and converged in 12 denoising steps rather than 48. The efficiency gain came directly from the model’s ability to self-correct and stop early.

How it was built

DiffusionGemma runs as a 26B Mixture of Experts model that activates only 3.8B parameters during inference. Quantized, it fits within 18GB VRAM on consumer hardware including the Nvidia RTX 4090 and 5090. Google and NVIDIA also optimized for enterprise Hopper and Blackwell servers using NVFP4 kernels.

Advertisement

The vLLM integration required new work because DiffusionGemma does not fit the standard serving model. A typical vLLM batch applies the same attention type to every request. DiffusionGemma requests alternate between causal and bidirectional attention as they cycle through prompt reading, canvas refinement and block commit. The team built per-request attention switching into both the Triton and FlashAttention 4 backends and reused the existing speculative decoding path for the refinement loop.

The new ModelState interface the team built for this integration is designed to support additional diffusion models in vLLM as they emerge.

Where the speed wins and where it does not

DiffusionGemma’s speed advantage is real but conditional. Where it applies depends entirely on deployment context.

The numbers. At batch size 1 on a single H100, vLLM’s published benchmarks put the FP8 model at roughly five times a standard autoregressive baseline. On H200, roughly six times. Those peak figures reflect optimal conditions: single user, dedicated hardware, FP8 quantization.

Advertisement
DiffusionGemma vLLM chart

Where it wins. Local inference, single-user applications and low-concurrency serving. In those conditions the GPU has spare compute and memory bandwidth is the bottleneck. DiffusionGemma’s parallel block generation fills that gap.

Where it does not. High-throughput cloud serving. When a server is batching hundreds of concurrent requests, autoregressive models already saturate available compute and DiffusionGemma’s parallel decoding provides diminishing returns.

The quality ceiling. Guilherme O’Tina, an AI researcher, put a finer point on it on X. “Local artifacts vs hallucinations are different problems and that decides where this actually wins,” O’Tina wrote.

How it compares

Diffusion language models are not new. Researchers have built them at smaller scales for several years, and Inception Labs’ Mercury Coder applied the approach commercially to coding tasks in 2025. What DiffusionGemma adds is scale — a 26B MoE backbone, native vLLM serving and a general-purpose instruction-tuned model rather than a domain-specific one.

The more useful comparison for engineers evaluating this against existing inference tooling is speculative decoding, and the distinction matters. Speculative decoding keeps a standard autoregressive target model and uses a smaller draft model to guess several tokens ahead. The target model verifies them in one pass. If sampling is correct, the output distribution stays identical to the target. The architecture is unchanged.

Advertisement

Andrew Kuncevich, an ML and AI researcher focused on production AI systems, put it directly on X. “DiffusionGemma is different. It does not just guess future tokens. It creates a noisy 256-token canvas and repeatedly denoises the whole block in parallel. So it’s not just a decoding trick — it’s a different generation paradigm,” Kuncevich wrote.

Compared to standard Gemma 4, the trade is speed for quality. Google’s benchmark data shows DiffusionGemma below standard Gemma 4 on general output quality metrics, with the gap varying by task.

DiffusionGemma intelligence vs latency

On structured constrained tasks, including code infilling, template generation and problems requiring bidirectional constraint propagation, the architecture has a structural advantage that fine-tuning can surface, as the Sudoku result demonstrates. On open-ended generation, standard Gemma 4 remains the stronger option.

What this means for enterprises

DiffusionGemma serves via a standard vLLM OpenAI-compatible endpoint with no diffusion-specific pipeline changes required.

This is not a general-purpose model upgrade.

Advertisement

For teams running local or low-concurrency inference, the architecture choice just expanded. Until now, cutting generation latency on dedicated GPU hardware meant using a smaller model and accepting the quality trade-off. DiffusionGemma offers a third path at the same parameter footprint, on consumer hardware, with same-day vLLM support.

For constrained generation workloads, bidirectional attention is worth evaluating. Code infilling, structured data generation and tasks where correct output depends on context not yet generated are where this architecture has a structural edge.

The ModelState interface built for this integration is designed to generalize as additional diffusion models emerge.

The quality trade-off is real and Google acknowledges it. For teams running local inference on dedicated GPU hardware, this is worth testing.

Advertisement

Source link

Continue Reading

Tech

Xiaomi’s new open source, agentic AI coding harness MiMo Code beats Claude Code at ultra-long, 200+ step tasks

Published

on

Xiaomi’s MiMo AI team has open-sourced MiMo Code V0.1.0, a terminal-native AI coding assistant that the Chinese electronics giant says outperforms Anthropic’s Claude Code on key agentic coding benchmarks, especially on long-horizon, multi-step tasks (200+ steps) — at least, according to its own internal beta release and survey of 576 developers.

It’s also bundling limited-time free access to MiMo-V2.5, its multimodal flagship model with a million-token context window, requiring no registration to get started.

The release was announced June 10, 2026 in a post on the social network X from the official @XiaomiMiMo account, which described the tool as “more than an AI coding assistant in your terminal — it’s the smartest coding partner you’ll ever work with.”

MiMo Code is available now on GitHub under an MIT license, and installs with a single terminal command (curl -fsSL https://mimo.xiaomi.com/install | bash) on macOS and Linux or via npm (npm install -g @mimo-ai/cli) on Windows.

Advertisement

The project is a fork of the open-source OpenCode agent, which Xiaomi has extended with its own memory architecture, workflow modes, and model harness.

The end of AI coding agents’ amnesia?

As any avid vibe coder would surely attest, AI coding agents degrade over long working sessions: as the context window fills, earlier decisions, conventions, and task state get compacted away or lost entirely, forcing developers to re-explain their projects.

Xiaomi argues this approach is doomed at scale. “What we need is not better compression, but an explicit storage-and-retrieval mechanism that decides what information should be written into persistent structures, and when it should be recalled,” the MiMo team noted in their launch blog.

MiMo Code attacks this with a cross-session memory system, powered under the hood by SQLite FTS5 full-text search, that spans four layers: project memory (a persistent MEMORY.md file), session checkpoints, scratch notes, and per-task progress logs.

Advertisement

The note-taking is key, here: Rather than forcing the primary coding agent to pause its work to take notes, the system deploys an independent “checkpoint-writer” subagent.

Think of it the primary coding agent as a construction contractor working to build a massive mansion alongside a dedicated architect, the checkpoint-writer subagent. While the main agent focuses on building out the physical structure, the subagent updates the blueprints in real time, noting decisions, issues, and the actual lay of the land as the construction project progresses.

When the context window approaches its limits — the contractor gets lost in the half-built mansion — it can consult the subagent and find its place again. In the case of MiMo Code, the system simply rebuilds the environment from structured checkpoints with the relevant context, ensuring no loss of operational momentum.

Two self-improvement mechanisms round out the system: a /dream command that periodically (roughly every seven days) reviews historical sessions, deduplicates them, and compresses them into long-term memory, and a “distill” function that mines past sessions for repeated workflows that can be automated, following a similar approach taken recently by OpenAI and Anthropic with their various models.

Advertisement

Impressive performance on software engineering (SWE) benchmarks

According to benchmark figures published in Xiaomi’s technical blog post, MiMo Code paired with MiMo-V2.5-Pro outperformed Claude Code paired with Claude Sonnet 4.6 on all three evaluations tested:

MiMo Code vs. Claude Code benchmark performance

MiMo Code vs. Claude Code benchmark performance. Credit: Xiaomi

  • SWE-bench Verified: 82% vs. 79%

  • SWE-bench Pro: 62% vs. 55%

  • Terminal Bench 2: 73% vs. 69%

The harness itself accounts for a measurable share of the gain. Running the same MiMo-V2.5-Pro model in both harnesses, MiMo Code scored 62% on SWE-bench Pro versus 57% for Claude Code, and 73% on Terminal Bench 2 versus 68% — roughly five points each, attributable purely to the agent system rather than the model.

Xiaomi notably did not publish comparisons against OpenAI’s Codex or Google’s Gemini CLI — Claude Code is the sole named competitor throughout its materials, a telling choice of benchmark target.

Advertisement

Independent reference points suggest why. On the official Terminal-Bench 2.0 leaderboard maintained at tbench.ai, OpenAI’s Codex CLI running GPT-5.5 scores 82.2% — roughly nine points above MiMo Code’s self-reported 73% — and OpenAI’s own GPT-5.5 announcement claims 82.7% on the same benchmark.

On SWE-Bench Pro, however, the picture flips: OpenAI reports GPT-5.5 at 58.6%, below MiMo Code + MiMo-V2.5-Pro’s claimed 62%. (MiMo Code does not yet appear on either official leaderboard, and cross-comparing self-run numbers against leaderboard submissions carries the usual configuration caveats.)

Perhaps more interesting than the offline benchmarks: Xiaomi says it ran a human double-blind A/B evaluation during its internal beta, covering 576 developers working in 474 real private repositories, producing 1,213 judged head-to-head pairs against Claude Code using the same target model.

Under 200 execution steps, the two systems split roughly 50/50 — but past 200 steps, MiMo Code’s win rate rose above 65%, supporting the company’s thesis that its memory and state-management architecture pays off specifically on long-horizon work.

Advertisement

Xiaomi itself concedes the standard benchmarks “still measure one-shot problem-solving ability” and don’t capture the tool’s multi-session design goals.

As always, these are vendor self-reported numbers that haven’t been independently verified, and head-to-head harness comparisons are sensitive to configuration. But the claims are consistent with a broader industry pattern: scaffolding and harness engineering are becoming as important as raw model capability in agentic coding performance.

Easy integration with existing developer systems and voice control

From a user experience standpoint, MiMo Code is designed to live where developers already work. It operates directly in the terminal, reading and writing files, running commands, and managing Git.

Out of the box, the tool requires zero configuration, connecting automatically to “MiMo Auto”—a free-for-a-limited-time channel powered by Xiaomi’s multimodal MiMo V2.5 model, which boasts a massive million-token context window. For developers migrating from existing environments, the transition is frictionless: MiMo Code automatically imports MCP servers, custom skills, and API configurations from Claude Code.

Advertisement

Other noteworthy features include:

  • Compose mode: Pressing Tab switches the agent into a specification-driven workflow in which the developer describes a high-level goal and the system autonomously executes the full development cycle — design, planning, coding, testing, and review — following what Xiaomi describes as a “heavy planning upfront, stable verification later” strategy.

  • Voice control: Built on Xiaomi’s MiMo-ASR speech recognition with TenVAD voice activity detection, developers can dictate and modify instructions verbally and speak commands like “send” and “execute” for fully hands-free operation (available for logged-in users).

According to Xiaomi, the gains from the agent harness itself are measurable. Running the same underlying MiMo model in both harnesses, the company says MiMo Code scored 62% on SWE-Bench Pro versus 57% for Claude Code, and 73% on Terminal Bench 2 versus Claude Code’s 68% — roughly five percentage points better on each, attributable purely to the agent system rather than the model.

As always, these are vendor self-reported numbers that haven’t been independently verified, and head-to-head harness comparisons are sensitive to configuration. But the claim is consistent with a broader industry pattern: scaffolding and harness engineering are becoming as important as raw model capability in agentic coding performance.

Aggressively affordable

The bigger lure for many developers may be what’s bundled in.

Advertisement

MiMo Code ships with “MiMo Auto,” a zero-configuration channel offering free, limited-time access to MiMo-V2.5 — the natively multimodal model Xiaomi released in late April 2026, a sparse mixture-of-experts design with 310 billion total parameters (just 15 billion active per inference) and a 1 million token context window, which the company positions as matching Anthropic’s Claude Sonnet 4.6 in multimodal agentic work.

As VentureBeat reported when the MiMo-V2.5 family launched in April, the models are MIT-licensed and among the most efficient and affordable available for agentic tasks.

The larger MiMo-V2.5-Pro — a 1.02-trillion-parameter mixture-of-experts model with 42 billion active parameters and a hybrid-attention architecture — led the open-source field on Xiaomi’s ClawEval agentic benchmark with a 63.8% success rate while consuming only about 70,000 tokens per trajectory, roughly 40–60% fewer than Anthropic’s Claude Opus 4.6, Google’s Gemini 3.1 Pro, or OpenAI’s GPT-5.4 needed for comparable results.

Notably, the V2.5-Pro’s post-training was explicitly designed to instill “harness awareness” — training the model to manage its own memory and context within agent scaffolds like Claude Code or OpenCode — making a Xiaomi-built harness optimized around that capability a logical next step.

Advertisement

Pricing is similarly aggressive: MiMo-V2.5 starts at $0.40 per million input tokens and $2.00 per million output tokens, while V2.5-Pro runs $1.00/$3.00 per million (input/output) up to 256K context, doubling beyond that, with cache hits dropping input costs to as little as $0.20–$0.40 per million, making it among the cheapest frontier models available globally.

Model

Input

Output

Advertisement

Total Cost

Source

MiMo-V2.5 Flash

$0.10

Advertisement

$0.30

$0.40

Xiaomi MiMo

deepseek-v4-flash

Advertisement

$0.14

$0.28

$0.42

DeepSeek

Advertisement

deepseek-v4-pro

$0.435

$0.87

$1.305

Advertisement

DeepSeek

MiniMax-M3

$0.30

$1.20

Advertisement

$1.50

MiniMax

Gemini 3.1 Flash-Lite

$0.25

Advertisement

$1.50

$1.75

Google

Qwen3.7-Plus

Advertisement

$0.40

$1.60

$2.00

Alibaba Cloud

Advertisement

MiMo-V2.5

$0.40

$2.00

$2.40

Advertisement

Xiaomi MiMo

Grok 4.3 (low context)

$1.25

$2.50

Advertisement

$3.75

xAI

MiMo-V2.5 Pro (≤256K)

$1.00

Advertisement

$3.00

$4.00

Xiaomi MiMo

GLM-5

Advertisement

$1.00

$3.20

$4.20

Z.ai

Advertisement

Kimi-K2.6

$0.95

$4.00

$4.95

Advertisement

Moonshot/Kimi

GLM-5.1

$1.40

$4.40

Advertisement

$5.80

Z.ai

Grok 4.3 (high context)

$2.50

Advertisement

$5.00

$7.50

xAI

MiMo-V2.5 Pro (>256K)

Advertisement

$2.00

$6.00

$8.00

Xiaomi MiMo

Advertisement

Qwen3.7-Max

$2.50

$7.50

$10.00

Advertisement

Alibaba Cloud

Gemini 3.5 Flash

$1.50

$9.00

Advertisement

$10.50

Google

Gemini 3.1 Pro Preview (≤200K)

$2.00

Advertisement

$12.00

$14.00

Google

GPT-5.4

Advertisement

$2.50

$15.00

$17.50

OpenAI

Advertisement

Gemini 3.1 Pro Preview (>200K)

$4.00

$18.00

$22.00

Advertisement

Google

Claude Opus 4.8

$5.00

$25.00

Advertisement

$30.00

Anthropic

GPT-5.5

$5.00

Advertisement

$30.00

$35.00

OpenAI

Claude Fable 5 / Claude Mythos 5

Advertisement

$10.00

$50.00

$60.00

Anthropic

Advertisement

For developers who don’t want Xiaomi’s models at all, MiMo Code also supports third-party backends — including token plans from DeepSeek, Moonshot’s Kimi, and Zhipu’s GLM — along with any OpenAI-compatible API, mirroring the bring-your-own-model flexibility of its OpenCode parent.

Terminal AI coding agent wars go global

MiMo Code lands in an increasingly crowded field of terminal-based coding agents: Anthropic’s Claude Code, OpenAI’s Codex CLI, Google’s Gemini CLI, and open-source players like OpenCode and Aider.

What’s new is the entrant. Xiaomi — the world’s third-largest smartphone maker, with a fast-growing EV business — has been methodically building its MiMo AI division since the release of the MiMo-7B reasoning model in April 2025, following with the MiMo-VL vision-language series, MiMo-V2-Flash, the 1-trillion-parameter MiMo-V2-Pro in March 2026, and the V2.5 flagship family in April.

The effort is led by Fuli Luo, a veteran of DeepSeek’s disruptive R1 project, who has characterized Xiaomi’s frontier push as a “quiet ambush” — and backed it with a 100-trillion free token grant for builders announced alongside the V2.5 launch.

Advertisement

The playbook is familiar from DeepSeek, Alibaba’s Qwen, MiniMax, and Moonshot AI’s Kimi series: release genuinely capable models and tooling under permissive licenses at a fraction of U.S. lab pricing, and convert the resulting developer mindshare into a durable ecosystem.

By pairing an open-source agent harness with a free frontier-class model, Xiaomi is effectively eliminating both the licensing and the usage cost of entry — at least for now.

What it means for enterprises and technical decision-makers

For engineering leaders, MiMo Code is a low-risk, potentially high-value evaluation candidate: MIT-style licensing permits modification and commercial integration, the OpenCode lineage means the architecture is inspectable, and the bring-your-own-model support means it can be pointed at an internally approved endpoint rather than Xiaomi’s cloud.

The persistent memory system addresses a real and widely felt pain point in agentic development workflows — one that competitors are also racing to solve.

Advertisement

The countervailing considerations: the “free for a limited time” model access is by definition temporary and routes code context through Xiaomi’s servers, which will be a non-starter for organizations with strict data-residency or IP policies; the benchmark edge over Claude Code is self-reported; and a V0.1.0 release number signals exactly what it suggests about maturity.

Teams subject to U.S. government procurement restrictions on Chinese technology vendors should also weigh that context before adopting.

Source link

Advertisement
Continue Reading

Trending

Copyright © 2025