Connect with us
DAPA Banner
DAPA Coin
DAPA
COIN PAYMENT ASSET
PRIVACY · BLOCKDAG · HOMOMORPHIC ENCRYPTION · RUST
ElGamal Encrypted MINE DAPA
🚫 GENESIS SOLD OUT
DAPAPAY COMING

Tech

What AI benchmarks miss about real-world performance

Published

on

Presented by F5


Enterprise AI teams have spent years solving for compute, securing GPU allocations, negotiating cloud capacity, and benchmarking training throughput. The assumption embedded in that work is that the path between storage and compute will keep up. In production, that assumption increasingly does not hold. Real traffic introduces latency spikes, network jitter, and node degradation that controlled benchmarks fail to capture, resulting in pipelines that perform well in the lab but stall in deployment. A growing response is AI data delivery, deploying an application delivery controller (ADC) or application delivery and security platform (ADSP) in front of storage as a resilient and secure control point.

“Provisioning solves for capacity but not for delivery, and that is where the constraint now hides,” says Hunter Smit, senior manager of product marketing at F5. “Enterprises buy enough GPUs and enough storage, then assume the path between them will keep up, but AI traffic is bursty, highly concurrent, and random in its reads in ways ordinary storage networking was never built to absorb.”

The production gap benchmarks don’t show

Standard benchmark methodology compounds the problem, says Paul Pindell, principal solutions architect for technology alliances at F5.

Advertisement

“Benchmark testing is usually built to produce the best possible performance or security result, not the most realistic one,” he says. “With S3, latency is a known factor in degrading performance, so meaningful testing has to introduce consistent latency into the path.”

Most benchmark environments never do that, which means the performance numbers enterprises rely on for infrastructure decisions are drawn from conditions that production systems will never replicate. To test this assumption, F5 and MinIO conducted throughput testing under degraded network conditions.

“What stood out was how quickly S3 throughput falls off once you introduce latency,” Pindell says. “Even modest latency takes a real bite out of it, and as latency climbs toward long-haul distances, the degradation gets severe.”

The testing also showed latency mattered far more than jitter as a driver of throughput loss, which inverted what the team had expected going in. The upshot for enterprise architects is that S3 object storage deployments cannot be designed around clean-room assumptions; they have to be engineered for the degraded network conditions they will actually face.

Advertisement

The cost of fragile data paths

“In AI infrastructure, people naturally focus on GPUs because they’re the most visible and expensive resource,” says Tanu Mutreja, senior director of product management at F5. “But in production environments, GPUs generate only as much value as the data path that feeds them.”

That path runs through storage, networking, databases, security, and orchestration layers, often stitched together from multiple vendors. Customers experience none of those seams; they experience the output of the whole system.

When the data path degrades, the effects compound. GPU underutilization is the most immediate and visible symptom, but Mutreja pointed to a wider set of consequences: degraded inference performance, poor-quality AI outputs, higher egress costs from unnecessary data replication, and growing operational complexity.

“At scale, data-path efficiency becomes a strategic business lever rather than technical optimization,” she says. “When the data path is engineered well, GPUs remain productive, AI applications stay responsive and trustworthy, operations scale efficiently, and organizations maximize the return on their AI investments.”

Advertisement

AI workloads are structurally more exposed to these failures than traditional enterprise applications. Databases, ERP systems, and web services absorb transient storage delays through caching and buffering. AI workloads running across massively parallel GPU clusters have no equivalent protection. As Mutreja noted, even minor latency spikes or bandwidth bottlenecks can cascade across large GPU clusters, simultaneously hitting utilization, training efficiency, and the customer experience.

Treating the storage edge as a control point

For decades, storage and intelligence operated as sequential concerns in enterprise architecture: data was stored first, then analyzed downstream. Mutreja argued that this model no longer fits the demands of AI.

“Competitive advantage is determined not only by the volume of data, but also by relevance, lineage, security, and performant delivery of data,” she says. “Across the industry, from NVIDIA and AWS to enterprise storage providers, the movement is toward embedding intelligence directly into data infrastructure rather than stacking it on top.”

F5’s integration with MinIO instantiates this approach at the layer where storage and compute actually interact. As part of the F5 ADSP, BIG-IP sits in the data path, continuously monitoring the health of MinIO’s distributed storage nodes and directing requests only to those that remain available.

Advertisement

The operational impact of that capability becomes clear when nodes degrade, which is expected in distributed storage clusters. Without intelligent routing, clients that land on an unhealthy node must retry and may land on another degraded node, dragging down overall performance.

“F5 makes sure traffic only goes to healthy nodes, or even the least busy ones, so S3 client traffic is always processed in the most efficient way,” Pindell says.

Governance across distributed environments

The challenge grows at scale, when AI pipelines stretch across multiple locations, clouds, or edge environments.

“Once an AI pipeline crosses regions and clouds, the question stops being about performance and becomes about control,” Smit says. “You are operating under different rules in every jurisdiction, and digital sovereignty is now a design constraint. Where your data is allowed to live, who is permitted to touch it, and which borders it cannot cross now shapes the architecture before anyone talks about speed.”

Advertisement

That pressure is driving a visible trend of enterprises repatriating AI workloads from public cloud onto infrastructure they own and govern directly. The architecture Smit described resolves this by decoupling applications from any single storage location and placing a unified control point between them that enforces consistent policy across all of them.

“Sovereignty, resilience, and cost stop being trade-offs you manage one region at a time,” he explains. “They become a capability you run as a system.”

Storage-to-compute path as a managed control point

To solve for these issues, enterprise teams need to stop treating the storage-to-compute path as a direct connection and start treating it as a managed control point, Smit says. SecureIQLab’s independent validation of F5 BIG-IP in storage deployments has confirmed the approach delivers resilience without surrendering throughput.

“Insert a full-proxy ADC between the two, and the path becomes observable, programmable, and failure-aware, with health-based routing, quality of service, and security enforced inline,” he explains. “That single move converts data delivery from an assumption into an engineered discipline, which is what keeps GPUs fed when conditions degrade.”

Advertisement

Sponsored articles are content produced by a company that is either paying for the post or has a business relationship with VentureBeat, and they’re always clearly marked. For more information, contact sales@venturebeat.com.

Source link

Continue Reading
Click to comment

You must be logged in to post a comment Login

Leave a Reply

Tech

Nearly every security chief fears AI-generated code as development teams race ahead of outdated oversight systems

Published

on


  • AI-generated code is growing faster than security oversight mechanisms
  • Manual reviews struggle to keep pace with machine-generated software
  • Security leaders fear insecure coding patterns spreading through development pipelines

Artificial intelligence coding assistants have spread across development teams faster than security frameworks can adapt to.

New Salt Security research has claimed 90% of security leaders now report active concerns about risks posed by AI-generated software.

Source link

Advertisement
Continue Reading

Tech

Repairing A Pair Of Voodoo 2 GPUs For Some SLI Action

Published

on

Well there's your problem. (Credit: Bits und Bolts, YouTube)
Well there’s your problem. (Credit: Bits und Bolts, YouTube)

Recently [Bits und Bolts] stumbled over a pair of Dragon 3000 branded 3dfx Voodoo 2 cards in his unfixed cards pile, and decided that the best course of action was to not only fix them, but also run them in SLI for some sweet Unreal Tournament action. Naturally, these cards being in the broken cards pile meant that he first had to figure out why they were broken and fix all issues.

The advantage of having two identical Voodoo 2 cards is of course that any missing components, like some resistors on one card, could be referenced on the other card. Beyond that it was mostly a matter of reflowing clearly corroded pins on the ICs and replacing damaged resistors and resistor arrays before the first tests could be run.

Using the mojo utility it was easy enough to spot that there were still some lingering issues, with clear issues visible in 3D games as well. These were tracked down to a dodgy pin on one of the texture mapping units (TMUs) that needed some more reflowing, and a very sneaky resistor array that was cracked but not obviously so until prodded with a multimeter.

With both cards now making happy noises when individually tested, it was time to go full SLI, fire up the Pentium 2 system and enjoy the glory of 24 MB of VRAM at high resolutions in Unreal Tournament. Considering that the bloke who had sent in these cards had found them while cleaning up a shed, it’s quite amazing how little rework was needed to once again party like it’s 1999.

Advertisement

Source link

Advertisement
Continue Reading

Tech

Google’s DiffusionGemma generates 256 tokens in parallel and self-corrects as it goes

Published

on

GenAI image generators like Stable Diffusion do not draw a picture pixel by pixel from left to right. They start with noise and iteratively refine the entire image in parallel until it converges, in a process known as diffusion. For years, applying that same principle to text generation had remained out of reach at scale.

Standard language models work like a typewriter: one token at a time, left to right, with no ability to revise a committed output. That pattern works in the cloud, where batch sizes keep GPUs saturated. For local inference or low-concurrency deployments, the GPU is idle most of the time.

Google’s DiffusionGemma, released this week, is an open source experimental model that applies diffusion to text generation at production scale. Built on the Gemma 4 backbone and released under the Apache 2.0 license, it is the first diffusion language model natively supported in the open source vLLM inference platform. It generates a 256-token block in parallel rather than sequentially, with every token position attending to every other. Google says DiffusionGemma generates text up to 4x faster than standard models on GPUs. At batch size 1 on a single Nvidia H100, the FP8 version reaches 1,008 tokens per second. On H200, it hits 1,288 — roughly six times a standard autoregressive baseline, according to vLLM benchmark results published today.

Despite the speed gains, Google did not oversell the release. The company’s launch post acknowledged directly that DiffusionGemma’s overall output quality is lower than standard Gemma 4, adding “For applications that demand maximum quality, we recommend deploying standard Gemma 4.”

Advertisement

What DiffusionGemma does

DiffusionGemma does not generate tokens in order. It starts with a block of 256 random placeholder tokens, effectively a blank canvas, and runs multiple refinement passes over the entire block at once. On each pass, it evaluates every position and locks in the ones it is most confident about. Uncertain positions get randomized and reconsidered on the next pass, with the model using what it resolved in the previous round to inform the next attempt. The block converges progressively until enough positions stabilize to anchor the rest.

Two things follow from that architecture.

  • Self-correction. An autoregressive model that commits to a wrong token is stuck with it, because subsequent tokens are already conditioned on the mistake. DiffusionGemma can identify low-confidence positions and re-evaluate them on the next pass.

  • Bidirectional context. Every position attends to every other position in the block simultaneously, including tokens that appear later in the sequence. That makes the model structurally better suited to constrained generation tasks where left-to-right generation fails.

Google demonstrated both properties with a fine-tuned Sudoku solver. The base model solved zero puzzles. After fine-tuning on a Sudoku dataset, it reached an 80% success rate and converged in 12 denoising steps rather than 48. The efficiency gain came directly from the model’s ability to self-correct and stop early.

How it was built

DiffusionGemma runs as a 26B Mixture of Experts model that activates only 3.8B parameters during inference. Quantized, it fits within 18GB VRAM on consumer hardware including the Nvidia RTX 4090 and 5090. Google and NVIDIA also optimized for enterprise Hopper and Blackwell servers using NVFP4 kernels.

Advertisement

The vLLM integration required new work because DiffusionGemma does not fit the standard serving model. A typical vLLM batch applies the same attention type to every request. DiffusionGemma requests alternate between causal and bidirectional attention as they cycle through prompt reading, canvas refinement and block commit. The team built per-request attention switching into both the Triton and FlashAttention 4 backends and reused the existing speculative decoding path for the refinement loop.

The new ModelState interface the team built for this integration is designed to support additional diffusion models in vLLM as they emerge.

Where the speed wins and where it does not

DiffusionGemma’s speed advantage is real but conditional. Where it applies depends entirely on deployment context.

The numbers. At batch size 1 on a single H100, vLLM’s published benchmarks put the FP8 model at roughly five times a standard autoregressive baseline. On H200, roughly six times. Those peak figures reflect optimal conditions: single user, dedicated hardware, FP8 quantization.

Advertisement
DiffusionGemma vLLM chart

Where it wins. Local inference, single-user applications and low-concurrency serving. In those conditions the GPU has spare compute and memory bandwidth is the bottleneck. DiffusionGemma’s parallel block generation fills that gap.

Where it does not. High-throughput cloud serving. When a server is batching hundreds of concurrent requests, autoregressive models already saturate available compute and DiffusionGemma’s parallel decoding provides diminishing returns.

The quality ceiling. Guilherme O’Tina, an AI researcher, put a finer point on it on X. “Local artifacts vs hallucinations are different problems and that decides where this actually wins,” O’Tina wrote.

How it compares

Diffusion language models are not new. Researchers have built them at smaller scales for several years, and Inception Labs’ Mercury Coder applied the approach commercially to coding tasks in 2025. What DiffusionGemma adds is scale — a 26B MoE backbone, native vLLM serving and a general-purpose instruction-tuned model rather than a domain-specific one.

The more useful comparison for engineers evaluating this against existing inference tooling is speculative decoding, and the distinction matters. Speculative decoding keeps a standard autoregressive target model and uses a smaller draft model to guess several tokens ahead. The target model verifies them in one pass. If sampling is correct, the output distribution stays identical to the target. The architecture is unchanged.

Advertisement

Andrew Kuncevich, an ML and AI researcher focused on production AI systems, put it directly on X. “DiffusionGemma is different. It does not just guess future tokens. It creates a noisy 256-token canvas and repeatedly denoises the whole block in parallel. So it’s not just a decoding trick — it’s a different generation paradigm,” Kuncevich wrote.

Compared to standard Gemma 4, the trade is speed for quality. Google’s benchmark data shows DiffusionGemma below standard Gemma 4 on general output quality metrics, with the gap varying by task.

DiffusionGemma intelligence vs latency

On structured constrained tasks, including code infilling, template generation and problems requiring bidirectional constraint propagation, the architecture has a structural advantage that fine-tuning can surface, as the Sudoku result demonstrates. On open-ended generation, standard Gemma 4 remains the stronger option.

What this means for enterprises

DiffusionGemma serves via a standard vLLM OpenAI-compatible endpoint with no diffusion-specific pipeline changes required.

This is not a general-purpose model upgrade.

Advertisement

For teams running local or low-concurrency inference, the architecture choice just expanded. Until now, cutting generation latency on dedicated GPU hardware meant using a smaller model and accepting the quality trade-off. DiffusionGemma offers a third path at the same parameter footprint, on consumer hardware, with same-day vLLM support.

For constrained generation workloads, bidirectional attention is worth evaluating. Code infilling, structured data generation and tasks where correct output depends on context not yet generated are where this architecture has a structural edge.

The ModelState interface built for this integration is designed to generalize as additional diffusion models emerge.

The quality trade-off is real and Google acknowledges it. For teams running local inference on dedicated GPU hardware, this is worth testing.

Advertisement

Source link

Continue Reading

Tech

Xiaomi’s new open source, agentic AI coding harness MiMo Code beats Claude Code at ultra-long, 200+ step tasks

Published

on

Xiaomi’s MiMo AI team has open-sourced MiMo Code V0.1.0, a terminal-native AI coding assistant that the Chinese electronics giant says outperforms Anthropic’s Claude Code on key agentic coding benchmarks, especially on long-horizon, multi-step tasks (200+ steps) — at least, according to its own internal beta release and survey of 576 developers.

It’s also bundling limited-time free access to MiMo-V2.5, its multimodal flagship model with a million-token context window, requiring no registration to get started.

The release was announced June 10, 2026 in a post on the social network X from the official @XiaomiMiMo account, which described the tool as “more than an AI coding assistant in your terminal — it’s the smartest coding partner you’ll ever work with.”

MiMo Code is available now on GitHub under an MIT license, and installs with a single terminal command (curl -fsSL https://mimo.xiaomi.com/install | bash) on macOS and Linux or via npm (npm install -g @mimo-ai/cli) on Windows.

Advertisement

The project is a fork of the open-source OpenCode agent, which Xiaomi has extended with its own memory architecture, workflow modes, and model harness.

The end of AI coding agents’ amnesia?

As any avid vibe coder would surely attest, AI coding agents degrade over long working sessions: as the context window fills, earlier decisions, conventions, and task state get compacted away or lost entirely, forcing developers to re-explain their projects.

Xiaomi argues this approach is doomed at scale. “What we need is not better compression, but an explicit storage-and-retrieval mechanism that decides what information should be written into persistent structures, and when it should be recalled,” the MiMo team noted in their launch blog.

MiMo Code attacks this with a cross-session memory system, powered under the hood by SQLite FTS5 full-text search, that spans four layers: project memory (a persistent MEMORY.md file), session checkpoints, scratch notes, and per-task progress logs.

Advertisement

The note-taking is key, here: Rather than forcing the primary coding agent to pause its work to take notes, the system deploys an independent “checkpoint-writer” subagent.

Think of it the primary coding agent as a construction contractor working to build a massive mansion alongside a dedicated architect, the checkpoint-writer subagent. While the main agent focuses on building out the physical structure, the subagent updates the blueprints in real time, noting decisions, issues, and the actual lay of the land as the construction project progresses.

When the context window approaches its limits — the contractor gets lost in the half-built mansion — it can consult the subagent and find its place again. In the case of MiMo Code, the system simply rebuilds the environment from structured checkpoints with the relevant context, ensuring no loss of operational momentum.

Two self-improvement mechanisms round out the system: a /dream command that periodically (roughly every seven days) reviews historical sessions, deduplicates them, and compresses them into long-term memory, and a “distill” function that mines past sessions for repeated workflows that can be automated, following a similar approach taken recently by OpenAI and Anthropic with their various models.

Advertisement

Impressive performance on software engineering (SWE) benchmarks

According to benchmark figures published in Xiaomi’s technical blog post, MiMo Code paired with MiMo-V2.5-Pro outperformed Claude Code paired with Claude Sonnet 4.6 on all three evaluations tested:

MiMo Code vs. Claude Code benchmark performance

MiMo Code vs. Claude Code benchmark performance. Credit: Xiaomi

  • SWE-bench Verified: 82% vs. 79%

  • SWE-bench Pro: 62% vs. 55%

  • Terminal Bench 2: 73% vs. 69%

The harness itself accounts for a measurable share of the gain. Running the same MiMo-V2.5-Pro model in both harnesses, MiMo Code scored 62% on SWE-bench Pro versus 57% for Claude Code, and 73% on Terminal Bench 2 versus 68% — roughly five points each, attributable purely to the agent system rather than the model.

Xiaomi notably did not publish comparisons against OpenAI’s Codex or Google’s Gemini CLI — Claude Code is the sole named competitor throughout its materials, a telling choice of benchmark target.

Advertisement

Independent reference points suggest why. On the official Terminal-Bench 2.0 leaderboard maintained at tbench.ai, OpenAI’s Codex CLI running GPT-5.5 scores 82.2% — roughly nine points above MiMo Code’s self-reported 73% — and OpenAI’s own GPT-5.5 announcement claims 82.7% on the same benchmark.

On SWE-Bench Pro, however, the picture flips: OpenAI reports GPT-5.5 at 58.6%, below MiMo Code + MiMo-V2.5-Pro’s claimed 62%. (MiMo Code does not yet appear on either official leaderboard, and cross-comparing self-run numbers against leaderboard submissions carries the usual configuration caveats.)

Perhaps more interesting than the offline benchmarks: Xiaomi says it ran a human double-blind A/B evaluation during its internal beta, covering 576 developers working in 474 real private repositories, producing 1,213 judged head-to-head pairs against Claude Code using the same target model.

Under 200 execution steps, the two systems split roughly 50/50 — but past 200 steps, MiMo Code’s win rate rose above 65%, supporting the company’s thesis that its memory and state-management architecture pays off specifically on long-horizon work.

Advertisement

Xiaomi itself concedes the standard benchmarks “still measure one-shot problem-solving ability” and don’t capture the tool’s multi-session design goals.

As always, these are vendor self-reported numbers that haven’t been independently verified, and head-to-head harness comparisons are sensitive to configuration. But the claims are consistent with a broader industry pattern: scaffolding and harness engineering are becoming as important as raw model capability in agentic coding performance.

Easy integration with existing developer systems and voice control

From a user experience standpoint, MiMo Code is designed to live where developers already work. It operates directly in the terminal, reading and writing files, running commands, and managing Git.

Out of the box, the tool requires zero configuration, connecting automatically to “MiMo Auto”—a free-for-a-limited-time channel powered by Xiaomi’s multimodal MiMo V2.5 model, which boasts a massive million-token context window. For developers migrating from existing environments, the transition is frictionless: MiMo Code automatically imports MCP servers, custom skills, and API configurations from Claude Code.

Advertisement

Other noteworthy features include:

  • Compose mode: Pressing Tab switches the agent into a specification-driven workflow in which the developer describes a high-level goal and the system autonomously executes the full development cycle — design, planning, coding, testing, and review — following what Xiaomi describes as a “heavy planning upfront, stable verification later” strategy.

  • Voice control: Built on Xiaomi’s MiMo-ASR speech recognition with TenVAD voice activity detection, developers can dictate and modify instructions verbally and speak commands like “send” and “execute” for fully hands-free operation (available for logged-in users).

According to Xiaomi, the gains from the agent harness itself are measurable. Running the same underlying MiMo model in both harnesses, the company says MiMo Code scored 62% on SWE-Bench Pro versus 57% for Claude Code, and 73% on Terminal Bench 2 versus Claude Code’s 68% — roughly five percentage points better on each, attributable purely to the agent system rather than the model.

As always, these are vendor self-reported numbers that haven’t been independently verified, and head-to-head harness comparisons are sensitive to configuration. But the claim is consistent with a broader industry pattern: scaffolding and harness engineering are becoming as important as raw model capability in agentic coding performance.

Aggressively affordable

The bigger lure for many developers may be what’s bundled in.

Advertisement

MiMo Code ships with “MiMo Auto,” a zero-configuration channel offering free, limited-time access to MiMo-V2.5 — the natively multimodal model Xiaomi released in late April 2026, a sparse mixture-of-experts design with 310 billion total parameters (just 15 billion active per inference) and a 1 million token context window, which the company positions as matching Anthropic’s Claude Sonnet 4.6 in multimodal agentic work.

As VentureBeat reported when the MiMo-V2.5 family launched in April, the models are MIT-licensed and among the most efficient and affordable available for agentic tasks.

The larger MiMo-V2.5-Pro — a 1.02-trillion-parameter mixture-of-experts model with 42 billion active parameters and a hybrid-attention architecture — led the open-source field on Xiaomi’s ClawEval agentic benchmark with a 63.8% success rate while consuming only about 70,000 tokens per trajectory, roughly 40–60% fewer than Anthropic’s Claude Opus 4.6, Google’s Gemini 3.1 Pro, or OpenAI’s GPT-5.4 needed for comparable results.

Notably, the V2.5-Pro’s post-training was explicitly designed to instill “harness awareness” — training the model to manage its own memory and context within agent scaffolds like Claude Code or OpenCode — making a Xiaomi-built harness optimized around that capability a logical next step.

Advertisement

Pricing is similarly aggressive: MiMo-V2.5 starts at $0.40 per million input tokens and $2.00 per million output tokens, while V2.5-Pro runs $1.00/$3.00 per million (input/output) up to 256K context, doubling beyond that, with cache hits dropping input costs to as little as $0.20–$0.40 per million, making it among the cheapest frontier models available globally.

Model

Input

Output

Advertisement

Total Cost

Source

MiMo-V2.5 Flash

$0.10

Advertisement

$0.30

$0.40

Xiaomi MiMo

deepseek-v4-flash

Advertisement

$0.14

$0.28

$0.42

DeepSeek

Advertisement

deepseek-v4-pro

$0.435

$0.87

$1.305

Advertisement

DeepSeek

MiniMax-M3

$0.30

$1.20

Advertisement

$1.50

MiniMax

Gemini 3.1 Flash-Lite

$0.25

Advertisement

$1.50

$1.75

Google

Qwen3.7-Plus

Advertisement

$0.40

$1.60

$2.00

Alibaba Cloud

Advertisement

MiMo-V2.5

$0.40

$2.00

$2.40

Advertisement

Xiaomi MiMo

Grok 4.3 (low context)

$1.25

$2.50

Advertisement

$3.75

xAI

MiMo-V2.5 Pro (≤256K)

$1.00

Advertisement

$3.00

$4.00

Xiaomi MiMo

GLM-5

Advertisement

$1.00

$3.20

$4.20

Z.ai

Advertisement

Kimi-K2.6

$0.95

$4.00

$4.95

Advertisement

Moonshot/Kimi

GLM-5.1

$1.40

$4.40

Advertisement

$5.80

Z.ai

Grok 4.3 (high context)

$2.50

Advertisement

$5.00

$7.50

xAI

MiMo-V2.5 Pro (>256K)

Advertisement

$2.00

$6.00

$8.00

Xiaomi MiMo

Advertisement

Qwen3.7-Max

$2.50

$7.50

$10.00

Advertisement

Alibaba Cloud

Gemini 3.5 Flash

$1.50

$9.00

Advertisement

$10.50

Google

Gemini 3.1 Pro Preview (≤200K)

$2.00

Advertisement

$12.00

$14.00

Google

GPT-5.4

Advertisement

$2.50

$15.00

$17.50

OpenAI

Advertisement

Gemini 3.1 Pro Preview (>200K)

$4.00

$18.00

$22.00

Advertisement

Google

Claude Opus 4.8

$5.00

$25.00

Advertisement

$30.00

Anthropic

GPT-5.5

$5.00

Advertisement

$30.00

$35.00

OpenAI

Claude Fable 5 / Claude Mythos 5

Advertisement

$10.00

$50.00

$60.00

Anthropic

Advertisement

For developers who don’t want Xiaomi’s models at all, MiMo Code also supports third-party backends — including token plans from DeepSeek, Moonshot’s Kimi, and Zhipu’s GLM — along with any OpenAI-compatible API, mirroring the bring-your-own-model flexibility of its OpenCode parent.

Terminal AI coding agent wars go global

MiMo Code lands in an increasingly crowded field of terminal-based coding agents: Anthropic’s Claude Code, OpenAI’s Codex CLI, Google’s Gemini CLI, and open-source players like OpenCode and Aider.

What’s new is the entrant. Xiaomi — the world’s third-largest smartphone maker, with a fast-growing EV business — has been methodically building its MiMo AI division since the release of the MiMo-7B reasoning model in April 2025, following with the MiMo-VL vision-language series, MiMo-V2-Flash, the 1-trillion-parameter MiMo-V2-Pro in March 2026, and the V2.5 flagship family in April.

The effort is led by Fuli Luo, a veteran of DeepSeek’s disruptive R1 project, who has characterized Xiaomi’s frontier push as a “quiet ambush” — and backed it with a 100-trillion free token grant for builders announced alongside the V2.5 launch.

Advertisement

The playbook is familiar from DeepSeek, Alibaba’s Qwen, MiniMax, and Moonshot AI’s Kimi series: release genuinely capable models and tooling under permissive licenses at a fraction of U.S. lab pricing, and convert the resulting developer mindshare into a durable ecosystem.

By pairing an open-source agent harness with a free frontier-class model, Xiaomi is effectively eliminating both the licensing and the usage cost of entry — at least for now.

What it means for enterprises and technical decision-makers

For engineering leaders, MiMo Code is a low-risk, potentially high-value evaluation candidate: MIT-style licensing permits modification and commercial integration, the OpenCode lineage means the architecture is inspectable, and the bring-your-own-model support means it can be pointed at an internally approved endpoint rather than Xiaomi’s cloud.

The persistent memory system addresses a real and widely felt pain point in agentic development workflows — one that competitors are also racing to solve.

Advertisement

The countervailing considerations: the “free for a limited time” model access is by definition temporary and routes code context through Xiaomi’s servers, which will be a non-starter for organizations with strict data-residency or IP policies; the benchmark edge over Claude Code is self-reported; and a V0.1.0 release number signals exactly what it suggests about maturity.

Teams subject to U.S. government procurement restrictions on Chinese technology vendors should also weigh that context before adopting.

Source link

Advertisement
Continue Reading

Tech

Tecno Pova 8 Brings a Dot Matrix Light Show to Its Camera Island

Published

on

Tecno Pova 8 5G Smartphone
Many mid range phones stick to familiar shapes and modest power reserves, yet Tecno stepped forward with the Pova 8 5G carrying both a giant battery and an unexpected visual flourish on the rear. That flourish takes the form of a small dot matrix panel tucked into the camera module. What looks like a third lens from a distance actually serves as a compact LED grid capable of displaying simple animations and patterns. Tecno named it the Alive Matrix Display, and it activates for incoming calls, new notifications, charging progress, or even active gaming moments. Around 49 different animations come preloaded, with options to personalize the behavior and appearance.



Owners can watch the lights on the camera island pulse or evolve into shapes that correspond to the situation, transforming what would otherwise be a rather standard video setup into something considerably more dynamic. The rear panel that snaps on features a sequence of geometric lines that give it a semi-transparent appearance, and it comes in a range of colors, all of which help the lights show through when switched on. The front panel includes a 6.76-inch screen with a 144Hz refresh rate, allowing videos and games to run smoothly. That screen is also bright enough to be seen outside, and the built-in eye strain reduction is especially handy if you plan on using it for extended periods of time.

Sale


Google Pixel 10a – Unlocked Android Smartphone – 7 Years of Pixel Drops, 30+ Hours Battery, Camera Coach…
  • Google Pixel 10a is a durable, everyday phone with more[1]; snap brilliant photography on a simple, powerful camera, get 30+ hours out of a full…
  • Unlocked Android phone gives you the flexibility to change carriers and choose your own data plan; it works with Google Fi, Verizon, T-Mobile, AT&T…
  • Pixel 10a is sleek and durable, with a super smooth finish, scratch-resistant Corning Gorilla Glass 7i display, and IP68 water and dust protection[4]

Under the hood, a MediaTek Dimensity 7100 CPU handles all of the daily tasks and mild gaming demands, with some specialty chips helping to boost signal strength in areas where it is a little weak. A large graphite layer provides cooling, and the phone remains comfortable to handle even after hours of gaming. In terms of storage and memory, the launch models hit a good balance for most people: not too much, but enough to avoid feeling limited. The camera setup is quite standard, with a 50-megapixel Sony sensor that supports autofocus and zooming, as well as a second lens for group shots. The selfie camera is decent for video calls, but let’s be honest, the lights on the phone’s back are the main attraction.

Advertisement


Another important feature is the power delivery system, which incorporates an 8000 milliamp hour battery with a certified multi-day runtime in regular use, as well as 45 watt wired charging that can charge the battery to 50% in 35 minutes. If necessary, you can even use the phone to charge your wired earbuds or another phone. As an added benefit, it appears that the battery will still perform effectively after thousands of charge cycles.

Tecno Pova 8 5G Smartphone
The phone runs Android 16 with Tecno’s HiOS 16 on top, and the company promises to keep the software updated for an extended period of time. There are also some AI-powered extras, such as photo cleanup and video summaries, as well as noise reduction during calls, which will only be available in specific areas. If you buy in a supported region, you will also receive additional cloud storage. When it comes to making sure the phone survives the rigors of everyday life, Tecno has it covered. The phone is resistant to dust and water splashes, and it’s been built to withstand a few accidental drops and bumps. Even though it has a pretty healthy battery, it’s only 9 millimeters thick, though it’s a little heavier due to the power inside.

Tecno Pova 8 5G Smartphone
The starting price in India is approximately 30,000 rupees ($314) for the lower memory version, which will be available from all major online retailers within the next week or so. So, if you’re searching for a phone with a long battery life and a nice design, this one might be worth considering, even if it’s not the most powerful camera phone on the market or made of highest-quality materials.
[Source]

Source link

Advertisement
Continue Reading

Tech

CrossOver 27 removes legacy support for Intel

Published

on

If you’re a CrossOver user on Intel or use 32-bit gaming bottles, your time is up with version 27. 64-bit bottles and Apple Silicon are now required.

Gaming on Mac has always been a bit of a wasteland, but that doesn’t stop some folks from trying. The CrossOver app for Mac brings Windows games to the platform, and it gets better with each update.

However, the latest update, CrossOver 27, will have to make some sacrifices to make development a little more streamlined. It is getting ARM64 builds for both Mac and Linux, but CrossOver 27 will only work on macOS Sonoma or newer.

It’s also limited to Apple Silicon Macs since Apple phased them out between macOS Sonoma and macOS Golden Gate.

Advertisement

There’s also a final warning about those who still may be using 32-bit gaming bottles. Users are urged to move their 32-bit games to 64-bit bottles, or they will no longer function.

The developer did note that this should affect a small percentage of users overall. Around 97% of CrossOver users are running macOS Sonoma or newer.

Removing legacy support will allow the development team to focus on UI and optimization for one set of computers instead of maintaining Intel-compatible systems. It also means that a new user interface will debut at some point in a future release.

If you are on an Intel machine or running an older version of macOS, the good news is that CrossOver 26 won’t suddenly combust. Simply don’t pay for the new version or attempt to upgrade and everything will work as is, hopefully.

Advertisement

However, note that if you do keep CrossOver 26, your games could run into compatibility issues if they are updated. Also, newer operating systems may cause problems with the older software.

Eventually, your only choice might be to finally move to Apple Silicon.

Source link

Advertisement
Continue Reading

Tech

Study Links Smartphones With Declining Fertility Rates

Published

on

Two recent studies argue that smartphones may have contributed to falling birthrates by reducing in-person social interaction, sexual frequency, and other conditions tied to unintended pregnancies. “One of the studies published in May is called ‘The Collapse of Teen Fertility in the Digital Era‘ and the other, published just Monday, is titled ‘Is the iPhone Birth Control? Causal Evidence from AT&T’s 2007-2011 Carrier Monopoly,’” reports KTLA. “Both were chronicled in a New York Times piece by political writer Sabrina Tavernise on Monday.” Slashdot reader sabbede submitted the story. From the report: The one from May, authored by two University of Cincinnati professors, posits that teen fertility “collapsed globally” starting around 2007 — the same year the first iPhone was released. “Smart phones changed how teens spend time with each other … this change in turn drove the collapse in teen fertility,” the study’s abstract reads. “Once enough teens are on the phone, being on the phone is where the peer network is; in-person time falls sharply, and with it the unstructured contact in which most unintended teen conceptions occur.” The study claimed that countries “across the income and policy spectrum” were affected by the teen fertility drop, and that researchers used data from multiple countries, including the U.S., England and Wales, to rule out “country-specific contraceptive access and welfare reform stories.” “This model predicts that the shift towards the phone-mediated equilibrium affects multiple aspects of teen behavior,” the abstract continues, concluding that “the same instrument that produces a collapse in teen fertility produces a surge in teen suicides.”

The study published on Monday looks more closely at the United States, explaining that nationwide general fertility rates have fallen 22% since 2007. “[This is] a sustained decline not readily explained by economic conditions, contraceptive use, housing or childcare costs, or other commonly cited factors,” the National Bureau of Economic Researchers study states. “We assess the potential role of a different shock: the diffusion of the smartphone.” As mentioned before, the first iPhone was rolled out in 2007, and this study makes use of that timeframe as “a natural experiment” by using data from 2007 through 2011, when iPhones were only sold on AT&T. “From June 2007 through February 2011, the device was sold only on AT&T, allowing us to identify its effect from variation in AT&T’s mobile broadband coverage,” the study says. “Entropy-balanced Poisson and synthetic difference-in-differences event studies imply that access to the iPhone reduced births by 4.5-8.0% at ages 15-19 and 3.2-6.6% at ages 20-24, with statistically significant but smaller declines among older cohorts. Placebo analyses applied to Verizon and Sprint’s pre-2011 coverage footprint are null.

Taken together, these cohort effects imply that the diffusion of the iPhone deepened the decline in births among women under 30 while suppressing the rise in births among older women.” “Overall, the diffusion of the iPhone explains 33-52% of the decline in the general fertility rate among women aged 15-44,” researchers continued. “National-survey evidence on time use and sexual behavior is consistent with the iPhone reducing in-person interactions, increasing pornography use and reducing sexual frequency.”

Source link

Advertisement
Continue Reading

Tech

Poland To Jail Online Streamers of Violent Crime For Up To 5 Years

Published

on

Polish lawmakers have voted to criminalize “trash streaming,” with up to five years in prison for online broadcasts of serious crimes such as rape or murder, animal cruelty, humiliating violence, gambling promotion, or even simulated depictions of those acts. Reuters reports: The move is part of a broader push by Poland to tighten regulation of online content. Recent measures include banning the use of mobile phones by children under 16 in schools and introducing stricter age verification rules to access pornography. Under the new provisions, broadcasting crimes punishable by more than five years in prison, including murder or rape, will itself be classed as a separate offence punishable by up to five years behind bars.

The law also covers content showing cruelty to animals, violence aimed at humiliating others, and the promotion of gambling. The same penalties will apply to individuals who simulate or falsely portray the commission of such crimes while streaming, lawmakers said.

Source link

Continue Reading

Tech

Inside Lime’s Seattle warehouse, where 15,000 bikes and scooters are prepped for a World Cup surge

Published

on

Rows of Lime’s new LimeBike electric bicycles are ready to be deployed in Seattle from the company’s warehouse in Georgetown on Wednesday. (GeekWire Photo / Kurt Schlosser)

As Seattle is set to welcome the world this month for FIFA World Cup matches, the city’s sole micromobility operator is getting ready, too — and GeekWire got an inside look at how.

In a sprawling warehouse south of downtown in Georgetown — a space that serves as the base of operations for the company’s presence in the Seattle region — Lime is rolling out new devices, doing regular maintenance on its existing fleet, and preparing a suite of services designed to handle a surge in ridership.

Lime is the only operator in the City of Seattle’s bike- and scooter-share program, a position it assumed earlier this year following the exit of competitors including Bird. The company operates a fleet of 15,000 devices in the city — 7,000 scooters, 4,000 LimeGliders, 3,300 Gen 4 e-bikes and 700 of its newest LimeBikes — and recorded 2.3 million rides in Seattle in the first quarter of this year, up roughly 50% from the same period last year.

World Cup matches and associated activities around Lumen Field and other parts of Seattle could meet or exceed what Lime saw on its biggest ridership day ever in the city — the February Super Bowl championship parade that generated more than 60,000 trips.

“We’re excited to support Seattle during such a major moment for the city, and to help residents and visitors get where they need to go throughout the summer,” said Parker Dawson, senior regional lead of government relations at Lime.

Advertisement

What’s in store

Boxes of helmets in Lime’s Seattle warehouse, ready to be given away during promotional events this summer. (GeekWire Photo / Kurt Schlosser)

To handle the expected influx of riders, Lime is rolling out several new services and operational upgrades for the duration of the World Cup and other major summer events, including:

  • Valet parking: For the first time in Seattle, Lime will station staff at designated parking locations near the stadium district to end rides on behalf of riders who can’t connect to cell service in crowded areas. “If you go down to the stadium area and there’s potentially hundreds of thousands of fans taking up all of the cell service, this allows our team to actually end the ride for you,” said Brent Vigneault, general manager of Lime’s Pacific Northwest operations.
  • Fan Pass: A new discounted ride pass offers up to 90 minutes of riding for $12.99 — more than 70% lower than standard pricing — available now through July 19.
  • Geofencing: Event-specific virtual boundaries will direct riders to designated parking zones and help manage pedestrian-heavy areas on game days.
  • Fleet rebalancing: Using GPS data from past events, Lime will shift vehicles across the city to meet demand spikes around the stadium district and downtown corridors.
  • Helmet giveaways: Lime has already distributed 2,500 free helmets this year and plans to give out an additional 3,000 during major events this summer. Helmets will be available at all valet parking locations after matches.

New tech put to the test

Bikes and scooters staged for maintenance and quality control checks in Lime’s Seattle warehouse. (GeekWire Photo / Kurt Schlosser)

Thousands of bike and scooter riders — many of whom might be new to Seattle — could pose significant challenges when it comes to where to ride and where to park.

Lime’s new Lime Vision technology is designed to address part of that equation by alerting sidewalk scooter riders to find a safer path. Cameras mounted on the front of scooters, in tandem with artificial intelligence, will detect where a rider is traveling. When bad behavior is detected, the scooter emits an audible alert and sends a real-time notification to the Lime app, warning the rider to move to a safer location.

GeekWire tested Lime Vision on Wednesday by riding a scooter from the street to a sidewalk near Lime’s warehouse. After a few seconds, the scooter realized where we were and said, “Avoid sidewalks.”

@media (max-width: 600px) {
aside.callout { float:none !important; max-width:100% !important; margin-left:0 !important; margin-right:0 !important; }
aside.callout .callout-img { display:none !important; }
}

Lime has deployed 3,500 new Gen 4.1 scooters in Seattle over the past four weeks, all equipped with Lime Vision. Vigneault said the system is already having an impact, with riders audibly called out and observed moving off sidewalks in response.

“Our model will continuously learn through experience,” he said. “The more miles ridden on these vehicles throughout the city, the better the model will get at detecting sidewalks and hopefully pushing people into bike lanes.”

Advertisement

For now, Lime is focused on getting people off sidewalks rather than penalizing them for it. Asked whether repeat sidewalk riders could eventually face account suspensions or other disciplinary action, Vigneault said the company is still assessing.

“Right now we’re trying to move with the carrot instead of the stick,” he said.

Lime Vision isn’t the only technology Lime is using to encourage better behavior. The company’s parking system, called Capture, now uses AI to analyze photos taken by riders at the end of a trip, providing real-time feedback if a vehicle is parked in a problematic location — blocking a pathway or an ADA access route, for example — and preventing the rider from ending the ride until the vehicle is moved.

The Seattle Department of Transportation, meanwhile, has painted more than 230 physical parking corrals downtown to give riders clearly marked places to land.

Advertisement

Keeping the fleet rolling

A mechanic removes the rear wheel from a Lime bike at the company’s Seattle warehouse. (GeekWire Photo / Kurt Schlosser)

The Georgetown facility — and its small army of technicians — handles maintenance for Lime’s entire Seattle-region fleet, which spans not just the city but Bothell, Redmond, Woodinville, Everett and Shoreline.

Vehicles are pulled from the field when GPS data, tire pressure monitors or poor rider ratings flag a problem, then brought in for diagnosis, repair and a quality check by a second set of eyes before heading back out.

Every part on every Lime vehicle is modular and swappable — seats, handlebars, motors, tires, phone holders — meaning a single worn component doesn’t take an entire device out of commission. When a vehicle does reach end of life, Lime strips it for reusable parts before recycling what remains.

“We try to keep our vehicles out as long as possible and make sure that we’re not wasting materials,” Vigneault said.

Batteries are swapped in the field to minimize downtime, but any other maintenance comes through the warehouse, where mechanics are cross-trained on every vehicle type. Lime tracks the full history of every device — every ride, every repair, every mile — meaning some vehicles have been rolling through Seattle streets for years, swapping out parts along the way.

Advertisement

Even graffiti — an occupational hazard for any fleet of public-use vehicles — gets scrubbed off as part of the standard quality-control process before a device goes back out.

“We want our vehicles looking clean,” Vigneault said. “No one wants to sit on a vehicle that’s covered in graffiti.”

Source link

Advertisement
Continue Reading

Tech

Amperity founders take on co-CEO roles, say they’ll carry the ‘soul’ of the startup forward

Published

on

Amperity co-founders and co-CEOs Kabir Shahani (left) and Derek Slager. (Amperity photo).

Amperity is putting its founders back in charge.

The Seattle-based customer data startup announced this week that co-founders Derek Slager and Kabir Shahani will serve as co-CEOs, taking over leadership of the company less than two years after Amperity hired former Salesforce executive Tony Alika Owens to lead the business.

The leadership change marks a significant shift for one of Seattle’s most prominent enterprise software startups as it looks to capitalize on growing demand for AI-powered customer data tools.

In LinkedIn posts announcing the transition, Slager and Shahani said they will lead the company into what they described as a major opportunity created by the rise of artificial intelligence. Longtime CFO Amy Kelleran Pelly will expand her responsibilities and become president while retaining her CFO role.

“I’ve watched this technology go from interesting to transformative in real time, with a front row seat at the center of where it matters most: customer data,” Slager wrote. “Amperity has built an incredible foundation over the past decade. This is exactly the infrastructure the AI era runs on.”

Advertisement

Amperity recruited Owens, a veteran Salesforce executive, as CEO in 2024. At the time, the company said Owens would help guide its next phase of growth as brands increasingly sought ways to unify customer data across marketing, commerce and customer service operations.

In a statement provided to GeekWire, Amperity said that Owens’ departure was planned and a “mutual transition.” It added, “Tony leaves Amperity stronger than he found it, and we’re grateful for his leadership and contributions to the company.”

In 2022, Shahani stepped down as CEO, telling GeekWire at the time that he left voluntarily for personal reasons. The company did not publicly disclose additional details at the time. Slager continued serving as chief technology officer.

Shahani, who resides in New York, also is the co-founder of 3-year-old Seattle marketing tech startup Adora.

Advertisement

Founded in 2016, Amperity built its business around helping large consumer brands unify customer information from multiple systems into a single profile. Customers include brands such as Virgin Atlantic, Brooks Running and Dick’s Sporting Goods. Slager and Shahani also previously worked together at Appature, which they sold to IMS Health in 2013.

Amperity has raised more than $180 million from investors including HighSage Ventures, Tiger Global, Declaration Partners, Madrona and others. It boasted a valuation of more than $1 billion after raising capital in 2021. The company declined to comment on its financial performance, or future fundraising plans.

Advertisement

Amperity is ranked #37 on the GeekWire 200, a list of the top privately-held tech companies in the Pacific Northwest. It employs more than 200 employees in Seattle, New York, the United Kingdom, Australia and Argentina.

Shahani said via email that having the company’s co-CEOs in two of its major hubs — Slager in Seattle and him in New York — is a real advantage.

“We view this as the right leadership structure for Amperity’s next chapter,” he said. “Derek and I bring highly complementary strengths, and we’re excited to lead the company together along with our newly appointed President, Amy Pelly.”

Amperity co-founders Kabir Shahani (left) and Derek Slager in 2017. They are now co-CEOs of the Seattle startup. (Amperity Photo)

With AI reshaping how companies use customer information, Amperity’s founders are betting that the technology shift creates a new growth opportunity for the startup they launched a decade ago.

“We’re carrying the soul of Amperity forward and aiming it at our biggest opportunity yet,” Shahani wrote.

Advertisement

Source link

Continue Reading

Trending

Copyright © 2025