Is China picking back up the open source AI baton?
Z.ai, also known as Zhipu AI, a Chinese AI startup best known for its powerful, open source GLM family of models, today unveiled GLM-5.1 under a permissive MIT License, allowing enterprises to download it from Hugging Face, customize it, and use it for commercial purposes.
The new GLM-5.1 is designed to work autonomously for up to eight hours on a single task, marking a definitive shift from vibe coding to agentic engineering.
The release represents a pivotal moment in the evolution of artificial intelligence. While competitors have focused on increasing reasoning tokens for better logic, Z.ai is optimizing for the productive horizon: how long a model can keep doing useful work on a single task.
GLM-5.1 is a 754-billion parameter Mixture-of-Experts model engineered to maintain goal alignment over extended execution traces that span thousands of tool calls.
“agents could do about 20 steps by the end of last year,” wrote z.ai leader Lou on X. “glm-5.1 can do 1,700 rn. autonomous work time may be the most important curve after scaling laws. glm-5.1 will be the first point on that curve that the open-source community can verify with their own hands. hope y’all like it^^”
In a market increasingly crowded with fast models, Z.ai is betting on the marathon runner. The company, which listed on the Hong Kong Stock Exchange in early 2026 with a market capitalization of $52.83 billion, is using this release to cement its position as the leading independent developer of large language models in the region.
Technology: the staircase pattern of optimization
GLM-5.1’s core technological breakthrough isn’t just its scale, though its 754 billion parameters and 202,752-token context window are formidable, but its ability to avoid the plateau effect seen in previous models.
In traditional agentic workflows, a model typically applies a few familiar techniques for quick initial gains and then stalls. Giving it more time or more tool calls usually results in diminishing returns or strategy drift.
Z.ai research demonstrates that GLM-5.1 operates via what they call a staircase pattern, characterized by periods of incremental tuning within a fixed strategy punctuated by structural changes that shift the performance frontier.
In Scenario 1 of their technical report, the model was tasked with optimizing a high-performance vector database, a challenge known as VectorDBBench.
VectorDBBench graphic from z.ai for GLM-5.1. Credit: z.ai
The model is provided with a Rust skeleton and empty implementation stubs, then uses tool-call-based agents to edit code, compile, test, and profile. While previous state-of-the-art results from models like Claude Opus 4.6 reached a performance ceiling of 3,547 queries per second, GLM-5.1 ran through 655 iterations and over 6,000 tool calls. The optimization trajectory was not linear but punctuated by structural breakthroughs.
At iteration 90, the model shifted from full-corpus scanning to IVF cluster probing with f16 vector compression, which reduced per-vector bandwidth from 512 bytes to 256 bytes and jumped performance to 6,400 queries per second.
By iteration 240, it autonomously introduced a two-stage pipeline involving u8 prescoring and f16 reranking, reaching 13,400 queries per second. Ultimately, the model identified and cleared six structural bottlenecks, including hierarchical routing via super-clusters and quantized routing using centroid scoring via VNNI. These efforts culminated in a final result of 21,500 queries per second, roughly six times the best result achieved in a single 50-turn session.
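To make that two-stage idea concrete, here is a minimal sketch (in Python with NumPy, rather than the Rust the benchmark run used) of u8 prescoring followed by f16 reranking: a cheap low-precision pass prunes most of the corpus, and only the survivors are rescored at higher precision. The corpus size, quantization scheme, and candidate cutoff are illustrative assumptions, not details from Z.ai's report.

```python
# Minimal sketch of a two-stage scoring pipeline: uint8 prescoring prunes the
# corpus cheaply, then float16 reranking rescores the survivors. Corpus size,
# the affine quantization, and the cutoff are illustrative; the benchmark
# implementation described in the report was written in Rust.
import numpy as np

rng = np.random.default_rng(0)
corpus_f32 = rng.standard_normal((100_000, 128)).astype(np.float32)

# Offline: keep a coarse uint8 copy (half the bandwidth of f16, a quarter of f32)
# and a float16 copy for reranking. A production system would calibrate the
# quantization carefully; this affine mapping just keeps the sketch short.
lo, hi = corpus_f32.min(), corpus_f32.max()
corpus_u8 = np.clip((corpus_f32 - lo) / (hi - lo) * 255, 0, 255).astype(np.uint8)
corpus_f16 = corpus_f32.astype(np.float16)


def search(query: np.ndarray, prefilter: int = 1_000, top_k: int = 10) -> np.ndarray:
    # Stage 1: cheap uint8 scores over the whole corpus select a candidate pool.
    q_u8 = np.clip((query - lo) / (hi - lo) * 255, 0, 255).astype(np.uint8)
    coarse = corpus_u8.astype(np.int32) @ q_u8.astype(np.int32)
    candidates = np.argpartition(-coarse, prefilter)[:prefilter]

    # Stage 2: rerank only the surviving candidates at float16 precision.
    fine = corpus_f16[candidates] @ query.astype(np.float16)
    return candidates[np.argsort(-fine)[:top_k]]


print(search(rng.standard_normal(128).astype(np.float32)))
```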
This demonstrates a model that functions as its own research and development department, breaking complex problems down and running experiments with real precision.
The model also managed complex execution tightening, lowering scheduling overhead and improving cache locality. During the optimization of the Approximate Nearest Neighbor search, the model proactively removed nested parallelism in favor of a redesign using per-query single-threading and outer concurrency.
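As described, that redesign amounts to deleting parallelism inside each query and instead running many single-threaded queries side by side. A rough Python sketch of the shape (the actual implementation was Rust, and the per-query work here is a stand-in):

```python
# Rough sketch of "per-query single-threading with outer concurrency": the work
# inside one query stays single-threaded (no nested worker pools, so no extra
# scheduling overhead or cache thrashing), while a single outer pool runs many
# queries at once. The real system was Rust; process_query is a stand-in.
from concurrent.futures import ThreadPoolExecutor


def process_query(query_id: int) -> tuple[int, int]:
    # Deliberately sequential per-query work.
    total = sum(i * i for i in range(10_000))
    return query_id, total


queries = list(range(64))
with ThreadPoolExecutor(max_workers=8) as pool:  # outer concurrency only
    results = list(pool.map(process_query, queries))

print(results[:3])
```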
When the model encountered iterations where recall fell below the 95 percent threshold, it diagnosed the failure, adjusted its parameters, and implemented parameter compensation to recover the necessary accuracy. This level of autonomous correction is what separates GLM-5.1 from models that simply generate code without testing it in a live environment.
KernelBench: pushing the machine learning frontier
The model’s endurance was further tested in KernelBench Level 3, which requires end-to-end optimization of complete machine learning architectures like MobileNet, VGG, MiniGPT, and Mamba.
In this setting, the goal is to produce a faster GPU kernel than the reference PyTorch implementation while maintaining identical outputs. Each of the 50 problems runs in an isolated Docker container with one H100 GPU and is limited to 1,200 tool-use turns. Correctness and performance are evaluated against a PyTorch eager baseline in separate CUDA contexts.
The results highlight a significant performance gap between GLM-5.1 and its predecessors. While the original GLM-5 improved quickly but leveled off early at a 2.6x speedup, GLM-5.1 sustained its optimization efforts far longer. It eventually delivered a 3.6x geometric mean speedup across 50 problems, continuing to make useful progress well past 1,000 tool-use turns.
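For readers unfamiliar with the metric, a geometric mean is the standard way to aggregate per-problem speedups, since it balances large gains and regressions multiplicatively. A quick sketch of the computation, with made-up per-problem numbers:

```python
# How a geometric-mean speedup across N problems is computed. The per-problem
# speedups below are illustrative placeholders, not the benchmark's data.
import math

# speedup = baseline_kernel_time / generated_kernel_time, one value per problem
speedups = [1.2, 3.9, 2.4, 5.1, 0.9]
geomean = math.exp(sum(math.log(s) for s in speedups) / len(speedups))
print(f"geometric mean speedup: {geomean:.2f}x")
```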
Although Claude Opus 4.6 remains the leader in this specific benchmark at 4.2x, GLM-5.1 has meaningfully extended the productive horizon for open-source models.
This capability is not simply about having a longer context window; it requires the model to maintain goal alignment over extended execution, reducing strategy drift, error accumulation, and ineffective trial and error. One of the key breakthroughs is the ability to form an autonomous experiment, analyze, and optimize loop, where the model can proactively run benchmarks, identify bottlenecks, adjust strategies, and continuously improve results through iterative refinement.
All solutions generated during this process were independently audited for benchmark exploitation, ensuring the optimizations did not rely on specific benchmark behaviors but worked with arbitrary new inputs while keeping computation on the default CUDA stream.
Product strategy: subscription and subsidies
GLM-5.1 is positioned as an engineering-grade tool rather than a consumer chatbot. To support this, Z.ai has integrated it into a comprehensive Coding Plan ecosystem designed to compete directly with high-end developer tools.
The product offering is divided into three subscription tiers, all of which include free Model Context Protocol tools for vision analysis, web search, web reader, and document reading.
The Lite tier at $27 USD per quarter is positioned for lightweight workloads and offers three times the usage of a comparable Claude Pro plan. The Pro tier at $81 per quarter is designed for complex workloads, offering five times the Lite plan usage and 40 to 60 percent faster execution.
The Max tier at $216 per quarter is aimed at advanced developers with high-volume needs, ensuring guaranteed performance during peak hours.
For those using the API directly or through platforms like OpenRouter or Requesty, Z.ai has priced GLM-5.1 at $1.40 per million input tokens and $4.40 per million output tokens, with cached input tokens discounted to $0.26 per million.
Notably, the model consumes quota at three times the standard rate during peak hours, which are defined as 14:00 to 18:00 Beijing Time daily, though a limited-time promotion through April 2026 allows off-peak usage to be billed at a standard 1x rate. Complementing the flagship is the recently debuted GLM-5 Turbo.
While 5.1 is the marathon runner, Turbo is the sprinter, proprietary and optimized for fast inference and tasks like tool use and persistent automation.
At $1.20 per million input tokens and $4 per million output tokens, it is more expensive than the base GLM-5 but more affordable than the new GLM-5.1, positioning it as a commercially attractive option for high-speed, supervised agent runs.
The model is also packaged for local deployment, supporting inference frameworks including vLLM, SGLang, and xLLM. Comprehensive deployment instructions are available at the official GitHub repository, allowing developers to run the 754 billion parameter MoE model on their own infrastructure.
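The repository holds the authoritative instructions, but for teams already running vLLM the shape of a local deployment is roughly the following. The model identifier, parallelism setting, and sampling parameters are placeholders, not values confirmed by Z.ai:

```python
# Rough sketch of local serving with vLLM's offline inference API. The model ID,
# parallelism setting, and sampling values are placeholders; the GitHub
# repository holds the actual deployment instructions and hardware requirements.
from vllm import LLM, SamplingParams

llm = LLM(
    model="zai-org/GLM-5.1",    # placeholder Hugging Face ID
    tensor_parallel_size=8,      # a 754B-parameter MoE model needs multi-GPU sharding
    trust_remote_code=True,
)

params = SamplingParams(temperature=0.6, max_tokens=1024)
outputs = llm.generate(
    ["Profile this Rust function and suggest the next optimization step:"],
    params,
)
print(outputs[0].outputs[0].text)
```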
For enterprise teams, the model includes advanced reasoning capabilities that can be accessed via a thinking parameter in API requests, allowing the model to show its step-by-step internal reasoning process before providing a final answer.
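Z.ai has not published the request schema alongside this announcement, so the snippet below is only an illustration of how such a toggle is typically exposed; the endpoint URL and the exact shape of the thinking field are assumptions, not documented values:

```python
# Illustration of toggling the step-by-step reasoning mode over HTTP. The
# endpoint URL, auth header, and the shape of the "thinking" field are
# assumptions for illustration only; consult Z.ai's API documentation for the
# real schema.
import requests

resp = requests.post(
    "https://api.z.ai/v1/chat/completions",            # placeholder endpoint
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={
        "model": "glm-5.1",
        "messages": [{"role": "user", "content": "Plan a migration of this schema to Postgres."}],
        "thinking": {"type": "enabled"},                 # hypothetical reasoning toggle
    },
    timeout=600,
)
print(resp.json()["choices"][0]["message"]["content"])
```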
Benchmarks: a new global standard
The performance data for GLM-5.1 suggests it has leapfrogged several established Western models in coding and engineering tasks.
SWE-Bench Pro benchmark comparison chart showing GLM-5.1 leading other major models. Credit: z.ai
On SWE-Bench Pro, which evaluates a model’s ability to resolve real-world GitHub issues using an instruction prompt and a 200,000 token context window, GLM-5.1 achieved a score of 58.4. For context, this outperforms GPT-5.4 at 57.7, Claude Opus 4.6 at 57.3, and Gemini 3.1 Pro at 54.2.
Beyond standardized coding tests, the model showed significant gains in reasoning and agentic benchmarks. It scored 63.5 on Terminal-Bench 2.0 when evaluated with the Terminus-2 framework and reached 66.5 when paired with the Claude Code harness.
On CyberGym, it achieved a 68.7 score based on a single-run pass over 1,507 tasks, demonstrating a nearly 20-point lead over the previous GLM-5 model. The model also performed strongly on the MCP-Atlas public set with a score of 71.8 and achieved a 70.6 on the T3-Bench.
In the reasoning domain, it scored 31.0 on Humanity’s Last Exam, which jumped to 52.3 when the model was allowed to use external tools. On the AIME 2026 math competition benchmark, it reached 95.3, while scoring 86.2 on GPQA-Diamond for expert-level science reasoning.
The most impressive anecdotal benchmark was the Scenario 3 test: building a Linux-style desktop environment from scratch in eight hours.
Unlike previous models that might produce a basic taskbar and a placeholder window before declaring the task complete, GLM-5.1 autonomously filled out a file browser, terminal, text editor, system monitor, and even functional games.
It iteratively polished the styling and interaction logic until it had delivered a visually consistent, functional web application. This serves as a concrete example of what becomes possible when a model is given the time and the capability to keep refining its own work.
Licensing and the open segue
The licensing of these two models tells a larger story about the current state of the global AI market. GLM-5.1 has been released under the MIT License, with its model weights made publicly available on Hugging Face and ModelScope.
This follows Z.ai’s historical strategy of using open-source releases to build developer goodwill and ecosystem reach. However, GLM-5 Turbo remains proprietary and closed-source. This reflects a growing trend among leading AI labs toward a hybrid model: using open-source models for broad distribution while keeping execution-optimized variants behind a paywall.
Industry analysts note that this shift arrives amidst a rebalancing in the Chinese market, where heavyweights like Alibaba are also beginning to segment their proprietary work from their open releases.
Z.ai CEO Zhang Peng appears to be navigating this by ensuring that while the flagship’s core intelligence is open to the community, the high-speed execution infrastructure remains a revenue-driving asset.
The company is not explicitly promising to open-source GLM-5 Turbo itself, but says the findings will be folded into future open releases. This segmented strategy helps drive adoption while allowing the company to build a sustainable business model around its most commercially relevant work.
Community and user reactions: crushing a week’s work
The developer community response to the GLM-5.1 release has been overwhelmingly focused on the model’s reliability in production-grade environments.
User reviews suggest a high degree of trust in the model’s autonomy.
One developer noted that GLM-5.1 shocked them with how good it is, stating it seems to do what they want more reliably than other models with less reworking of prompts needed. Another developer mentioned that the model’s overall workflow from planning to project execution performs excellently, allowing them to confidently entrust it with complex tasks.
Specific case studies from users highlight significant efficiency gains.
A user from Crypto Economy News reported that a task involving preprocessing code, feature selection logic, and hyperparameter tuning solutions, which originally would have taken a week, was completed in just two days. Since getting the GLM Coding plan, other developers have noted being able to operate more freely and focus on core development without worrying about resource shortages hindering progress.
On social media, the launch announcement generated over 46,000 views in its first hour, with users captivated by the eight-hour autonomy claim. The sentiment among early adopters is that Z.ai has successfully moved past the hallucination-heavy era of AI into a period where models can be trusted to optimize themselves through repeated iteration.
The ability to build four applications rapidly through correct prompting and structured planning has been cited by multiple users as a game-changing development for individual developers.
The implications of long-horizon work
The release of GLM-5.1 suggests that the next frontier of AI competition will not be measured in tokens per second, but in autonomous duration.
If a model can work for eight hours without human intervention, it fundamentally changes the software development lifecycle.
However, Z.ai acknowledges that this is only the beginning. Significant challenges remain, such as developing reliable self-evaluation for tasks where no numeric metric exists to optimize against.
Escaping local optima earlier when incremental tuning stops paying off is another major hurdle, as is maintaining coherence over execution traces that span thousands of tool calls.
For now, Z.ai has placed a marker in the sand. With GLM-5.1, they have delivered a model that doesn’t just answer questions, but finishes projects. The model is already compatible with a wide range of developer tools including Claude Code, OpenCode, Kilo Code, Roo Code, Cline, and Droid.
For developers and enterprises, the question is no longer, “what can I ask this AI?” but “what can I assign to it for the next eight hours?”
The focus of the industry is clearly shifting toward systems that can reliably execute multi-step work with less supervision. This transition to agentic engineering marks a new phase in the deployment of artificial intelligence within the global economy.
We rated the DJI Mic Mini as the best small wireless mic when it was launched in 2024, and it now has a successor in the shape of the Mic Mini 2. Both are 5-star products for content creators wanting an affordable, lightweight, and simple mic for better audio on the go.
If you already own a Mic Mini, there’s very little reason to upgrade to the Mic Mini 2 because performance is practically the same; both mics feature clear 24-bit audio, two-level noise reduction, a transmission range up to 400m, healthy battery life, and a lightweight 11g build.
So what exactly is new? I’ve pinpointed the key differences below, chief among them being much better pricing this time around, plus a new bundle for mobile creators.
Surprisingly, however, DJI also revealed in its Mic Mini 2 press release that a Mic Mini 2S is in the pipeline for later this year. The ‘S’ version will add welcome upgrades missing in current models: internal recording, plus the capacity to sync up to four mics with one receiver. That’s all we know about the Mic Mini 2S for now, but it sounds like it’ll be worth the wait for Mic Mini upgraders.
Let’s now see how the Mic Mini and Mic Mini 2 compare. You can find out more about each product in our full reviews via the links above.
1. Design
Both products are tiny, discreet, and lightweight — the Mic Mini 2 weighs just 11g by itself (not including the magnetic attachment). However, there’s one new design trick in the Mic Mini 2 that could be worth an upgrade, depending on the user: magnetic covers.
The Mic Mini 2 has a magnetic surface that accepts covers, with a wide range of colors available, as you can see in the lead image of this article. There are further limited edition covers too (see above). There’s a selection of covers included in the 2 TX + 1RX + Charging Case bundle (details below), while additional covers can be purchased separately.
If you’re style-conscious and like the sound of a wireless mic that matches the color of your outfit, then this new feature could be worth the upgrade alone. However, if you don’t mind the standard black or white options, then this upgrade could feel like a bit of a gimmick.
2. Voice tone presets
DJI has added three voice tone presets to the Mic Mini 2: regular, rich, and bright. The idea is that each preset optimizes audio quality based on the recording environment. However, our reviewer found that there was so little difference between the sound in each preset that it’s barely worth the upgrade.
So if customizable colors aren’t your bag, nor do the voice tone presets entice, there’s essentially no reason to upgrade to the Mic Mini 2 from the Mic Mini. However, for those buying new, the biggest reason to be excited is a significant price cut, along with a new bundle designed for mobile creators.
3. Pricing and bundles
The complete kit, housed in a charging case (Image credit: DJI)
The standard receiver in the priciest bundle. Design-wise, it’s a better fit for proper cameras than phones (Image credit: DJI)
Here’s the bundle for mobile (Image credit: DJI)
It includes a mobile receiver, which is a much better fit for phones (Image credit: DJI)
When the Mic Mini was unveiled in 2024, the 2 TX + 1RX + Charging Case bundle cost $169 / £145 / AU$245. It seems hard to believe, then, that the equivalent Mic Mini 2 bundle costs just £89 / AU$149 — that’s a huge price cut, likely due to increased competition. As is the way with DJI currently, there’s no US pricing or availability at launch.
There’s also a new bundle designed for solo mobile creators, which comprises one mic, one mobile receiver (see our DJI Mic series mobile receiver hands-on), and a charging case, available for just £49 / AU$89. This kit includes a sleek receiver that slots into your phone’s USB-C, whereas the bundle above includes the standard receiver, which is much clunkier when connected to a phone, and is a better fit for proper cameras.
The mobile bundle costs the same as DJI was asking for a single mic when it launched the original Mic Mini. Sadly, it’s not possible to buy a solo Mic Mini 2 mic just yet.
All that being said, the price of the original Mic Mini complete kit mentioned above has continually dropped during its two-year life, and can now be found for as little as £65 / AU$124 — that’s less than the new version with its additional colored covers.
For me, those Mic Mini 2 prices are super competitive. Yes, the second-gen model is a tiny upgrade (customizable covers aside), but it led the way for value in an increasingly competitive space and is my new favorite small wireless mic. Whether or not the Mic Mini 2S reveal rains on the Mic Mini 2 parade, we’ll have to wait and see.
‘Human lives are already being lost’: Open letter signed by hundreds of Google employees requests CEO reject ‘unethical and dangerous’ US military AI use
Google employees sign open letter to CEO over concerns of military AI use
AI developers do not want their technology used for ‘classified purposes’
Google is currently negotiating a contract with the Pentagon
Over 600 Google employees have signed a letter calling on CEO Sundar Pichai to reject any uses of its AI technology for military purposes.
The open letter highlights the serious ethical concerns the staff have, stating, “Human lives are already being lost and civil liberties put at risk at home and abroad from misuses of the technology we are playing a key role in building.”
“As people working on AI, we know that these systems can centralize power and that they do make mistakes,” the letter said. “We feel that our proximity to this technology creates a responsibility to highlight and prevent its most unethical and dangerous uses.”
The new OpenAI contract with the Pentagon was full of holes that would have allowed the same use of ChatGPT that Anthropic feared for Claude. The contract was amended to state that OpenAI’s models would not be used for “deliberate tracking, surveillance, or monitoring of U.S. persons or nationals, including through the procurement or use of commercially acquired personal or identifiable information.”
Shortly after, Sam Altman told his employees that the Pentagon has said OpenAI does not “get to make operational decisions” on how the military uses AI technologies.
Now, Google employees are joining the growing number of AI company employees and members of the public opposed to the military use of AI tools. “Making the wrong call right now would cause irreparable damage to Google’s reputation, business and role in the world,” the letter states.
Following protests involving Google staff in 2018, the company amended its AI Principles to state that it would not deploy its AI tools where they were “likely to cause harm,” and would not “design or deploy” AI tools for surveillance or weapons. These clauses were quietly removed from its AI Principles on 4 February 2025.
On Apr 28, LinkedIn unveiled its 2026 Top Companies list, naming the 15 best places to work in Singapore.
The rankings are based on LinkedIn’s own data, with companies assessed on elements of career progression such as how well they help employees advance in their careers and build new skills.
Here are this year’s top companies to grow your career in Singapore, according to LinkedIn:
1. DBS Bank
Image Credit: DBS Bank
Claiming the top spot once again is DBS Bank, Southeast Asia’s largest bank. The financial giant is currently hiring for over 200 roles here, including:
Microsoft is a technology company that develops software, hardware and cloud‑based services. Singapore serves as a key regional hub for its Asia‑Pacific operations, supporting customers across consumer, enterprise and public sector markets.
It is also the parent company of Activision Blizzard, GitHub, Skype, LinkedIn and others. LinkedIn and its employees are excluded from Microsoft’s score.
The company is looking for new hires for these positions:
Goldman Sachs is a financial services firm that provides investment banking, asset management and financial advisory services. It has offices across Asia, including Singapore, serving corporations, governments and institutional investors in the region.
These are some of the jobs the firm is hiring for:
Originally founded in Switzerland, Roche is a multinational healthcare company that focuses on research and development of medical solutions for major disease areas such as oncology, immunology, and neuroscience.
The fifth largest bank in the world, JPMorgan Chase & Company, first opened in Singapore back in 1964 and has established itself as a global financial services firm across 17 markets in the Asia-Pacific region.
The firm is looking for fresh faces for these roles:
A heavyweight in the global IT industry, HP is a technology company that manufactures a range of monitors, laptops and desktops. It also produces and offers services around printers and 3D printers.
The tech company is currently looking to fill these roles:
You can browse through HP’s full job listings here.
7. Standard Chartered
Image Credit: Standard Chartered
Another notable bank on the list, Standard Chartered offers banking services across 52 markets worldwide.
The bank’s on the lookout for people to fill these positions:
You can look at Standard Chartered’s full job list here.
8. MSD
Image Credit: MSD
Known as Merck in the United States and Canada, MSD is a pharmaceutical company that specialises in producing prescription medicines, vaccines and animal health products.
Genting Berhad is a diversified company with businesses in leisure, hospitality, energy and plantations.
The group’s Singapore subsidiary, Genting Singapore Limited, has a significant presence in the city-state linked to its regional leisure and hospitality activities.
It is currently hiring for these roles in Singapore:
Barclays is a financial services company providing banking, lending, investment and wealth management services. It serves individuals, businesses and institutional clients through retail and corporate banking operations.
These are some of the roles it is hiring for in Singapore:
The company behind the all-familiar iPhone, Apple, first opened its facility in Singapore in 1981 and has since grown its presence in the city-state with three outlets in Orchard, Marina Bay Sands and Jewel Changi.
Apple has close to 100 openings listed on LinkedIn as of writing, including:
Micron Technology is a semiconductor company that designs and manufactures memory and storage products. These components are used in computers, mobile devices, data centres and other electronic systems.
The firm is currently hiring for these positions:
Click here to view Micron Technology’s full job list.
14. Rockwell Automation
Image Credit: Shutterstock.com
Rockwell Automation is an industrial technology company that provides hardware, software and services for manufacturing and production operations. Its products help businesses automate processes and manage industrial systems.
Citi operates as a full-service bank in Singapore. It provides individuals, corporations, governments, investors and institutions with a range of financial products and banking services.
The bank’s on the lookout for people to fill these positions:
Enterprise teams that fine-tune their RAG embedding models for better precision may be unintentionally degrading the retrieval quality those pipelines depend on, according to new research from Redis.
The paper, “Training for Compositional Sensitivity Reduces Dense Retrieval Generalization,” tested what happens when teams train embedding models for compositional sensitivity: the ability to catch sentences that look nearly identical but mean something different — “the dog bit the man” versus “the man bit the dog,” or a negation flip that reverses a statement’s meaning entirely. That training consistently broke dense retrieval generalization, meaning how well a model retrieves correctly across broad topics and domains it wasn’t specifically trained on. Performance dropped by 8 to 9 percent on smaller models and by 40 percent on a current mid-size embedding model that teams are actively using in production.
The findings have direct implications for enterprise teams building agentic AI pipelines, where retrieval quality determines what context flows into an agent’s reasoning chain. A retrieval error in a single-stage pipeline returns a wrong answer. The same error in an agentic pipeline can trigger a cascade of wrong actions downstream.
Srijith Rajamohan, AI Research Leader at Redis and one of the paper’s authors, said the finding challenges a widespread assumption about how embedding-based retrieval actually works.
“There’s this general notion that when you use semantic search or similar semantic similarity, we get correct intent. That’s not necessarily true,” Rajamohan told VentureBeat. “A close or high semantic similarity does not actually mean an exact intent.”
The geometry behind the retrieval tradeoff
Embedding models work by compressing an entire sentence into a single point in a high-dimensional space, then finding the closest points to a query at retrieval time. That works well for broad topical matching — documents about similar subjects end up near each other. The problem is that two sentences with nearly identical words but opposite meanings also end up near each other, because the model is working from word content rather than structure.
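The failure is easy to reproduce with an off-the-shelf embedding model. A minimal sketch, assuming the sentence-transformers package and a small general-purpose checkpoint are available, compares a word-overlap near-miss against a genuinely unrelated sentence:

```python
# Minimal reproduction of the precision problem: two sentences that share words
# but reverse meaning score nearly as high as each other. Assumes the
# sentence-transformers package and the all-MiniLM-L6-v2 checkpoint.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")
anchor = "The dog bit the man."
near_miss = "The man bit the dog."                 # same words, reversed roles
unrelated = "Quarterly revenue grew eight percent."

emb = model.encode([anchor, near_miss, unrelated], normalize_embeddings=True)
print("anchor vs near-miss :", util.cos_sim(emb[0], emb[1]).item())  # typically very high
print("anchor vs unrelated :", util.cos_sim(emb[0], emb[2]).item())  # much lower
```

The exact values depend on the checkpoint, but the near-miss typically lands far closer to the anchor than the unrelated sentence does, which is the gap the Redis paper quantifies.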
That is what the research quantified. When teams fine-tune an embedding model to push structurally different sentences apart — teaching it that a negation flip which reverses a statement’s meaning is not the same as the original — the model uses representational space it was previously using for broad topical recall. The two objectives compete for the same vector.
The research also found the regression is not uniform across failure types. Negation and spatial flip errors improved measurably with structured training. Binding errors — where a model confuses which modifier applies to which word, such as which party a contract obligation falls on — barely moved. For enterprise teams, that means the precision problem is harder to fix in exactly the cases where getting it wrong has the most consequences.
The reason most teams don’t catch it is that fine-tuning metrics measure the task being trained for, not what happens to general retrieval across unrelated topics. A model can show strong improvement on near-miss rejection during training while quietly regressing on the broader retrieval job it was hired to do. The regression only surfaces in production.
Rajamohan said the instinct most teams reach for — moving to a larger embedding model — does not address the underlying architecture.
“You can’t scale your way out of this,” he said. “It’s not a problem you can solve with more dimensions and more parameters.”
Why the standard alternatives all fall short
The natural instinct when retrieval precision fails is to layer on additional approaches. The research tested several of them and found each fails in a different way.
Hybrid search. Combining embedding-based retrieval with keyword search is already standard practice for closing precision gaps. But Rajamohan said keyword search cannot catch the failure mode this research identifies, because the problem is not missing words — it is misread structure.
“If you have a sentence like ‘Rome is closer than Paris’ and another that says ‘Paris is closer than Rome,’ and you do an embedding retrieval followed by a text search, you’re not going to be able to tell the difference,” he said. “The same words exist in both sentences.”
MaxSim reranking. Some teams add a second scoring layer that compares individual query words against individual document words rather than relying on the single compressed vector. This approach, known as MaxSim or late interaction and used in systems like ColBERT, did improve relevance benchmark scores in the research. But it completely failed to reject structural near-misses, assigning them near-identity similarity scores.
The problem is that relevance and identity are different objectives. MaxSim is optimized for the former and blind to the latter. A team that adds MaxSim and sees benchmark improvement may be solving a different problem than the one they have.
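For reference, the MaxSim score itself is simple: every query token keeps its best similarity against any document token, and those maxima are summed. A NumPy sketch with random stand-in token embeddings:

```python
# MaxSim (late-interaction) scoring as used by ColBERT-style systems: every
# query token keeps its best similarity against any document token, and the
# maxima are summed. The token embeddings here are random stand-ins for real
# encoder output.
import numpy as np

rng = np.random.default_rng(0)
query_tokens = rng.standard_normal((6, 128))   # 6 query tokens, 128 dims
doc_tokens = rng.standard_normal((40, 128))    # 40 document tokens


def maxsim(q: np.ndarray, d: np.ndarray) -> float:
    # Normalize so dot products are cosine similarities.
    q = q / np.linalg.norm(q, axis=1, keepdims=True)
    d = d / np.linalg.norm(d, axis=1, keepdims=True)
    sims = q @ d.T                              # (query tokens, doc tokens)
    return float(sims.max(axis=1).sum())        # best match per query token, summed


print(maxsim(query_tokens, doc_tokens))
```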
Cross-encoders. These work by feeding the query and candidate document into the model simultaneously, letting it compare every word against every word before making a decision. That full comparison is what makes them accurate — and what makes them too expensive to run at production scale. Rajamohan said his team investigated them. They work in the lab and break under real query volumes.
Contextual memory. Also sometimes referred to as agentic memory, these systems are increasingly cited as the path beyond RAG, but Rajamohan said moving to that type of architecture does not eliminate the structural retrieval problem. Those systems still depend on retrieval at query time, which means the same failure modes apply. The main difference is looser latency requirements, not a precision fix.
The two-stage fix the research validated
The common thread across every failed approach is the same: a single scoring mechanism trying to handle both recall and precision at once. The research validated a different architecture: stop trying to do both jobs with one vector, and assign each job to a dedicated stage.
Stage one: recall. The first stage works exactly as standard dense retrieval does today — the embedding model compresses documents into vectors and retrieves the closest matches to a query. Nothing changes here. The goal is to cast a wide net and bring back a set of strong candidates quickly. Speed and breadth are what matter at this stage, not perfect precision.
Stage two: precision. The second stage is where the fix lives. Rather than scoring candidates with a single similarity number, a small learned Transformer model examines the query and each candidate at the token level — comparing individual words against individual words to detect structural mismatches like negation flips or role reversals. This is the verification step the single-vector approach cannot perform.
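A schematic of that split, with the verifier stubbed out (the research trains a small Transformer for that step; the stand-in below just thresholds a score so the control flow is visible):

```python
# Schematic of the retrieve-then-verify split: dense retrieval casts a wide net,
# then a separate token-level verifier rejects structural near-misses. The
# verifier here is a stub standing in for the small Transformer the research
# trains; the retrieval function is likewise a stand-in.
from typing import Callable


def two_stage_search(
    query: str,
    dense_retrieve: Callable[[str, int], list[str]],
    verify: Callable[[str, str], float],   # structural-agreement score in [0, 1]
    k: int = 50,
    threshold: float = 0.8,
) -> list[str]:
    # Stage 1: recall. Fast single-vector retrieval, unchanged from standard RAG.
    candidates = dense_retrieve(query, k)
    # Stage 2: precision. Keep only candidates the verifier accepts.
    return [doc for doc in candidates if verify(query, doc) >= threshold]


# Toy usage with stand-in components.
docs = two_stage_search(
    "Is Rome closer than Paris?",
    dense_retrieve=lambda q, k: ["Rome is closer than Paris.", "Paris is closer than Rome."],
    verify=lambda q, d: 0.9 if d.startswith("Rome is closer") else 0.2,
)
print(docs)  # only the structurally matching candidate survives
```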
The results. Under end-to-end training, the Transformer verifier outperformed every other approach the research tested on structural near-miss rejection. It was the only approach that reliably caught the failure modes the single-vector system missed.
The tradeoff. Adding a verification stage costs latency. The latency cost depends on how much verification a team runs. For precision-sensitive workloads like legal or accounting applications, full verification at every query is warranted. For general-purpose search, lighter verification may be sufficient.
The research grew out of a real production problem. Enterprise customers running semantic caching systems were getting fast but semantically incorrect responses back — the retrieval system was treating similar-sounding queries as identical even when their meaning differed. The two-stage architecture is Redis’s proposed fix, with incorporation into its LangCache product on the roadmap but not yet available to customers.
What this means for enterprise teams
The research does not require enterprise teams to rebuild their retrieval pipelines from scratch. But it does ask them to pressure-test assumptions most teams have never examined — about what their embedding models are actually doing, which metrics are worth trusting and where the real precision gaps live in production.
Recognize the tradeoff before tuning around it. Rajamohan said the first practical step is understanding the regression exists. He evaluates any LLM-based retrieval system on three criteria: correctness, completeness and usefulness. Correctness failures cascade directly into the other two, which means a retrieval system that scores well on relevance benchmarks but fails on structural near-misses is producing a false sense of production readiness.
RAG is not obsolete — but know what it can’t do. Rajamohan pushed back firmly on claims that RAG has been superseded. “That’s a massive oversimplification,” he said. “RAG is a very simple pipeline that can be productionized by almost anyone with very little lift.” The research does not argue against RAG as an architecture. It argues against assuming a single-stage RAG pipeline with a fine-tuned embedding model is production-ready for precision-sensitive workloads.
The fix is real but not free. For teams that do need higher precision, Rajamohan said the two-stage architecture is not a prohibitive implementation lift, but adding a verification stage costs latency. “It’s a mitigation problem,” he said. “Not something we can actually solve.”
Mistral AI, the Paris-based artificial intelligence company valued at €11.7 billion ($13.8 billion), today released Workflows in public preview — a production-grade orchestration layer designed to move enterprise AI systems out of proofs of concept and into the business processes that generate revenue.
The product, which launches as part of Mistral’s Studio platform, is the company’s clearest articulation yet of a thesis that is quietly reshaping the enterprise AI market: that the bottleneck for organizations adopting AI is no longer the model itself, but the infrastructure required to run it reliably at scale.
“What we’re seeing today is that organizations are struggling to go beyond isolated proofs of concept,” Elisa Salamanca, who leads go-to-market for Mistral’s enterprise products, told VentureBeat in an exclusive interview ahead of the launch. “The gap is operational. Workflows is the infrastructure to run AI systems reliably across business-critical processes.”
The release arrives at a pivotal moment for both Mistral and the broader AI industry. The dedicated agentic AI market has been valued at approximately $10.9 billion in 2026 and is projected to reach $199 billion by 2034. Yet despite that staggering growth trajectory, industry research points to a stark reality: over 40% of agentic AI projects will be aborted by 2027 due to high costs, unclear value, and complexity. Mistral is betting that Workflows can help its enterprise customers avoid becoming one of those statistics.
Mistral’s new orchestration layer separates execution from control to keep enterprise data private
At its core, Workflows provides a structured system for defining, executing, and monitoring multi-step AI processes — from simple sequential tasks to complex, stateful operations that blend deterministic business rules with the probabilistic outputs of large language models.
Salamanca described Workflows as containing several key components. The first is a development kit that allows engineers to build orchestration logic in just a few lines of Python code. “We have also been able to expose MCP servers,” she explained, referring to the Model Context Protocol standard for connecting AI systems to external tools, “so that they can actually do this with agent authoring.”
The second — and arguably more technically significant — component is an architecture that separates orchestration from execution. “We’re decorrelating the orchestration from the execution,” Salamanca said. “Execution can happen close to the customer’s data — their critical systems — and orchestration can happen on the cloud or wherever they want to run it.” This means the data never has to leave the customer’s perimeter, a design decision with enormous implications for regulated industries where data sovereignty is non-negotiable. “Enterprises do not have to worry about us having access to the data,” she added.
The third pillar is observability. According to Mistral’s blog post announcing the release, every branch, retry, and state change within a workflow is recorded in Studio with native support for OpenTelemetry. Salamanca noted that this is not an afterthought: “You can easily see what decisions have been taken by the workflow, by the agent, and you can deep dive into where problems are happening.”
Workflows is fully customizable across models — engineers can select which model handles which step and can inject arbitrary code, allowing them to blend deterministic pipelines with agentic sections. The system also supports connectors that integrate directly with CRMs, ticketing systems, support platforms, and other enterprise tools, with built-in authentication and secrets management.
Why Mistral chose a code-first approach over low-code drag-and-drop builders
Unlike some competitors offering drag-and-drop workflow builders, Mistral has deliberately targeted developers and engineers rather than business users. “There are a couple of solutions out there that have click-and-drag, drag-and-drop solutions for workflows,” Salamanca acknowledged. “This is not the approach that we’ve been taking. We’ve been really focused towards developers and critical systems that will not scale if you’re doing these drag-and-drop workflows.”
The decision is part of a broader philosophy at Mistral: that enterprise AI systems handling mission-critical operations — cargo releases, compliance reviews, financial transactions — require the precision and version control that only code can provide. Business users are not excluded from the picture, but their role is downstream. Once engineers write a workflow in Python, it can be published to Le Chat, Mistral’s chatbot platform, so anyone in the organization can trigger it. Every step remains tracked and auditable in Studio.
Under the hood, Workflows runs on Temporal’s durable execution engine — a platform whose $5 billion valuation reflects how its durable execution capabilities, originally built for cloud workflow orchestration, have become essential infrastructure for AI agents requiring reliable, long-running, stateful processes. Temporal’s customers include OpenAI, Snap, Netflix, and JPMorgan Chase, and its technology powers orchestration at companies like Stripe and Salesforce.
Mistral extended Temporal’s core engine for AI-specific workloads by adding streaming, payload handling, multi-tenancy, and observability that the base engine does not provide out of the box. “Workflows is built on top of Temporal,” Salamanca confirmed. “We added all the AI requirements to make these AI workflows reliable. It provides out of the box durability, retries, state management. Whenever there’s a failure, it starts again wherever it stopped.” Originally spun out of Uber’s Cadence project, Temporal transparently handles retries, state persistence, and timeouts, providing durable execution across failures. In late 2025, Temporal joined the newly formed Agentic AI Foundation as a Gold Member and announced an official OpenAI Agents SDK integration. By building on this infrastructure rather than creating a proprietary alternative, Mistral inherits battle-tested reliability while focusing its own engineering efforts on the AI-specific layer that sits above it.
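For readers unfamiliar with the underlying engine, Temporal's open-source Python SDK shows the durable-execution pattern in miniature. The workflow and activity below are illustrative stand-ins, not Mistral code, and executing them still requires a Temporal server and a registered worker:

```python
# Minimal illustration of Temporal's durable-execution pattern using its
# open-source Python SDK (temporalio). The workflow and activity are
# illustrative stand-ins, not Mistral code.
from datetime import timedelta

from temporalio import activity, workflow


@activity.defn
async def classify_ticket(ticket_text: str) -> str:
    # A real activity would call an LLM; a heuristic keeps the sketch self-contained.
    return "billing" if "charge" in ticket_text.lower() else "general"


@workflow.defn
class TicketRoutingWorkflow:
    @workflow.run
    async def run(self, ticket_text: str) -> str:
        # If the worker crashes after this activity completes, Temporal replays
        # the workflow from recorded history instead of re-running the activity.
        return await workflow.execute_activity(
            classify_ticket,
            ticket_text,
            start_to_close_timeout=timedelta(seconds=30),
        )
```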
From cargo ships to KYC reviews, customers are already running millions of daily executions
Mistral is not launching Workflows as a concept — the company says customers are already running the product in production, processing millions of executions daily across three primary use cases.
The first is cargo release automation in the logistics sector. Global shipping still runs on paperwork, and a single cargo release can involve customs declarations, dangerous goods classifications, safety inspections, and regulatory checks spanning multiple jurisdictions. Salamanca described the scope of the problem: “Their global shipping today runs on paperwork. They have to involve customs declaration, Dangerous Goods classification, safety inspections, regulatory checks, and Workflows is now powering that with our models and business rules inside.”
Critically, the system keeps humans in the loop at the right moments. According to Mistral’s blog, the human approval step in a workflow is a single line of code — wait_for_input() — that pauses the workflow indefinitely with no compute consumption, notifies the reviewer, and resumes exactly where it left off once approval is given. “Humans are still in the loop, but they’re in the loop at the right time,” Salamanca said. “They just get the validation — I don’t have to go into multiple tools — and the shipment gets released.”
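Mistral's announcement names only wait_for_input() itself; the structure below is a guess at how such a gate might sit inside a cargo-release flow, with stubbed steps so the sketch runs on its own rather than against the real Workflows SDK:

```python
# Illustrative shape of a human-in-the-loop workflow. Only wait_for_input() is
# named in Mistral's announcement; every other function here is a hypothetical
# stand-in (stubbed so the sketch runs on its own), not the Workflows SDK API.

def wait_for_input(prompt: str) -> bool:
    # Stand-in: the real primitive parks the workflow with zero compute until a
    # reviewer responds, then resumes execution at this exact point.
    return input(f"{prompt} [y/N] ").strip().lower() == "y"


def extract_customs_documents(shipment_id: str) -> list[str]:
    return [f"customs-declaration-{shipment_id}.pdf"]   # stub


def classify_dangerous_goods(docs: list[str]) -> str:
    return "non-hazardous"                              # stub


def cargo_release(shipment_id: str) -> str:
    docs = extract_customs_documents(shipment_id)
    category = classify_dangerous_goods(docs)
    # Single approval gate, per Mistral's description of the primitive.
    if wait_for_input(f"Approve release of {shipment_id} ({category})?"):
        return "released"
    return "held_for_review"


print(cargo_release("SHIP-001"))
```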
The second production use case is document compliance checking for financial institutions, specifically Know Your Customer reviews. These reviews are manual, repetitive, and traditionally require hours of analyst time per case. Salamanca said Workflows now processes these reviews in minutes and provides outputs in an auditable manner — a requirement for meeting regulatory obligations.
The third example involves customer support in the banking sector. “You’d have millions of users actually asking to have credit cards blocked, or feedbacks on their account situation, on their credit feedbacks,” Salamanca said. With Workflows, incoming support tickets are analyzed, categorized by intent and urgency, and routed automatically. Each routing decision is visible and traceable in Studio, and when the system gets a categorization wrong, the team can correct it at the workflow level without retraining the model.
How Workflows fits into Mistral’s three-layer enterprise AI platform strategy
Workflows does not exist in isolation. It is the middle layer of a three-part enterprise platform that Mistral has been assembling at a rapid clip throughout 2026.
At the bottom sits Forge, the custom model training platform Mistral launched in March at Nvidia’s GTC conference. Forge allows organizations to build, customize, and continuously improve AI models using their own proprietary data. At the top sits Vibe, Mistral’s coding agent platform that provides the user-facing interaction layer — available on web, mobile, or desktop.
Salamanca connected the three explicitly: “We just released Forge. It enables you to create your own models. But the question is, how do you put these models to do valuable work for your enterprise? That’s where Workflows comes in, because this is the orchestration piece — how you blend in deterministic rules and agentic capabilities. And then if you really want to have your end users interact with these AI patterns, it’s where Vibe comes into play.”
Forge is already seeing strong traction, Salamanca said, across two distinct patterns of enterprise demand. “First, they wanted to really build completely dedicated models to solve unique problems — transformers-based architecture for time series in the financial sector, adding new types of modalities to the LLMs,” she explained. “And the second motion was about customers with really specific tasks they want to solve. Reinforcement learning really caught their attention as to how they can use Forge and Forge RL to actually have models do these tasks very well.”
This layered architecture — model customization, workflow orchestration, and end-user interfaces — positions Mistral as something more ambitious than a model provider. It is building a full-stack enterprise AI platform, a strategy that pits it directly against not just other AI labs like OpenAI and Anthropic, but also against the hyperscale cloud providers. The company’s product portfolio now ranges, as Salamanca put it, “from compute to end-user interfaces,” including data centers in Europe, document processing with its OCR model, and audio capabilities through its Voxtral models.
Mistral’s aggressive scaling campaign and the $14 billion valuation powering it
The Workflows launch comes as Mistral executes one of the most aggressive scaling campaigns in the history of the European technology industry. The French AI startup has increased its revenue twentyfold within a year, with co-founder and CEO Arthur Mensch putting the company’s annualized revenue run rate at over $400 million, compared to just $20 million the previous year. The Paris-based company aims to achieve recurring annual revenue of more than $1 billion by year-end.
The company’s fundraising trajectory has been equally dramatic. Mistral announced a €1.7 billion ($1.9 billion) Series C round at a €11.7 billion ($12.8 billion) valuation in September 2025. Bloomberg reported in September 2025 that the company was finalizing a €2 billion investment valuing it at €12 billion ($14 billion). ASML led the round and contributed €1.3 billion, a landmark investment that aligned chip manufacturing expertise with frontier AI development and underscored European industrial capital’s commitment to building a sovereign AI ecosystem. Mistral then secured $830 million in debt in March 2026 to buy 13,800 Nvidia chips for a new data center near Paris.
The financial picture illustrates why Workflows matters strategically. Mistral’s revenue growth is being driven primarily by enterprise adoption, with approximately 60% of revenue coming from Europe, according to CEO Mensch’s public statements. Those enterprise customers are not buying Mistral’s models for casual chatbot applications — they are deploying them in regulated, mission-critical environments where reliability and data sovereignty are table stakes. Workflows gives those customers the production infrastructure they need to actually deploy AI systems that matter.
In May 2025, Mistral released Mistral Medium 3, which was priced at $0.40 per million input tokens and $2 per million output tokens. The company said clients in financial services, energy, and healthcare had been beta testing it for customer service, workflow automation, and analyzing complex datasets. That model now becomes one of many that can be plugged into Workflows, creating a flywheel where better models drive more workflow adoption, which in turn drives more inference revenue.
Where Mistral’s orchestration play fits in an increasingly crowded competitive landscape
Mistral’s entry into workflow orchestration arrives in an increasingly crowded field. AI orchestration platforms are quickly becoming the backbone of enterprise AI systems in 2026, and as businesses deploy multiple AI agents, tools, and LLMs, the need for unified control, oversight, and efficiency has never been greater.
Mistral’s differentiation rests on three pillars. First, vertical integration: because Workflows is native to Studio, the orchestration layer and the components it orchestrates — models, agents, connectors, observability — are built to work together, eliminating the integration tax that enterprises pay when stitching together disparate tools. Second, deployment flexibility: the split control-plane/data-plane architecture means customers in regulated industries can run execution workers in their own environments while still benefiting from managed orchestration. Third, data sovereignty: Mistral’s European roots and infrastructure investments give it a natural advantage with organizations wary of routing sensitive data through U.S.-headquartered cloud providers — a concern that has intensified amid ongoing geopolitical tensions and growing European anxiety about relying on foreign providers for over 80% of digital services and infrastructure.
Still, the challenges are real. OpenAI and Anthropic both have significantly larger model ecosystems and developer communities. The hyperscalers control the cloud infrastructure where most enterprise workloads actually run. And the enterprise sales cycles for production-grade AI deployments remain long and complex, requiring deep technical integration work that even well-funded startups can struggle to staff.
What comes next for Workflows — and why Mistral thinks orchestration is the real AI battleground
Salamanca outlined three areas of near-term development. First, Mistral plans to release a more managed version of Workflows that abstracts deployment logic for developers who don’t need granular control over worker placement. “Whenever you want to have this flexibility, you can, but if you want to be able to have this on a managed infrastructure, even if it’s running in your own VPC, this is something that we’re adding,” she said.
Second, the company intends to make Workflows accessible to business users, not just engineers. “With Vibe code, you can actually author a workflow. This can be executed at scale, and any end user, in the end, can actually do that with Workflows,” Salamanca explained. The third area is enterprise guardrails and safety controls for agentic applications — ensuring agents use the correct tools, run with appropriate permissions, and that administrators can enforce policies at scale. “Making sure that we have all these enterprise controls to be able to scale the authoring and the building of these workflows is something we’re actively working on,” she said.
The Python SDK for Workflows (v3.0) is now publicly available. Developers can try the product in Studio and access documentation and demo templates immediately. Mistral will be hosting its inaugural AI Now Summit in Paris on May 27–28, where the company is expected to provide additional details on its platform roadmap.
For three years, the AI industry has been captivated by a single question: who can build the most powerful model? Mistral’s Workflows launch suggests the company has moved on to a different question entirely — one that may prove far more consequential for the enterprises writing the checks. It’s not about which model is smartest. It’s about which one can actually show up for work.
If you’ve been thinking about a Honda motorcycle, this might be the sign you’ve been looking for. From now until the end of June, Honda’s offering a bunch of really nice “Bonus Bucks” rebates on some of their most popular bikes. The only catch is that you have to buy before June 30, 2026, when the rebate expires.
The fine print is pretty straightforward: Buy one of the new and unregistered models listed below, and Honda will give you anywhere from $700 to $1,000 in the form of Bonus Bucks. Unlike those misleading 11% rebates at Menards, this rebate can be applied right there at the dealership at the time of purchase. (Just so you know, though: Bonus Bucks are non-transferrable and can’t be used on taxes and destination-related fees.)
Sounds simple enough, right? To help make it easy to decide, we’ve put together a compilation of the biggest Bonus Bucks offers available on Honda motorcycles. Take a look at some of the meatiest discounts being offered below, then head to Honda’s site to see the full list of models included in the Bonus Bucks promo.
$1,000 bonus bucks on CBR500R models
Honda’s CBR500R is a middleweight sportbike that makes for a nice little entry point into supersport riding. Right now, you can get $1,000 Bonus Bucks if you buy a new and unregistered model from 2025 or earlier. Accounting for that $7,399 base MSRP, that translates to about 13.5% off.
The CBR500R uses a 471cc liquid-cooled parallel-twin engine with dual overhead cams and four valves per cylinder. That’ll give you low-end torque with high-revving horsepower. Its six-speed manual transmission comes with an assist-and-slipper clutch for less lever effort and more stabilized rear-wheel behavior, especially under aggressive downshifting. Plus, the bike’s 41mm inverted Showa SFF-BP fork and Pro-Link rear suspension give you 4.7 inches of travel front and rear, which translates to more responsive handling across all sorts of different road conditions.
Buy one any time between now and the end of June, and you’ll get that $1,000 rebate right there on the spot.
$1,000 bonus bucks on CB500F models
JustPhotos22/Shutterstock
It might sound similar to the model above, but the CB500F is not quite the same as the CBR500R. One thing that is the same, though? A matching $1,000 rebate. With a base MSRP of $6,899, a thousand bucks off the CB500F works out to roughly a 14.5% discount.
This Honda motorcycle is a great option for riders who prefer a stripped-down, naked-bike aesthetic. Plus, you still get a lot of the same performance fundamentals as its sibling, the CBR500R. The CB500F uses the same 471cc liquid-cooled parallel-twin engine plus six-speed manual transmission and slipper clutch combination, just with a more ergonomic and minimalist build. The bike’s compact exhaust system and cast aluminum wheels also help to give it a visual identity all its own. It’s one of the most fuel-efficient bikes in its class, as well.
It’s already more affordable than the CBR500R, but with an extra $1,000 off, you can ride home on this bike for under $6,000 MSRP.
$750 bonus bucks on CRF450R models
For off-road enthusiasts, Honda also has you covered with a Bonus Bucks incentive of your own. They’re offering $750 off all CRF450R models from 2025 or earlier. This motocross machine gives you competition-level performance for a MSRP of $9,699, which means the Bonus Bucks offer will slash the price to just under nine thousand before taxes and fees.
The CRF450R features a 450cc liquid-cooled single-cylinder engine with a Unicam SOHC design. It’s engineered with a high 13.5:1 compression ratio and an advanced fuel-injection system for all the revving your heart desires. Plus, a close-ratio five-speed transmission for that precise gear spacing you need out there on the track. The Honda dirt bike also includes rider-adjustable features such as selectable engine modes and Selectable Torque Control, so you can tailor your performance based on track conditions.
$750 might not be as much as what the CBR500R and CB500F get in Bonus Bucks, but it’s still a significant chunk of change saved.
$700 bonus bucks on CB650R E-Clutch models
Honda also has a Bonus Bucks offer available for its CB650R E-Clutch models. Buy a new and unregistered model from 2025 or earlier, and Honda will give you $700 off. With a base MSRP of $9,399, you can ride off on a high-performance naked bike for about $8,699 (before taxes and fees).
At its core is a 649cc liquid-cooled inline four-cylinder engine paired with Honda’s E-Clutch system. This sweet tech lets riders shift gears without touching the handlebar-mounted clutch lever. (Of course, the option’s still there if you prefer manual operation.) The system also mimics quick-shifter functionality for faster, smoother gear changes as you ride. The CB650R combines this drivetrain with a six-speed transmission, a 41mm Showa SFF-BP front fork, and a rear shock with 5.1 inches of travel.
It’s not as steep as $750 or $1,000 off, but it’s nevertheless a generous discount off the MSRP.
$700 bonus bucks on CBR650R E-Clutch models
At first glance, it might look like we’re covering the same bike twice. But no, the CBR650R is its own distinct bike; it simply shares the same 649cc inline four-cylinder engine as its sibling, the CB650R. It’s actually the more expensive of the two, with a base MSRP of $9,899, a full $500 more. Still, the $700 rebate remains the same.
This is a fully faired sportbike that emphasizes aerodynamic performance and aggressive styling. Its chassis components are similar to those of its naked counterpart, but the full fairings improve aerodynamics and put you in a more committed riding posture. It also comes with a twin-spar frame and Y-spoke aluminum wheels.
Yes, the cost of entry is higher than the CB650R’s, but the rebate brings the price down from nearly $10,000 to the low $9,000s. It may only be about 7% off, but it’s much better than nothing.
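If you want to double-check the percentages quoted throughout this roundup, the math is just the rebate divided by the base MSRP. Here's a quick Python sketch using the figures cited above, purely an informal sanity check rather than anything published by Honda:

# Informal check of the discount percentages quoted in this article.
# MSRPs and Bonus Bucks amounts are the figures cited above.
deals = {
    "CBR500R": (7_399, 1_000),
    "CB500F": (6_899, 1_000),
    "CRF450R": (9_699, 750),
    "CB650R E-Clutch": (9_399, 700),
    "CBR650R E-Clutch": (9_899, 700),
}

for model, (msrp, rebate) in deals.items():
    pct = rebate / msrp * 100
    print(f"{model}: ${msrp - rebate:,} after rebate (~{pct:.1f}% off)")

Run it and you get roughly 13.5%, 14.5%, 7.7%, 7.4%, and 7.1%, which lines up with the ballpark figures above.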
The feature, which lets developers offer annual subscriptions billed as monthly installments, is already live to test in App Store Connect and Xcode. It hasn’t reached the App Store just yet, though. That should change when iOS 26.5 rolls out next month, at which point the option will go live for users running iOS 26.4 or later. Notably, the US and Singapore are excluded at launch.
From a user perspective, this isn’t quite as flexible as a typical monthly plan. While you can technically cancel at any time, doing so only stops the subscription from renewing after the full 12-month commitment is completed. In other words, you’re still on the hook for the entire term; you’re just paying for it monthly instead of up front.
Apple says it’s adding a few safeguards to make that clearer. Users will be able to track how many payments they’ve made (and how many are left) directly in their Apple account. Meanwhile, reminders via email and push notifications will flag upcoming renewals.
The move gives developers another pricing lever, especially for apps that typically rely on annual plans but want a lower barrier to entry. Splitting the cost across 12 months could make higher-priced subscriptions feel more manageable, even though the overall commitment hasn’t changed.
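As a rough illustration of that math (the $60 annual price here is invented for the example, not an Apple figure), splitting an annual plan into monthly installments looks like this:

# Hypothetical example: presenting an annual subscription as 12 monthly payments.
# The $60.00 annual price is made up for illustration only.
annual_price = 60.00
monthly_payment = annual_price / 12

print(f"Annual plan, paid up front: ${annual_price:.2f}")
print(f"Same plan, billed monthly: ${monthly_payment:.2f} x 12 over the year")

The total commitment is identical; only the billing cadence changes, which is exactly the trade-off described above.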
It’s not clear why Apple is skipping the US and Singapore for now. The company hasn’t said when those regions will get access. Still, the direction here is pretty obvious. Apple is looking for ways to make longer-term subscriptions easier to sell, without fully giving up the predictability of annual billing.
If widely adopted, this could reshape how app subscriptions are presented, and it could make “monthly” plans a bit less flexible than they first appear.
PocketOS, which provides software to car rental businesses, was using the agent against live infrastructure rather than keeping it strictly in a test environment. In a public post, founder Jer Crane described the episode as evidence of “systemic failures” and argued it was more than a single mistaken command.
A modder has turned a Game Boy Color into something you can wear on your wrist, and it’s not just borrowing the look; it’s an actual, playable retro console.
YouTuber LeggoMyFroggo managed to squeeze a fully functional Game Boy Color into a wristwatch-sized form factor, creating one of the more bizarre yet impressive retro builds in recent memory.
How’d he cram a Game Boy Color into a tiny watch?
In the YouTube video, modder Chris Hackmann calls the project “Time Frog Color.” Rather than taking the simpler route of emulation, the build uses original Game Boy Color hardware, including the Sharp SM83 processor, paired with its video memory and support for physical cartridges.
If that last part sounds insane, it absolutely is. The watch can actually run games from tiny cartridges, which Hackmann demonstrated by playing Pokémon Gold without any issues. An RP2040 chip handles translating the display signal, and it also lets the wearable function as a watch when the console is powered off.
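The video doesn’t spell out exactly what translating the display signal involves, but one small, illustrative piece of that kind of job is converting the Game Boy Color’s 15-bit color values into the 24-bit RGB a modern panel expects. The sketch below is a hypothetical simplification of that single step; the real build’s capture of the LCD bus on the RP2040 is far more involved and isn’t documented here.

# Hypothetical sketch: converting a Game Boy Color 15-bit color value
# (5 bits each for red, green, and blue, packed low-to-high as R, G, B)
# into 24-bit RGB. This illustrates only one small part of "translating
# the display signal"; the actual RP2040 capture logic is not shown.
def gbc_to_rgb888(value: int) -> tuple[int, int, int]:
    r5 = value & 0x1F
    g5 = (value >> 5) & 0x1F
    b5 = (value >> 10) & 0x1F
    # Expand each 5-bit channel to the full 8-bit range.
    expand = lambda c: (c << 3) | (c >> 2)
    return expand(r5), expand(g5), expand(b5)

# Example: pure red in GBC format (0x001F) maps to (255, 0, 0).
print(gbc_to_rgb888(0x001F))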
How was the gameplay experience?
Shrinking a late ’90s handheld console into a 38mm wristwatch does sound like a cool side project, but it comes with its fair share of compromises. The display is just 1.12 inches, and the controls are tiny tactile buttons tucked under 3D-printed caps, which doesn’t exactly make for game-friendly ergonomics. The lack of audio and the limited battery life make the experience even less immersive.
In other words, it works, but it’s not exactly the best way to replay your childhood favorites. The Time Frog Color was never meant to replace the actual Game Boy Color or make gaming on a watch a real thing; it simply shows how far retro hardware modding has come. And watching enthusiasts find ways to preserve and repurpose original components is always fun.
To build a smart home, you need a voice assistant to run it. A smart home assistant, usually folded into a smart speaker, lets you command your smart home with your voice and run your various routines. It also acts as a central point for every gadget you want to add to your home, and you can add almost anything these days, from smart garage door controllers to voice-controlled blinds.
But which assistant should you choose? Each of the big players comes with its own pros and cons, but I recommend choosing based on what you already use day to day. Your smartphone is the easiest way to decide between Apple and Google; if you want a huge selection of smart speakers to choose from and already have a Prime subscription, you may want to consider Amazon.
Take a look around what’s already in your home to see what works with which ecosystem before deciding. The best system for you will be the path of least resistance, whether that’s using your smartphone’s dedicated assistant or sticking with a platform that best integrates with the devices you already have.
Amazon Alexa
WIRED: Huge selection of smart speakers and device compatibility.
TIRED: Paywalls, a meh new assistant, and Ring’s problematic policy.
It all began with Alexa, to some extent. It was the first Amazon Echo speaker back in 2014 that kicked off the smart home in an accessible way, letting anyone voice-command smart bulbs and ask for the weather without needing a custom installer or spending a fortune. Today, Amazon still has the widest range of options. The brand has the most smart speakers by a long shot, with 11 main models of smart speakers and displays currently available, plus several older versions of those same devices still sold on Amazon’s website or at other retailers. It’s a huge suite with something for everyone, whether you want a screen, something made for kids, or fantastic sound with Alexa built in.
I do really like Amazon’s speakers and how easy the devices are to use, so this is a great entry point if voice control is of utmost importance. It can bring voice control into any room and for anyone in the house, and Alexa can create different profiles for different members of the family and attach information like calendars to those profiles. Amazon also owns Ring, so those smart home security devices work seamlessly with an Echo speaker, but we don’t recommend using Ring’s cameras because of its partnership with Axon, which enables local law enforcement to request footage directly from Ring users. My colleagues also have concerns about its data collection (and there have been other privacy issues over the years).
Still looking for an Alexa? Here are my favorite devices to start with.
Echo Show 11
This is one of Amazon’s newest smart displays, and it’s a great size to use in kitchens without being too large for console tables. The sound is excellent, too, and there’s a built-in hub.
Echo Studio (2nd Gen) and Echo Dot Max
Amazon’s new flagship speakers have great sound quality and more volume than you probably need. Both have a built-in hub to connect devices to.