
Forget smart glasses: UW researchers put tiny cameras into earbuds for hands-free AI

VueBuds, a prototype developed by University of Washington researchers who have embedded a rice-grain-sized camera into each earbud of a standard pair of Sony wireless earbuds. (UW Photo)

Wireless earbuds seemingly sprang out of nowhere. Popularized by Apple’s AirPods, they were suddenly everywhere — on the subway, in the grocery store, in the ears of the person sitting across from you — until somewhere along the way, they became the thing nearly everyone wears without a second thought.

Could that popularity make earbuds better than smart glasses for AI? That is the bet behind VueBuds, a prototype developed by University of Washington researchers who have embedded a rice-grain-sized camera into each earbud of a standard pair of Sony wireless earbuds. The result is a visual AI assistant hiding in plain sight: look at a can of food and ask how many calories it has, hold up an unfamiliar kitchen tool and get an answer in about a second. 

The system processes images on-device and responds through a connected AI model — no cloud required, no images stored.

The UW team believes it is the first to embed cameras directly in commercial wireless earbuds.

The earbuds don’t remember anything, but the people around you might not know that. That tension sits at the heart of what the UW team built and raises a question the researchers take seriously: what are the social norms when cameras are embedded in objects nobody thinks of as cameras?


The team’s answer is to lean hard on minimizing data collection. Images are processed and discarded; nothing is saved. But the system offers no outward signal to bystanders that a camera is present, which the researchers acknowledge is an open challenge rather than a solved one.

Maruchi Kim, lead researcher and a UW doctoral student in the Paul G. Allen School of Computer Science & Engineering, argued that for technology like this to earn trust, privacy can't be an afterthought.

“We don’t support saving the images,” Kim said. “It’s mainly just to bridge the interaction between a person and having access to AI on the go, especially in hands-free scenarios.”

The team’s other central argument is about form factor — and it’s a pointed challenge to Meta, which has spent years and hundreds of millions of dollars trying to make camera glasses a mainstream product.


The UW team’s position is that smart glasses will never fully shed their social baggage: the memory of Google Glass, the discomfort of being watched, the visible signal that the wearer has opted into something most people haven’t. Earbuds carry none of that history.

“From the get-go, we didn’t want to be associated with that,” Kim said.

Getting cameras into earbuds required solving a power problem first. Cameras consume far more energy than microphones, so the team opted for a low-power sensor that captures roughly one frame per second in black and white — slow by video standards, but fast enough for the question-and-answer style of interaction the researchers had in mind.

The cameras are angled five to 10 degrees outward, providing a 98- to 108-degree field of view, and images from both earbuds are stitched into a single frame before processing, cutting response time to about one second.
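The stitching step can be sketched in a few lines. The actual VueBuds pipeline isn't public, so the frame dimensions and the simple side-by-side concatenation below are illustrative assumptions, not the team's implementation; the idea is only that combining both monochrome views into one image lets a single model query cover the wearer's full field of view.

```python
import numpy as np

def stitch_frames(left: np.ndarray, right: np.ndarray) -> np.ndarray:
    """Combine one grayscale frame from each earbud into a single
    side-by-side image, so one AI query can see both views at once."""
    if left.shape != right.shape:
        raise ValueError("expected matching frame dimensions")
    # Place the left-earbud view on the left, the right-earbud view
    # on the right; overlap blending is omitted in this sketch.
    return np.hstack([left, right])

# Simulated one-frame-per-second monochrome captures
# (240x320 is an assumed resolution for illustration).
left = np.zeros((240, 320), dtype=np.uint8)
right = np.full((240, 320), 255, dtype=np.uint8)
combined = stitch_frames(left, right)
print(combined.shape)  # (240, 640)
```

Stitching before processing means the model handles one image instead of two, which is consistent with the roughly one-second response time the researchers report.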


The applications range from the practical to the significant. The system can read text on food packaging, identify objects, and translate written Korean. But for people with low vision or cataracts, the implications run deeper. 

The team received more than a dozen emails from people with visual impairments describing what they’d use it for: understanding facial expressions, reading books, watching television — tasks that existing AI tools can’t easily support in a hands-free, ambient way.

Kim sees another underserved group in the workforce. Electricians, plumbers, and workers in industrial settings often can’t pause to pull out a phone mid-task — a pipe fitting wedged in place, a live wire that needs both hands.

For those workers, a voice-queryable visual assistant that doesn’t require touching a screen is the difference between having access to AI and not having it at all.


“There’s a lot of blue collar work where those people aren’t really able to harness the benefits of recent AI advances,” Kim said. “They can’t just whip out their phones and take a photo.”

The hands-free framing extends broadly: surgeons, cooks, anyone who has ever tried to follow a recipe with wet hands.

The system remains experimental and isn’t available for purchase. Shyam Gollakota, a professor in the Allen School and the project’s senior researcher, said interest from technology companies has been significant, and camera-equipped earbuds could reach consumers within a few years.

On cost, Gollakota is optimistic. The camera sensor itself could run under a dollar at the component level, he said — meaning that at the scale of a major consumer electronics manufacturer, the price premium over standard earbuds would likely be modest.


A $10 figure Gollakota also cited reflects a more conservative estimate at smaller production volumes.

“What we do at the universities is show that you can solve technical problems,” Gollakota said. “Then we show a path for these companies and other people to say that this is actually possible.”


