Disney is adding another layer to its AI and extended reality strategies. As first reported by Reuters, the company recently formed a dedicated emerging technologies unit. Dubbed the Office of Technology Enablement, the group will coordinate the company’s exploration, adoption and use of artificial intelligence, AR and VR tech.
It has tapped Jamie Voris, previously the CTO of its Studios Technology division, to oversee the effort. Before joining Disney in 2010, Voris was the chief technology officer at the National Football League. More recently, he led the development of the company’s Apple Vision Pro app. Voris will report to Alan Bergman, the co-chairman of Disney Entertainment. Reuters reports the company eventually plans to grow the group to about 100 employees.
“The pace and scope of advances in AI and XR are profound and will continue to impact consumer experiences, creative endeavors, and our business for years to come — making it critical that Disney explore the exciting opportunities and navigate the potential risks,” Bergman wrote in an email Disney shared with Engadget. “The creation of this new group underscores our dedication to doing that and to being a positive force in shaping responsible use and best practices.”
A Disney spokesperson told Engadget the Office of Technology Enablement won’t take over any existing AI and XR projects at the company. Instead, it will support Disney’s other teams, many of which are already working on products that involve those technologies, to ensure their work fits into the company’s broader strategic goals.
“It is about bringing added focus, alignment, and velocity to those efforts, and about reinforcing our commitment to being a positive force in shaping responsible use and best practices,” the spokesperson said.
It’s safe to say Disney has probably navigated the last two decades of technological change better than most of Hollywood. For instance, the company’s use of the Unreal Engine in conjunction with a digital set known as The Volume has streamlined the production of VFX-heavy shows like The Mandalorian. With extended reality and AI in particular promising tidal changes to how humans work and play, it makes sense to add oversight of how those technologies are used at the company.
Hugging Face today released SmolLM2, a new family of compact language models that achieve impressive performance while requiring far fewer computational resources than their larger counterparts.
The new models, released under the Apache 2.0 license, come in three sizes — 135M, 360M and 1.7B parameters — making them suitable for deployment on smartphones and other edge devices where processing power and memory are limited. Most notably, the 1.7B parameter version outperforms Meta’s Llama 1B model on several key benchmarks.
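To give a rough sense of why these sizes suit phones and edge devices, the weights-only memory footprint can be estimated as parameter count times bytes per parameter. The sketch below uses the three parameter counts quoted above; the precision options (32-bit, 16-bit, 8-bit and 4-bit weights) and the function names are illustrative assumptions, not details from Hugging Face.

```python
# Rough weights-only memory estimate: parameters x bytes per parameter.
# Parameter counts come from the article; the precisions are illustrative assumptions.
PARAM_COUNTS = {"135M": 135_000_000, "360M": 360_000_000, "1.7B": 1_700_000_000}
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: int, bytes_per_param: float) -> float:
    """Return the approximate size of the weights alone, in gigabytes (10**9 bytes)."""
    return num_params * bytes_per_param / 1e9


def footprint_table() -> dict:
    """Map each model size to its estimated weight memory for every precision."""
    return {
        name: {
            precision: round(weight_memory_gb(count, width), 3)
            for precision, width in BYTES_PER_PARAM.items()
        }
        for name, count in PARAM_COUNTS.items()
    }


if __name__ == "__main__":
    for name, row in footprint_table().items():
        print(name, row)
```

Under these assumptions, the 1.7B model's weights come to roughly 3.4 GB at 16-bit precision and under 1 GB at 4 bits. Activations, the attention cache and runtime overhead are not counted, so real memory use would be higher.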
Small models pack a powerful punch in AI performance tests
“SmolLM2 demonstrates significant advances over its predecessor, particularly in instruction following, knowledge, reasoning and mathematics,” according to Hugging Face’s model documentation. The largest variant was trained on 11 trillion tokens using a diverse dataset combination including FineWeb-Edu and specialized mathematics and coding datasets.
This development comes at a crucial time when the AI industry is grappling with the computational demands of running large language models (LLMs). While companies like OpenAI and Anthropic push the boundaries with increasingly massive models, there’s growing recognition of the need for efficient, lightweight AI that can run locally on devices.
The push for bigger AI models has left many potential users behind. Running these models requires expensive cloud computing services, which come with their own problems: slow response times, data privacy risks and high costs that small companies and independent developers simply can’t afford. SmolLM2 offers a different approach by bringing powerful AI capabilities directly to personal devices, pointing toward a future where advanced AI tools are within reach of more users and companies, not just tech giants with massive data centers.
Edge computing gets a boost as AI moves to mobile devices
SmolLM2’s performance is particularly noteworthy given its size. On the MT-Bench evaluation, which measures chat capabilities, the 1.7B model achieves a score of 6.13, competitive with much larger models. It also shows strong performance on mathematical reasoning tasks, scoring 48.2 on the GSM8K benchmark. These results challenge the conventional wisdom that bigger models are always better, suggesting that careful architecture design and training data curation may be more important than raw parameter count.
The models support a range of applications including text rewriting, summarization and function calling. Their compact size enables deployment in scenarios where privacy, latency or connectivity constraints make cloud-based AI solutions impractical. This could prove particularly valuable in healthcare, financial services and other industries where data privacy is non-negotiable.
Industry experts see this as part of a broader trend toward more efficient AI models. The ability to run sophisticated language models locally on devices could enable new applications in areas like mobile app development, IoT devices, and enterprise solutions where data privacy is paramount.
The race for efficient AI: Smaller models challenge industry giants
However, these smaller models still have limitations. According to Hugging Face’s documentation, they “primarily understand and generate content in English” and may not always produce factually accurate or logically consistent output.
The release of SmolLM2 suggests that the future of AI may not solely belong to increasingly large models, but rather to more efficient architectures that can deliver strong performance with fewer resources. This could have significant implications for democratizing AI access and reducing the environmental impact of AI deployment.
The models are available immediately through Hugging Face’s model hub, with both base and instruction-tuned versions offered for each size variant.
On Friday evening, Okta posted an odd update to its list of security advisories. The latest entry reveals that under specific circumstances, someone could’ve logged in by entering anything for a password, but only if the account’s username had over 52 characters.
According to the note people reported receiving, other requirements to exploit the vulnerability included Okta checking the cache from a previous successful login, and that an organization’s authentication policy didn’t add extra conditions like requiring multi-factor authentication (MFA).
Here are the details that are currently available:
On October 30, 2024, a vulnerability was internally identified in generating the cache key for AD/LDAP DelAuth. The Bcrypt algorithm was…
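The advisory text above is cut off, so the exact key construction is not shown here. As a rough illustration of how a length threshold like this can arise: bcrypt ignores input beyond its first 72 bytes, so if a cache key were derived from one concatenated string with the password at the end, a long enough prefix would push the password out of the hashed range. Everything in the sketch below is hypothetical: the field order, the 20-character user ID, and the use of a truncated SHA-256 as a stand-in for bcrypt so the example runs with the standard library alone.

```python
import hashlib

BCRYPT_INPUT_LIMIT = 72  # bcrypt only uses the first 72 bytes of its input


def truncating_hash(data: bytes) -> str:
    """Stand-in for bcrypt (assumption): hash the first 72 bytes, ignore the rest."""
    return hashlib.sha256(data[:BCRYPT_INPUT_LIMIT]).hexdigest()


def cache_key(user_id: str, username: str, password: str) -> str:
    """Hypothetical cache key built from a single concatenated string."""
    combined = (user_id + username + password).encode("utf-8")
    return truncating_hash(combined)


def prehashed_cache_key(user_id: str, username: str, password: str) -> str:
    """One common mitigation: reduce the full input to a fixed-length digest first,
    so every byte of every field influences what the truncating hash sees."""
    combined = "\x00".join([user_id, username, password]).encode("utf-8")
    digest = hashlib.sha256(combined).digest()  # 32 bytes, well under the limit
    return truncating_hash(digest)


if __name__ == "__main__":
    user_id = "0" * 20        # hypothetical 20-character internal ID
    long_name = "a" * 52      # 20 + 52 = 72 bytes before the password starts
    # Same key for different passwords: the password falls past byte 72.
    print(cache_key(user_id, long_name, "correct") == cache_key(user_id, long_name, "wrong"))
    # Different keys once the whole input is digested before truncation.
    print(prehashed_cache_key(user_id, long_name, "correct")
          == prehashed_cache_key(user_id, long_name, "wrong"))
```

With a short username the password stays inside the first 72 bytes and the two keys differ, which is consistent with the flaw only surfacing for unusually long usernames.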
Strands is the NYT’s latest word game after the likes of Wordle, Spelling Bee and Connections – and it’s great fun. It can be difficult, though, so read on for my Strands hints.
SPOILER WARNING: Information about NYT Strands today is below, so don’t read on if you don’t want to know the answers.
Your Strands expert
Marc McLaren
NYT Strands today (game #244) – hint #1 – today’s theme
What is the theme of today’s NYT Strands?
• Today’s NYT Strands theme is… Good on paper
NYT Strands today (game #244) – hint #2 – clue words
Play any of these words to unlock the in-game hints system.
LATE
LAST
STALE
STARE
PUFF
CLIP
NYT Strands today (game #244) – hint #3 – spangram
What is a hint for today’s spangram?
• Stationery cupboard
NYT Strands today (game #244) – hint #4 – spangram position
What are two sides of the board that today’s spangram touches?
First: left, 4th row
Last: right, 4th row
Right, the answers are below, so DO NOT SCROLL ANY FURTHER IF YOU DON’T WANT TO SEE THEM.
NYT Strands today (game #244) – the answers
The answers to today’s Strands, game #244, are…
PRINTER
SCISSORS
PENCILS
STAPLER
RULER
SPANGRAM: OFFICESUPPLIES
My rating: Easy
My score: Perfect
As the father of teenage daughters I am well aware of all of the OFFICESUPPLIES in today’s Strands. Not because they work in an office, obviously, but because they are at school and seem to get through about 20 RULERs and 50 PENCILS a year, constantly need me to help them use the PRINTER and still seem a little clueless about how to use SCISSORS or a STAPLER. Kids today, eh? Too much time spent in front of a screen, clearly.
My own parental issues aside, this was an easy Strands puzzle to solve. The theme clue provided a good push in the right direction, and when I found PRINTER by accident my course was duly charted. None of the words were hard to think of, and only the rather long and complex spangram provided any real challenge.
Yesterday’s NYT Strands answers (Friday, 1 November, game #243)
QUEEN
KING
ROOK
TIMER
BISHOP
PAWN
KNIGHT
BOARD
SPANGRAM: CHECKMATE
What is NYT Strands?
Strands is the NYT’s new word game, following Wordle and Connections. It’s now out of beta so is a fully fledged member of the NYT’s games stable and can be played on the NYT Games site on desktop or mobile.
I’ve got a full guide to how to play NYT Strands, complete with tips for solving it, so check that out if you’re struggling to beat it each day.
Due to its unique model that includes only original content, Apple TV+ tends to have a very slim new release slate. However, just about every Apple TV+ release features A-list talent, and it has set a high bar for quality. Just look at Best Picture winner CODA and Emmy-winning drama Severance (returning in January).
This month is no exception, as there are only four new additions to the library in November. We’ve highlighted the two most anticipated, but don’t overlook Season 2 of the critically acclaimed comedy Bad Sisters or the Malala Yousafzai and Jennifer Lawrence documentary Bread & Roses.
There are only a few new arrivals each month to Apple TV+, but they’re usually all worth at least a glance. Read on for everything coming to Apple TV+ in November 2024.
Google may upgrade the “Now Playing” feature by adding the much-needed album art to the history page. Now Playing has been able to identify songs with a high degree of accuracy, but the list only included the name of the song and the artist.
Now Playing is constantly operating in the background, but only for music
Introduced way back in 2017 along with the Pixel 2, the Now Playing feature has remained exclusive to Google Pixel phones. It essentially identifies songs that are playing nearby and works well even on the latest Pixel 9 devices.
Apps like Shazam have been recognizing songs for quite some time. Now Playing, however, has a trick of its own on Pixel phones: it works entirely in the background, so Pixel users don’t even need to pull out their phones.
While working in the background, Now Playing relies on the low-power efficiency cores to continuously analyze audio through the microphone. If it picks up audio that seems like music or a song, Now Playing requests the performance cores to record a few seconds of the audio.
Now Playing then matches the recorded audio against a database containing tens of thousands of fingerprints of the most popular songs in a particular region. After processing and matching, Now Playing displays the name and artist of the song on the lock screen as well as in a notification.
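The two-stage flow described above, a cheap always-on check followed by a fingerprint lookup in a local database, can be sketched roughly as follows. This is a toy illustration, not Google's implementation: the amplitude threshold, the 8-bit fingerprint, the Hamming-distance matching and the database entries are all assumptions made up for the example.

```python
from typing import Dict, Optional, Sequence, Tuple

Fingerprint = Tuple[int, ...]

# Hypothetical on-device database: fingerprint -> (title, artist).
LOCAL_DB: Dict[Fingerprint, Tuple[str, str]] = {
    (1, 0, 1, 1, 0, 0, 1, 0): ("Song A", "Artist A"),
    (0, 1, 1, 0, 1, 0, 0, 1): ("Song B", "Artist B"),
}


def looks_like_music(samples: Sequence[float], threshold: float = 0.1) -> bool:
    """Stage 1 stand-in: a cheap check on mean absolute amplitude."""
    if not samples:
        return False
    return sum(abs(s) for s in samples) / len(samples) > threshold


def fingerprint(samples: Sequence[float], bits: int = 8) -> Fingerprint:
    """Toy fingerprint: one bit per chunk, set when the chunk is louder than the clip average."""
    if len(samples) < bits:
        raise ValueError("clip too short to fingerprint")
    chunk = len(samples) // bits
    overall = sum(abs(s) for s in samples) / len(samples)
    return tuple(
        1 if sum(abs(s) for s in samples[i * chunk:(i + 1) * chunk]) / chunk > overall else 0
        for i in range(bits)
    )


def hamming(a: Fingerprint, b: Fingerprint) -> int:
    """Count the positions where two fingerprints differ."""
    return sum(x != y for x, y in zip(a, b))


def match(fp: Fingerprint, max_distance: int = 1) -> Optional[Tuple[str, str]]:
    """Stage 2: return the closest known song, or None if nothing is close enough."""
    best = min(LOCAL_DB, key=lambda known: hamming(fp, known))
    return LOCAL_DB[best] if hamming(fp, best) <= max_distance else None


def identify(samples: Sequence[float]) -> Optional[Tuple[str, str]]:
    """Run the expensive matching step only when the cheap check passes."""
    if not looks_like_music(samples):
        return None
    return match(fingerprint(samples))
```

The point of the split is that the inexpensive check runs continuously, while fingerprinting and lookup run only on audio that passes it, and nothing in the lookup needs a network connection.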
Needless to say, Now Playing is fairly accurate. However, the list of songs it recognizes contains only the name of the song, the artist, and a timestamp.
Google’s Now Playing feature for Pixel devices may get album art
The songs that Now Playing recognized are visible under Settings > Sound & vibration > Now Playing. The page lists the history of identified songs in reverse chronological order.
Although there’s an icon next to each song, Google has refused to append any album art to the songs Now Playing recognizes. According to Android Authority, this might change in the future.
The hidden system app that downloads the Now Playing database may soon also grab album art. The code change is titled “#AlbumArt Add Now Playing album art downloads to the network usage log”.
Google has yet to assign a dedicated online repository from where Now Playing will download album art for the songs it recognizes. However, Ambient Music Mod, an open-source port of Now Playing by developer Kieron Quinn, already has the feature. The reverse-engineered version essentially replaces the generic music note icon with album art.
Meta made several major announcements for robotics and embodied AI systems this week. This includes releasing benchmarks and artifacts for better understanding and interacting with the physical world. Sparsh, Digit 360 and Digit Plexus, the three research artifacts released by Meta, focus on touch perception, robot dexterity and human-robot interaction. Meta is also releasing PARTNR, a new benchmark for evaluating planning and reasoning in human-robot collaboration.
The release comes as advances in foundation models have renewed interest in robotics, and AI companies are gradually expanding their race from the digital realm to the physical world.
There is renewed hope in the industry that with the help of foundation models such as large language models (LLMs) and vision-language models (VLMs), robots can accomplish more complex tasks that require reasoning and planning.
Tactile perception
Sparsh, which was created in collaboration with the University of Washington and Carnegie Mellon University, is a family of encoder models for vision-based tactile sensing. It is meant to provide robots with touch perception capabilities. Touch perception is crucial for robotics tasks, such as determining how much pressure can be applied to a certain object to avoid damaging it.
The classic approach to incorporating vision-based tactile sensors in robot tasks is to use labeled data to train custom models that can predict useful states. This approach does not generalize across different sensors and tasks.
Meta describes Sparsh as a general-purpose model that can be applied to different types of vision-based tactile sensors and various tasks. To overcome the challenges faced by previous generations of touch perception models, the researchers trained Sparsh models through self-supervised learning (SSL), which obviates the need for labeled data. The model has been trained on more than 460,000 tactile images, consolidated from different datasets. According to the researchers’ experiments, Sparsh gains an average 95.1% improvement over task- and sensor-specific end-to-end models under a limited labeled data budget. The researchers have created different versions of Sparsh based on various architectures, including Meta’s I-JEPA and DINO models.
Touch sensors
In addition to leveraging existing data, Meta is also releasing hardware to collect rich tactile information from the physical world. Digit 360 is an artificial finger-shaped tactile sensor with more than 18 sensing features. The sensor has over 8 million taxels for capturing omnidirectional and granular deformations on the fingertip surface. Digit 360 captures various sensing modalities to provide a richer understanding of the environment and object interactions.
Digit 360 also has on-device AI models to reduce reliance on cloud-based servers. This enables it to process information locally and respond to touch with minimal latency, similar to the reflex arc in humans and animals.
“Beyond advancing robot dexterity, this breakthrough sensor has significant potential applications from medicine and prosthetics to virtual reality and telepresence,” Meta researchers write.
Meta is publicly releasing the code and designs for Digit 360 to stimulate community-driven research and innovation in touch perception. But as in the release of open-source models, it has much to gain from the potential adoption of its hardware and models. The researchers believe that the information captured by Digit 360 can help in the development of more realistic virtual environments, which can be big for Meta’s metaverse projects in the future.
Meta is also releasing Digit Plexus, a hardware-software platform that aims to facilitate the development of robotic applications. Digit Plexus can integrate various fingertip and skin tactile sensors onto a single robot hand, encode the tactile data collected from the sensors, and transmit them to a host computer through a single cable. Meta is releasing the code and design of Digit Plexus to enable researchers to build on the platform and advance robot dexterity research.
Meta will be manufacturing Digit 360 in partnership with tactile sensor manufacturer GelSight Inc. They will also partner with South Korean robotics company Wonik Robotics to develop a fully integrated robotic hand with tactile sensors on the Digit Plexus platform.
Evaluating human-robot collaboration
Meta is also releasing Planning And Reasoning Tasks in humaN-Robot collaboration (PARTNR), a benchmark for evaluating the effectiveness of AI models when collaborating with humans on household tasks.
PARTNR is built on top of Habitat, Meta’s simulated environment. It includes 100,000 natural language tasks in 60 houses and involves more than 5,800 unique objects. The benchmark is designed to evaluate the performance of LLMs and VLMs in following instructions from humans.
Meta’s new benchmark joins a growing number of projects that are exploring the use of LLMs and VLMs in robotics and embodied AI settings. In the past year, these models have shown great promise to serve as planning and reasoning modules for robots in complex tasks. Startups such as Figure and Covariant have developed prototypes that use foundation models for planning. At the same time, AI labs are working on creating better foundation models for robotics. An example is Google DeepMind’s RT-X project, which brings together datasets from various robots to train a vision-language-action (VLA) model that generalizes to various robotics morphologies and tasks.