Technology

Zyphra’s Zyda-2 dataset enables small enterprise model training


Zyphra Technologies, the company working on a multimodal agent system combining advanced research in next-gen SSM hybrid architectures, long-term memory and reinforcement learning, just released Zyda-2, an open pretraining dataset comprising 5 trillion tokens. 

The offering is the successor to the original Zyda dataset. It is five times larger and covers a vast range of topics and domains to ensure a high level of diversity and quality, which is critical for training robust and competitive language models.

But that is not what sets Zyda-2 apart. There are many open datasets on Hugging Face for training cutting-edge AI models.

What makes this dataset unique is that it has been distilled to possess the strengths of the top existing datasets and eliminate their weaknesses.

This gives organizations a way to train language models that show high accuracy even when operating across edge and consumer devices on a given parameter budget.

The company trained its Zamba2 small language model on this dataset and found that it performed much better than models trained on other state-of-the-art open-source language modeling datasets on Hugging Face.

What does Zyda-2 bring to the table?

Earlier this year, as part of its effort to build highly capable small models that could automate a range of tasks cheaply, Zyphra went beyond model architecture research and began constructing a custom pretraining dataset by combining the best permissively licensed open datasets, those widely recognized as high-quality within the community.

The first release from this work, Zyda, with 1.3 trillion tokens, debuted in June as a filtered and deduplicated mashup of existing premium open datasets, specifically RefinedWeb, StarCoder, C4, Pile, SlimPajama, peS2o and arXiv.

At the time, Zyda performed better than the datasets it was built upon, giving enterprises a strong open option for training. But 1.3 trillion tokens was never going to be enough. The company needed to scale up and raise the performance bar, which led it to set up a new data processing pipeline and develop Zyda-2.

At its core, Zyda-2 builds on Zyda-1, improving it with open-source tokens from DCLM, FineWeb-Edu and the Common Crawl portion of Dolma v1.7. The original version of Zyda was created with the company’s own CPU-based processing pipeline, but for the latest version, Zyphra used Nvidia’s NeMo Curator, a GPU-accelerated data curation library. This helped the company halve the total cost of ownership and process the data 10x faster, going from three weeks to two days.

“We performed cross-deduplication between all datasets. We believe this increases quality per token since it removes duplicated documents from the dataset. Following on from that, we performed model-based quality filtering on Zyda-1 and Dolma-CC using NeMo Curator’s quality classifier, keeping only the ‘high-quality’ subset of these datasets,” Zyphra wrote in a blog post.
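
To make the two steps in that quote concrete, here is a minimal Python sketch of cross-deduplication followed by quality filtering. It is an illustration only, not Zyphra’s pipeline and not NeMo Curator’s API: it uses exact hash matching on normalized text rather than any fuzzy method, and the function names, the toy scoring heuristic and the 0.8 threshold are all assumptions made for the example.

```python
import hashlib
from typing import Callable, Dict, List


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial variants hash the same."""
    return " ".join(text.lower().split())


def cross_deduplicate(datasets: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Keep only the first occurrence of each document across ALL datasets."""
    seen = set()
    result: Dict[str, List[str]] = {}
    for name, docs in datasets.items():
        kept = []
        for doc in docs:
            digest = hashlib.sha256(normalize(doc).encode("utf-8")).hexdigest()
            if digest not in seen:
                seen.add(digest)
                kept.append(doc)
        result[name] = kept
    return result


def filter_by_quality(
    docs: List[str], score: Callable[[str], float], threshold: float
) -> List[str]:
    """Keep documents whose quality score meets the threshold."""
    return [doc for doc in docs if score(doc) >= threshold]


def toy_quality_score(doc: str) -> float:
    """Hypothetical stand-in for a model-based classifier:
    the fraction of whitespace-separated tokens that are purely alphabetic."""
    words = doc.split()
    if not words:
        return 0.0
    return sum(word.isalpha() for word in words) / len(words)


# Tiny made-up corpora; the keys only echo dataset names from the article.
corpora = {
    "zyda_1": ["The cat sat on the mat", "buy now !!! $$$ 111"],
    "dolma_cc": ["the  cat sat on the MAT", "Gradient descent minimizes a loss function"],
    "dclm": ["Gradient descent minimizes a loss function",
             "Photosynthesis converts light into chemical energy"],
}

deduped = cross_deduplicate(corpora)

# Per the quote, quality filtering is applied only to Zyda-1 and Dolma-CC.
for name in ("zyda_1", "dolma_cc"):
    deduped[name] = filter_by_quality(deduped[name], toy_quality_score, threshold=0.8)

for name, docs in deduped.items():
    print(name, docs)
```

In this toy run, each repeated document survives only in the first dataset that contains it, and the spam-like line is dropped from the Zyda-1 stand-in by the filter. A production pipeline would swap in approximate matching and a trained classifier, but the order of operations is the same.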

The result is an ensemble of datasets, Zyda-2, that leads to improved model performance. As Nvidia noted in a separate developer blog post, the new dataset combines the best elements of the additional datasets used in the pipeline, with many high-quality educational samples for logical reasoning and factual knowledge. Meanwhile, the Zyda-1 component provides more diversity and variety and excels at linguistic and writing tasks.

Distilled dataset leads to improved model performance

In an ablation study, training Zamba2-2.7B with Zyda-2 led to the highest aggregate evaluation score on leading benchmarks, including MMLU, HellaSwag, PIQA, WinoGrande, ARC-Easy and ARC-Challenge. This shows that model quality improves when training on the distilled dataset compared with training on the individual open datasets.

[Figure: Zyda-2 performance]

“While each component dataset has its own strengths and weaknesses, the combined Zyda-2 dataset can fill these gaps. The total training budget to obtain a given model quality is reduced compared to the naive combination of these datasets through the use of deduplication and aggressive filtering,” the Nvidia blog added.

Ultimately, the company hopes this work will pave the way for better quality small models, helping enterprises maximize quality and efficiency with specific memory and latency constraints, both for on-device and cloud deployments. 

Teams can already get started with the Zyda-2 dataset by downloading it directly from Hugging Face. It comes with an ODC-By license which enables users to train on or build off of Zyda-2 subject to the license agreements and terms of use of the original data sources.

EU regulators may fine several Musk-owned companies for Digital Services Act violations

The European Union has reportedly warned X that it could use the revenue of several companies owned by Elon Musk to calculate fines levied against the platform for violating social media laws. European regulators may take the annual revenues of Musk’s other companies — including SpaceX, Neuralink, xAI, and the Boring Company — into account to calculate fines, people familiar with the matter told Bloomberg.

X is being investigated for potentially violating several provisions of the EU’s Digital Services Act (DSA), a sweeping law that requires major platforms to remove posts that contain illegal content and holds them financially accountable if they don’t. Under the DSA, which was passed in 2022, regulators can fine companies as much as 6% of their annual revenue for failing to follow transparency rules or address illegal content or disinformation on their platforms.
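
To see why the revenue base matters, here is a small Python sketch of the 6% ceiling applied to one company versus a group of companies. The revenue figures and company labels are hypothetical, chosen only to make the arithmetic easy to follow, and the function is an illustration rather than the commission’s actual method.

```python
from typing import Mapping


def max_fine(annual_revenues: Mapping[str, int], cap_percent: int = 6) -> float:
    """Upper bound on a fine: cap_percent of the summed annual revenue."""
    return sum(annual_revenues.values()) * cap_percent / 100


# Hypothetical annual revenues in US dollars, for illustration only.
platform_only = {"platform": 1_000_000_000}
with_affiliates = {"platform": 1_000_000_000, "affiliate_a": 4_000_000_000}

print(max_fine(platform_only))    # 60000000.0
print(max_fine(with_affiliates))  # 300000000.0
```

With these made-up numbers, folding in one larger affiliate raises the ceiling fivefold, from $60 million to $300 million.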

People familiar with the deliberations told Bloomberg that the EU is essentially debating whether Musk should be fined instead of X itself. If so, regulators would calculate the amount based on the annual revenues of several companies he owns. Since Tesla is publicly traded, it would be excluded.

It’s possible that these expanded fines are related to X’s plummeting revenue under Musk’s tenure. X is valued at $9.4 billion as of August, amounting to a total markdown of nearly 80 percent since Musk purchased it, according to disclosures from Fidelity’s Blue Chip Growth Fund.

DSA obligations apply “irrespective of whether the entity exercising decisive influence over the platform or search engine is a natural or legal person,” Thomas Regnier, a spokesperson for the commission, told Bloomberg.

Still, the commission has yet to decide whether to fine X at all, and people familiar with the situation told Bloomberg that the social platform could avoid fines if it addresses the commission’s concerns — which Musk is unlikely to do. 

After saying he was “very much on the same page” as the EU regarding the DSA in 2022, Musk made an about-face, pulling X out of the EU’s Code of Practice against disinformation the following year. The Code of Practice was a voluntary agreement that served as a precursor to the mandatory provisions of the DSA. Since then, Musk has publicly criticized the commission and antagonized its former head, Thierry Breton, who spearheaded the investigation into X before resigning this September. The relationship was mutually contentious: Breton once sent Musk a letter warning that he’d be watching for “spillover” DSA violations.

The decision to fine X — and Musk’s other companies — now rests with Margrethe Vestager, Breton’s successor.

I doubt this retro-style vertical turntable and speakers combo is a good idea, but doesn’t it look incredible?

Fuse Audio GLD vertical vinyl

One of the minor annoyances about vinyl, particularly fancy vinyl, is that you can’t really show it off while it’s playing – so if you have an LP with a particularly great color, or one that creates a zoetrope effect as it’s playing, it’s only visible if you’re looking directly down on it. Wouldn’t it be great if your vinyl was vertical instead?

That’s the approach Fuse Audio is taking with its GLD record player. Instead of the familiar horizontal platter, your LP is held upright like a Ferris wheel so you can see it as it spins. It also comes with Bluetooth in and out and a pair of 36W powered speakers that connect directly to it, and it supports 33, 45 and 78 rpm records. It’s yours on Kickstarter for $229 plus tax and shipping.

Are Elon Musk’s new Tesla robotaxis safe?

Elon Musk has unveiled Tesla’s long-awaited robotaxi, which he claims will hit the market as early as 2027.

BBC Tech Correspondent Lily Jamali analyses the ‘robocabs’ and whether their reliance on camera technology might be undermining the vehicles’ safety.

Musk also debuted “Optimus” robots, which he claims will free up human time and take on chores.

Best Ninja Foodi deals: Pressure cookers, grills, air fryers

Ninja is a great small kitchen appliance brand if you’re looking for both convenience and savings. It makes some of the best air fryers and best pressure cookers, and when you’re shopping Ninja Foodi deals you can almost always find some of the best air fryer deals. But the Ninja Foodi lineup has all sorts of ways to cook across a range of appliances, and with so much to choose from we thought we’d track down all of the best Ninja Foodi deals in one place. You’ll find them all below. You can also shop refrigerator deals and oven deals if you’re looking for larger kitchen appliances, or there are some really great coffee maker deals worth shopping right now, and they include both Keurig deals and Nespresso deals.

Ninja Foodi PossibleCooker Pro 8.5-quart multicooker — $138, was $150

The Ninja Foodi PossibleCooker Pro on a counter with a meal.
Ninja

The Ninja Foodi PossibleCooker Pro can save you a lot of counter space, as it can replace 14 different cooking tools and appliances. It can slow cook, steam, warm, sauté, and roast, and it can do the work of appliances such as cast iron skillets, saucepans, stock pots, and Dutch ovens. It’s perfect for entertaining, as it has an 8.5-quart capacity that allows you to make foods like chili for up to 20 people. The Ninja Foodi PossibleCooker Pro cooks up to 30% faster than conventional ovens, and offers easy cleanup with a nonstick pot.

Ninja Foodi 11-in-1 6.5-quart pressure cooker — $150, was $200

A woman loads the Ninja Foodi 6.5-quart pressure cooker with food.
Ninja

The Ninja Foodi 11-in-1 6.5-quart pressure cooker is simple yet versatile. It offers 11 different cooking functions, with pressure cooking, baking, air frying, broiling, slow cooking, and steaming among them. It also has TenderCrisp Technology, which combines the best of pressure cooking and air frying and allows you to get faster, juicier, and crispier results. This pressure cooker has a large capacity of 6.5 quarts, which should be plenty for feeding small families or for preparing things like appetizers for gatherings.

Ninja Foodi 2-in-1 Flip Toaster — $120, was $130

Pizza bagels being removed from the Ninja Foodi 2-in-1 Flip Toaster.
Ninja

This 2-in-1 flip toaster is one of the more affordable members of the Ninja Foodi lineup. It’s both a toaster and a compact toaster oven, allowing for multifunctional usage with a small footprint. It won’t take up a ton of space on the countertop and still brings a way to toast, defrost, bake, broil and reheat to your kitchen. This is a great Ninja Foodi option for apartment dwellers or anyone with a smaller kitchen who still likes to cook in a variety of ways.

Ninja Foodi 8-in-1 digital air fry oven — $194, was $220

Ninja Foodi 8-in-1 Digital Air Fry Oven on a kitchen counter with pizza and wings.
Ninja

Part toaster oven, part air fryer, this beast turns out delicious and crispy foods and cooks up to 60% faster than a traditional oven. Even baking means you won’t end up with half of a pizza or a batch of wings burnt and the other half undercooked. The digital crisp controls allow you to adjust temperatures, heat source, and airflow for better precision. It comes with a wire rack, sheet pan, air fry basket, and removable crumb tray, everything you need to get cooking right away.

Ninja Foodi 10-in-1 Smart XL air fryer (renewed) — $180, was $300

Unloading food from the Ninja Foodi 10-in-1 Smart XL.
Ninja

If you want the unique crispness that only an air fryer can provide, but you want it on, say, an entire turkey, you’ll need a large air fryer oven. It’s basically an air fryer shaped like a toaster oven, and it has all the capabilities of both, and more. Since it’s smaller than an oven and can get practically airtight, it has 10 times the power of a traditional convection oven. It can fit a five-pound chicken, and has over a square foot of space, so you could load multiple pizzas in at once. It can preheat in 90 seconds, so it’ll be ready to go before your oven gets a chance to catch up.

Ninja Foodi 6-in-1 10-quart air fryer (renewed) — $189, was $249

Ninja Foodi 6-in-1 10-quart XL 2-Basket Air Fryer on a white kitchen counter with a variety of air fried foods.
Ninja

The Ninja Foodi 6-in-1 air fryer is the ultimate air frying experience. It has two baskets with DualZone technology, which allows you to cook with each basket independently of the other. This lets you cook two different foods at the same time, should you so choose, or prepare a smaller meal in a single basket that’s sized more appropriately for it. Each basket has a 4-quart capacity, and combined you can cook up to 8 quarts with functions that include air frying, roasting, reheating and dehydrating.

Ninja Foodi XL 6-in-1 indoor grill — $230, was $260

Lifting the lid of a Ninja Foodi XL 6-in-1 indoor grill to reveal a pizza cooking.
Ninja

If you’d like to move some of your outdoor grilling adventures to the indoors you can do so with the Ninja Foodi XL 6-in-1 indoor grill. It has six different preset cooking functions that include grilling, air crisping, roasting, baking, broiling, and dehydrating. Its extra-large capacity allows it to cook up to six steaks or up to 24 hot dogs, as well as some side dishes. This grill will cook with up to 75% less fat with its air frying technology, and it cleans up easily and has a fairly compact design that will allow you to return it to a cabinet when you’re done.

Ninja Foodi smoothie bowl maker — $100, was $120

Making a smoothie with the Ninja Foodi Smoothie Bowl Maker.
Ninja

If you’re looking to bring a healthier twist to breakfast, the Ninja Foodi Smoothie Bowl Maker should fit nicely on your counter. It can power through frozen foods with less liquid for perfect smoothie bowls, nut butters, and blender ice cream. It has preset programs for one-touch smoothies, extractions, bowls, and spreads. It also has two manual programs to pulse and start/stop. The Ninja Foodi Smoothie Bowl Maker cleans up easily and has blender cups and bowls that are easy to take on the go or store with included lids.

Ninja Foodi SS351 Power Blender & Processor System — $180, was $200

The Ninja Foodi SS351 Power Blender & Processor System against a white background.
Ninja

This is Ninja’s most powerful blender system. It crushes, processes food, and makes smoothie bowls and dough, all on a single base. It has SmartTorque technology that produces 1,400 peak watts of power, making it capable of blending through heavy loads without stalling or the need to stir or shake. Its six versatile functions will keep you busy being creative in the kitchen for as long as you like, and it comes in at a great price with this deal.

Ninja Professional Plus Kitchen System — $180, was $220

A woman pours food into the Ninja BN801 Professional Plus Kitchen System.
Ninja

The Ninja Professional Plus Kitchen System features a modern design and more functionality than previous generations. Food processing uses the 8-cup precision processor bowl, which provides even chopping and smooth purees. It has five versatile functions that allow you to create smoothies, frozen drinks, nutrient extractions, chopped mixtures, and dough, all at the touch of a button. The XL capacity of this blender and food processor is great for making large batches for both family and guests.







New Exynos 2500 version emerges with enhanced specifications

The latest version of the Exynos 2500 has been spotted on the Geekbench database. While its model number remains the same as the previous version, it boasts a significant performance upgrade.

The Exynos 2500 gains extra CPU cores

The newly listed Exynos 2500 version features a powerful 10-core architecture. This marks a significant upgrade from its predecessor. The chip includes three Cortex-X925 CPU cores clocked at 2.59 GHz. It also has five Cortex-A725 CPU cores running at 2.25 GHz and two Cortex-A520 cores operating at 1.75 GHz. These enhancements point to a stronger emphasis on performance.

Samsung also enhanced the GPU in this Exynos 2500 version. The AMD Radeon-based Xclipse 950 GPU now includes additional cores, increasing its capabilities. This update focuses on enhancing graphics performance for gaming and multimedia applications. The GPU’s clock speed remains at 1.3 GHz, but its increased core count hints at a significant boost in rendering capabilities.

The company is currently grappling with its processor strategy for the Galaxy S25 lineup. While rumors are indicating that the Galaxy S25, Galaxy S25+, and Galaxy S25 Ultra may utilize the Snapdragon 8 Elite chip, reports from SamMobile suggest that the company has not entirely abandoned its in-house Exynos 2500 version. According to the outlet, if production yields improve, Samsung may still opt to use the Exynos 2500 in the Galaxy S25 and S25+ models.

The Geekbench listing also provides insights into the performance metrics of the Exynos 2500. It boasts an OpenCL score of 15,960, indicating impressive graphical capabilities. The device operates on Android 15 and features a memory size of 6.90GB, enhancing its multitasking potential.

Unclear future for the Exynos 2500

The exact role of the new Exynos chip in Samsung’s lineup remains uncertain. With two additional CPU cores and improved GPU capabilities, it seems designed for high-end devices, potentially suited for tablets and laptops. However, whether this version will make it to the market or be exclusive to certain models is still unknown.

The latest Geekbench listing adds a layer of intrigue to Samsung’s processor plans. As the Galaxy S25 series launch approaches, it remains to be seen how the Exynos 2500 will fit into the company’s broader strategy.

Netflix’s The Electric State trailer shows off cartoony robots and oversized VR headsets

Netflix has released the first trailer for The Electric State, a post-apocalyptic road movie from Marvel (and Community) mainstays The Russo Brothers. The adaptation of Simon Stålenhag’s 2018 graphic novel is set in a retro-futuristic version of the ’90s after a robot uprising. It tells the story of Michelle, an orphaned teenager (Millie Bobby Brown) who ventures across the west of the US to look for her younger brother with a smuggler (a mustachioed Chris Pratt) and a pair of robots.

The movie’s look draws heavily from the graphic novel, right down to the oversized VR helmets. The robots, in particular the one accompanying Michelle, have a cartoon-inspired aesthetic that wouldn’t look out of place in Fallout. A large teddy bear robot can be seen as part of a parade of machines, while our heroes appear to face off against a massive one that looks a little like Sonic the Hedgehog.

Meanwhile, the whole “slowed down iteration of a popular song in a movie trailer” thing might have jumped the shark with the version of Oasis’ “Champagne Supernova” that plays over the top of this. It fits the ’90s setting, of course, but I couldn’t help but laugh as soon as I recognized it.

The movie has a hell of a cast. Alongside Brown and Pratt, it stars Ke Huy Quan, Jason Alexander, Woody Harrelson, Anthony Mackie, Brian Cox, Jenny Slate, Giancarlo Esposito and Stanley Tucci. The Electric State hits Netflix on March 14.


Copyright © 2024 WordupNews.com