Technology

The ‘strawberrry’ problem: How to overcome AI’s limitations


By now, large language models (LLMs) like ChatGPT and Claude have become household names across the globe. Many people have started worrying that AI is coming for their jobs, so it is ironic to see almost all LLM-based systems flounder at a straightforward task: counting the number of “r”s in the word “strawberry.” The failure is not limited to the letter “r”; other examples include counting “m”s in “mammal” and “p”s in “hippopotamus.” In this article, I will break down the reason for these failures and provide a simple workaround.

LLMs are powerful AI systems trained on vast amounts of text to understand and generate human-like language. They excel at tasks like answering questions, translating languages, summarizing content and even generating creative writing by predicting and constructing coherent responses based on the input they receive. LLMs are designed to recognize patterns in text, which allows them to handle a wide range of language-related tasks with impressive accuracy.

Despite their prowess, failing at counting the number of “r”s in the word “strawberry” is a reminder that LLMs are not capable of “thinking” like humans. They do not process the information we feed them like a human would.

Conversation with ChatGPT and Claude about the number of “r”s in strawberry.

Almost all current high-performance LLMs are built on transformers. This deep learning architecture doesn’t directly ingest text as its input. Instead, it relies on a process called tokenization, which transforms the text into numerical representations, or tokens. Some tokens might be full words (like “monkey”), while others could be parts of a word (like “mon” and “key”). Each token is like a code that the model understands. By breaking everything down into tokens, the model can better predict the next token in a sentence.
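To make the idea concrete, here is a minimal sketch of tokenization in Python. It is not how any production tokenizer works (real ones learn their vocabulary from data); the vocabularies, the IDs and the greedy longest-match rule below are all assumptions chosen for illustration.

```python
# Two toy vocabularies. Every entry and ID is made up for illustration.
VOCAB_WITH_WORD = {"monkey": 0, "mon": 1, "key": 2}
VOCAB_PIECES_ONLY = {"mon": 1, "key": 2}


def tokenize(text, vocab):
    """Greedy longest-match tokenization (a simplification).

    At each position, take the longest vocabulary entry that matches
    and emit its integer ID instead of the text itself.
    """
    ids = []
    pos = 0
    while pos < len(text):
        for end in range(len(text), pos, -1):
            piece = text[pos:end]
            if piece in vocab:
                ids.append(vocab[piece])
                pos = end
                break
        else:
            # No vocabulary entry starts at this position.
            raise ValueError(f"no token covers {text[pos:]!r}")
    return ids


print(tokenize("monkey", VOCAB_WITH_WORD))    # [0]     one whole-word token
print(tokenize("monkey", VOCAB_PIECES_ONLY))  # [1, 2]  "mon" + "key"
```

Either way, what reaches the model is a list of integers, not a string of letters.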

LLMs don’t memorize words; they try to understand how these tokens fit together in different ways, making them good at guessing what comes next. In the case of the word “hippopotamus,” the model might see the tokens “hip,” “pop,” “o” and “tamus,” and not know that the word “hippopotamus” is made of the letters “h”, “i”, “p”, “p”, “o”, “p”, “o”, “t”, “a”, “m”, “u”, “s”.
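The gap between the two views can be sketched in a few lines of Python. The four pieces come from the example above; the numeric IDs are hypothetical, since the article does not say which IDs a real model would assign.

```python
# Hypothetical token IDs for the pieces named in the example above.
PIECE_TO_ID = {"hip": 101, "pop": 102, "o": 103, "tamus": 104}

pieces = ["hip", "pop", "o", "tamus"]

# What the model receives: a sequence of integers.
model_input = [PIECE_TO_ID[piece] for piece in pieces]
print(model_input)  # [101, 102, 103, 104]

# Nothing in the number 101 says "h, i, p". Counting letters requires
# going back to the characters, which the ID sequence does not contain.
word = "".join(pieces)
print(word)             # hippopotamus
print(word.count("p"))  # 3
```

The count is trivial once the characters are available; the difficulty is that the model works from the first view, not the second.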

A model architecture that looks directly at individual letters without tokenizing them might not have this problem, but for today’s transformer architectures, that is not computationally feasible.

The way LLMs generate output text matters too: they predict what the next word will be based on the previous input and output tokens. While this works for generating contextually aware, human-like text, it is not suited to simple tasks like counting letters. When asked for the number of “r”s in the word “strawberry”, LLMs are purely predicting the answer based on the structure of the input sentence.
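A deliberately tiny stand-in can illustrate prediction from patterns. The sketch below is a bigram frequency model, which is far simpler than a transformer and is used here only as an analogy; the training sentences are invented for the example.

```python
from collections import Counter, defaultdict


def train_bigrams(tokens):
    """Count, for each token, which tokens followed it in the training text."""
    follow = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        follow[prev][nxt] += 1
    return follow


def predict_next(follow, prev):
    """Return the most frequent follower of `prev`, or None if unseen."""
    counts = follow.get(prev)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Invented training text: "two" follows "has" more often than "three".
corpus = "berry has two r . carry has two r . strawberry has three r .".split()
follow = train_bigrams(corpus)

# The prediction reflects frequency in the training text, not a count
# of the letters in whatever word came before "has".
print(predict_next(follow, "has"))  # two
```

The toy model answers "two" after "has" no matter which word precedes it, because it never counts anything. It only reproduces the most common continuation.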

Here’s a workaround

While LLMs might not be able to “think” or logically reason, they are adept at understanding structured text. A splendid example of structured text is computer code, in any of many programming languages. If we ask ChatGPT to use Python to count the number of “r”s in “strawberry”, it will most likely get the correct answer. When LLMs need to count or perform any other task that requires logical reasoning or arithmetic computation, the broader software can be designed so that its prompts ask the LLM to use a programming language to process the input query.
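As a rough sketch of this workaround, the Python below shows the kind of code an LLM might be asked to write, plus one possible way for the surrounding software to phrase the request. The function names and the prompt wording are assumptions for illustration; no particular LLM API is called.

```python
def count_letter(word, letter):
    """Count occurrences of a single letter in a word, ignoring case."""
    return word.lower().count(letter.lower())


def build_prompt(question):
    """Wrap a user question in an instruction to solve it with code.

    The wording is a hypothetical template, not a tested recipe.
    """
    return (
        "Write and run Python code to answer the question below, "
        "then report the code's output.\n"
        f"Question: {question}"
    )


print(count_letter("strawberry", "r"))    # 3
print(count_letter("mammal", "m"))        # 3
print(count_letter("hippopotamus", "p"))  # 3
print(build_prompt("How many r's are in the word strawberry?"))
```

The counting happens in the Python interpreter, character by character, so tokenization no longer affects the result.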

Conclusion

A simple letter-counting experiment exposes a fundamental limitation of LLMs like ChatGPT and Claude. Despite their impressive capabilities in generating human-like text, writing code and answering any question thrown at them, these AI models cannot yet “think” like a human. The experiment shows the models for what they are: pattern-matching predictive algorithms, not “intelligence” capable of understanding or reasoning. However, knowing in advance which types of prompts work well can alleviate the problem to some extent. As the integration of AI in our lives increases, recognizing its limitations is crucial for responsible usage and realistic expectations of these models.

 Chinmay Jog is a senior machine learning engineer at Pangiam.

How to watch SpaceX’s fifth Starship test flight on Sunday

SpaceX is getting ready to launch its mighty Starship on its fifth test flight, scheduled for Sunday, October 13. With a mostly successful fourth test flight behind it, the Starship has already been to space and returned to Earth mostly intact. This time, SpaceX will be hoping to catch its Super Heavy booster as well as sending the upper stage Starship to space.

The fifth test flight was delayed by licensing issues with the Federal Aviation Administration (FAA), but SpaceX has now confirmed it is targeting 8 a.m. ET (5 a.m. PT) Sunday for its test.

To watch the test, you can tune in to SpaceX’s live stream, which will be shown on X (formerly Twitter):

Watch Starship's fifth flight test https://t.co/LVrCnTv797

— SpaceX (@SpaceX) October 12, 2024

You’ll also be able to watch the broadcast on SpaceX’s website.

The company has described its ambitions for catching the Super Heavy booster in a blog post. It will use a pair of giant mechanical arms, referred to as chopsticks, to try to hold the booster as it comes in to land at SpaceX’s Starbase facility in Texas. This would be the first time a Super Heavy booster has been caught, and a significant step forward in making the Starship reusable. SpaceX has already proven the value of booster reuse with its Falcon 9 rocket, whose boosters frequently land at sea or occasionally on land to be reused.

The Starship is a considerably larger and more powerful vehicle than the Falcon 9, however, and uses a different booster, which makes the catch a difficult task.

“Extensive upgrades ahead of this flight test have been made to hardware and software across Super Heavy, Starship, and the launch and catch tower infrastructure at Starbase,” SpaceX wrote. “SpaceX engineers have spent years preparing and months testing for the booster catch attempt, with technicians pouring tens of thousands of hours into building the infrastructure to maximize our chances for success. We accept no compromises when it comes to ensuring the safety of the public and our team, and the return will only be attempted if conditions are right.”

Live stream coverage of Sunday’s test flight begins at around 7:30 a.m. ET (4:30 a.m. PT), or you can follow updates on the flight at SpaceX’s X account.

Samsung Galaxy A55 now getting October 2024 security update

Samsung launched the Galaxy A55 this year as its latest and most “premium” mid-range smartphone. Following the discontinuation of the Galaxy A7x series, the Galaxy A5x took its place as the king of the segment for the company. Like many Galaxy devices, the A55 boasts long-term software support with regular updates. Samsung is now rolling out the October 2024 update for the Galaxy A55.

October 2024 security update reaching the Galaxy A55, first in Asian countries

Samsung usually sends monthly patches to its Galaxy devices from a certain range onward. As usual, not all updates bring big improvements or impressive new features. There are also patches focused on optimizing the stability and performance of the device as well as fixing vulnerabilities to keep it secure. That’s exactly what the Galaxy A55 is receiving with the October 2024 update.

The latest firmware brings fixes for more than two dozen Android-related vulnerabilities. It also addresses 12 vulnerabilities exclusive to Galaxy devices. Plus, the company is fixing a potential security hole present in phones and wearables powered by certain Exynos chipsets. The update has the build number A556EXXS5AXI4 and is rolling out to users in some Asian countries first.

Once the update is available in your region, you’ll receive a notification to download it. However, you can also check availability manually by going to Settings > Software Update > Download and Install. If all goes well, other regions should start receiving it in a couple of weeks.

The Galaxy A55 may be the most balanced smartphone in the company’s catalog. It’s not as powerful or premium as a flagship, but it offers solid performance and an elegant design. The company made this year’s model more robust by integrating flat metal edges instead of plastic ones. There’s a solid rear camera system made up of 50 MP + 12 MP + 5 MP sensors. Sadly, there’s no telephoto sensor, so you’ll have to step up to the Galaxy S24 FE if you want optical zoom.

Other features include the Exynos 1480 chip, 5,000mAh battery, stereo speakers, and IP67 rating. If you’re looking for an even more affordable yet still capable Galaxy mid-ranger, you can consider the Samsung Galaxy A35.

John Mulaney will host a live variety talk show on Netflix

Comedian and writer John Mulaney will host a live variety talk show on Netflix, the streaming company announced in a post on X. The show may be similar to Mulaney’s Everybody’s in LA, a live talk show that streamed on Netflix for six episodes in May 2024.

Mulaney’s production company will produce and he’ll be the showrunner, but no other details were revealed. However, at an event in LA, Netflix’s chief content officer Bela Bajaria said Everybody’s in LA showed what a weekly live talk/variety show could look like on the service. “[It] was just so bold and original and fresh and then unpredictable,” she said, “And I think it’ll be really fun to get to do a live show with him.”

Netflix has developed a reputation for not giving shows time to develop an audience even if critics love them — with Jeff Goldblum’s Kaos being the latest example. However, Bajaria admitted that viewership for Everybody’s in LA wasn’t huge, so the streamer is clearly willing to deviate from that strategy in some cases. In fact, Netflix has stuck with comedians even when they generate controversy, as its history with Dave Chappelle has shown. That’s possibly because comedy specials and talk shows are dirt cheap to produce compared to scripted series.

WordPress.org’s latest move involves taking control of a WP Engine plugin

WordPress.org has taken over a popular WP Engine plugin in order “to remove commercial upsells and fix a security problem,” WordPress cofounder and Automattic CEO Matt Mullenweg announced today. This “minimal” update, which he labels a fork of the Advanced Custom Fields (ACF) plugin, is now called “Secure Custom Fields.”

It’s not clear what security problem Mullenweg is referring to in the post. He writes that he’s “invoking point 18 of the plugin directory guidelines,” in which the WordPress team reserves several rights, including removing a plugin, or changing it “without developer consent.” Mullenweg explains that the move has to do with WP Engine’s recently filed lawsuit against him and Automattic.

“Similar situations have happened before, but not at this scale. This is a rare and unusual situation brought on by WP Engine’s legal attacks, we do not anticipate this happening for other plugins.”

WP Engine’s ACF team claimed on X that WordPress has never “unilaterally and forcibly” taken a plugin “from its creator without consent.” It later wrote that those who aren’t WP Engine, Flywheel, or ACF Pro customers will need to go to the ACF site and follow steps it published earlier to “perform a 1-time download of the genuine 6.3.8 version” to keep getting updates.

As its name implies, the ACF plugin allows website creators to use custom fields when existing generic ones won’t do — something ACF’s overview of the plugin says is already a native, but “not very user friendly,” feature of WordPress.

The Verge has reached out to Automattic, WordPress.org, and WP Engine for comment.

Update October 12th: Adjusted to add clarity about Mullenweg’s use of the “fork” label.

NYT Strands today — hints, answers and spangram for Sunday, October 13 (game #224)

Strands is the NYT’s latest word game after the likes of Wordle, Spelling Bee and Connections – and it’s great fun. It can be difficult, though, so read on for my Strands hints.

Want more word-based fun? Then check out my Wordle today, NYT Connections today and Quordle today pages for hints and answers for those games.

The best free TV shows on YouTube (October 2024)

YouTube continues to expand its lineup of movies and TV shows every month. The video-sharing platform does offer consumers new releases every week. However, most of these high-profile titles cost money. If you’re looking to save a little bit of dough, explore YouTube’s selection of free movies and television shows.

How can these programs be available at no cost? YouTube runs ads during these presentations, similar to how commercials air on television. However, that’s a fair trade-off to watch content for free. Explore October’s options in the “Movies & TV” tab on the sidebar. Read our guide below if you need to be pushed in the right direction.

Check out the best new shows to stream, the best shows on Netflix, the best movies on Amazon Prime, and the best movies on Disney+.






Copyright © 2024 WordupNews.com