
Technology

AutoToS makes LLM planning fast, accurate and inexpensive




Large language models (LLMs) have shown promise in solving planning and reasoning tasks by searching through possible solutions. However, existing methods can be slow and computationally expensive, and they can return unreliable answers.

Researchers from Cornell University and IBM Research have introduced AutoToS, a new technique that combines the planning power of LLMs with the speed and accuracy of rule-based search algorithms. AutoToS eliminates the need for human intervention and significantly reduces the computational cost of solving planning problems. This makes it a promising technique for LLM applications that must reason over large solution spaces.

There is a growing interest in using LLMs to handle planning problems, and researchers have developed several techniques for this purpose. The more successful techniques, such as Tree of Thoughts, use LLMs as a search algorithm that can validate solutions and propose corrections.


While these approaches have demonstrated impressive results, they face two main challenges. First, they require numerous calls to LLMs, which can be computationally expensive, especially for complex problems with thousands of possible solutions. Second, they do not guarantee that the LLM-based algorithm is complete and sound. Completeness means that if a solution exists, the algorithm will eventually find it; soundness means that any solution the algorithm returns is valid.

Thought of Search (ToS) offers an alternative approach. ToS leverages LLMs to generate code for two key components of search algorithms: the successor function and the goal function. The successor function determines how the search algorithm explores different nodes in the search space, while the goal function checks whether the search algorithm has reached the desired state. These functions can then be used by any offline search algorithm to solve the problem. This approach is much more efficient than keeping the LLM in the loop during the search process.
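
To make the division of labor concrete, here is a minimal sketch of a generic breadth-first search that takes a successor function and a goal function as plain arguments. The toy problem (reach 10 from 1 using "+1" or "*2") and the names bfs, toy_successors and toy_is_goal are illustrative assumptions, not code from the ToS paper.

```python
from collections import deque


def bfs(start, successors, is_goal):
    """Breadth-first search. Returns the shortest list of states from
    start to a goal state, or None if no goal state is reachable.
    States must be hashable and must not be None (None marks the root)."""
    parent = {start: None}      # doubles as the visited set
    frontier = deque([start])
    while frontier:
        state = frontier.popleft()
        if is_goal(state):
            path = []
            while state is not None:    # follow parent links back to start
                path.append(state)
                state = parent[state]
            return path[::-1]
        for nxt in successors(state):
            if nxt not in parent:
                parent[nxt] = state
                frontier.append(nxt)
    return None


# Hypothetical toy problem: reach 10 from 1 using "+1" or "*2".
def toy_successors(n, limit=20):
    # The limit keeps the state space finite so the search always terminates.
    return [m for m in (n + 1, n * 2) if m <= limit]


def toy_is_goal(n):
    return n == 10


print(bfs(1, toy_successors, toy_is_goal))  # [1, 2, 4, 5, 10]
```

The search routine knows nothing about the problem. Swapping in a different pair of functions changes the task without touching the search code, which is why the LLM only needs to be called while the two functions are being written.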

“Historically, in the planning community, these search components were either manually coded for each new problem or produced automatically via translation from a description in a planning language such as PDDL, which in turn was either manually coded or learned from data,” Michael Katz, principal research staff member at IBM Research, told VentureBeat. “We proposed to use the large language models to generate the code for the search components from the textual description of the planning problem.”

The original ToS technique showed impressive progress in addressing the soundness and completeness requirements of search algorithms. However, it required a human expert to provide feedback on the generated code and help the model refine its output. This manual review was a bottleneck that reduced the speed of the algorithm.


Automating ToS

[Figure: AutoToS (source: arXiv)]

“In [ToS], we assumed a human expert in the loop, who could check the code and feedback the model on possible issues with the generated code, to produce a better version of the search components,” Katz said. “We felt that in order to automate the process of solving the planning problems provided in a natural language, the first step must be to take the human out of that loop.”

AutoToS automates the feedback and exception handling process using unit tests and debugging statements, combined with few-shot and chain-of-thought (CoT) prompting techniques.

AutoToS works in multiple steps. First, it provides the LLM with the problem description and prompts it to generate code for the successor and goal functions. Next, it runs unit tests on the goal function and provides feedback to the model if it fails. The model then uses this feedback to correct its code. Once the goal function passes the tests, the algorithm runs a limited breadth-first search to check if the functions are sound and complete. This process is repeated until the generated functions pass all the tests. 

Finally, the validated functions are plugged into a classic search algorithm to perform the full search efficiently.
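
The unit-test feedback step described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: fake_model is a hypothetical stand-in for an LLM call, the test cases are invented, and the limited breadth-first check of soundness and completeness is left out.

```python
def run_goal_tests(is_goal, cases):
    """Run (state, expected) cases and return one feedback message per failure."""
    feedback = []
    for state, expected in cases:
        try:
            actual = is_goal(state)
        except Exception as exc:
            feedback.append(f"is_goal({state!r}) raised {exc!r}")
            continue
        if actual != expected:
            feedback.append(
                f"is_goal({state!r}) returned {actual!r}, expected {expected!r}")
    return feedback


def refine_goal_function(generate, cases, max_rounds=5):
    """Ask generate(feedback) for source code until the goal tests pass.
    Returns the passing function and the number of rounds it took."""
    feedback = []
    for round_no in range(1, max_rounds + 1):
        source = generate(feedback)
        namespace = {}
        try:
            # Generated code is executed directly here for brevity;
            # a real system would run it in a sandbox.
            exec(source, namespace)
            candidate = namespace["is_goal"]
        except Exception as exc:
            feedback = [f"code failed to load: {exc!r}"]
            continue
        feedback = run_goal_tests(candidate, cases)
        if not feedback:
            return candidate, round_no
    raise RuntimeError(f"no passing goal function after {max_rounds} rounds")


# Hypothetical stand-in for the LLM: buggy first draft, fixed after feedback.
def fake_model(feedback):
    if not feedback:
        return "def is_goal(state):\n    return sum(state) == 24\n"
    return "def is_goal(state):\n    return len(state) == 1 and state[0] == 24\n"


# Invented 24 Game cases: a state is the tuple of numbers still available.
CASES = [((24,), True), ((12, 12), False), ((23,), False)]

goal, rounds = refine_goal_function(fake_model, CASES)
print(rounds)  # 2
```

The first draft wrongly accepts the state (12, 12), the failing test becomes the feedback message, and the second draft passes.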

AutoToS in action

The researchers evaluated AutoToS on several planning and reasoning tasks, including BlocksWorld, Mini Crosswords and the 24 Game. The 24 Game is a mathematical puzzle in which you are given four integers and must combine them with basic arithmetic operations into a formula that evaluates to 24. BlocksWorld is a classic AI planning domain where the goal is to rearrange blocks stacked in towers. Mini Crosswords is a simplified crossword puzzle with a 5×5 grid.
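
As an illustration of what a successor function and a goal function might look like for the 24 Game, here is a hand-written sketch. The state encoding (a sorted tuple of the numbers still available) and the helper solvable are assumptions made for this example; this is not the code the models generated in the paper.

```python
from collections import deque
from fractions import Fraction
from itertools import combinations


def successors(state):
    """Return every state reachable by replacing two numbers
    with the result of one arithmetic operation on them."""
    result = set()
    for i, j in combinations(range(len(state)), 2):
        a, b = state[i], state[j]
        rest = [x for k, x in enumerate(state) if k not in (i, j)]
        candidates = {a + b, a * b, a - b, b - a}
        if b != 0:
            candidates.add(a / b)
        if a != 0:
            candidates.add(b / a)
        for value in candidates:
            result.add(tuple(sorted(rest + [value])))
    return result


def is_goal(state):
    """A state is a goal when the only number left is 24."""
    return len(state) == 1 and state[0] == 24


def solvable(numbers):
    """Breadth-first search over states; Fractions keep division exact."""
    start = tuple(sorted(Fraction(n) for n in numbers))
    seen = {start}
    frontier = deque([start])
    while frontier:
        state = frontier.popleft()
        if is_goal(state):
            return True
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return False


print(solvable([4, 7, 8, 8]))  # True: (7 - 8 / 8) * 4 = 24
print(solvable([1, 1, 1, 1]))  # False: four 1s cannot exceed 4
```

Exact fractions matter here: 3, 3, 8, 8 is solvable only through 8 / (3 - 8 / 3), which floating-point arithmetic could miss because of rounding.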


They tested various LLMs from different families, including GPT-4o, Llama 3.1 and DeepSeek Coder. They used both the largest and smallest models from each family to evaluate the impact of model size on performance.

Their findings showed that with AutoToS, all models were able to identify and correct errors in their code when given feedback. The larger models generally produced correct goal functions without feedback and required only a few iterations to refine the successor function. Interestingly, GPT-4o-mini performed surprisingly well in terms of accuracy despite its small size.

“With just a few calls to the language model, we demonstrate that we can obtain the search components without any direct human-in-the-loop feedback, ensuring soundness, completeness, and nearly 100% accuracy across all models and all domains,” the researchers write.

Compared to other LLM-based planning approaches, ToS drastically reduces the number of calls to the LLM. For example, for the 24 Game dataset, which contains 1,362 puzzles, the previous approach would call GPT-4 approximately 100,000 times. AutoToS, on the other hand, needed only 2.2 calls on average to generate sound search components.


“With these components, we can use the standard BFS algorithm to solve all the 1,362 games together in under 2 seconds and get 100% accuracy, neither of which is achievable by the previous approaches,” Katz said.

AutoToS for enterprise applications

AutoToS can have direct implications for enterprise applications that require planning-based solutions. It cuts the cost of using LLMs and reduces the reliance on manual labor, enabling experts to focus on high-level planning and goal specification.

“We hope that AutoToS can help with both the development and deployment of planning-based solutions,” Katz said. “It uses the language models where needed—to come up with verifiable search components, speeding up the development process and bypassing the unnecessary involvement of these models in the deployment, avoiding the many issues with deploying large language models.”

ToS and AutoToS are examples of neuro-symbolic AI, a hybrid approach that combines the strengths of deep learning and rule-based systems to tackle complex problems. Neuro-symbolic AI is gaining traction as a promising direction for addressing some of the limitations of current AI systems.


“I don’t think that there is any doubt about the role of hybrid systems in the future of AI,” Harsha Kokel, research scientist at IBM, told VentureBeat. “The current language models can be viewed as hybrid systems since they perform a search to obtain the next tokens.”

While ToS and AutoToS show great promise, there is still room for further exploration.

“It is exciting to see how the landscape of planning in natural language evolves and how LLMs improve the integration of planning tools in decision-making workflows, opening up opportunities for intelligent agents of the future,” Kokel and Katz said. “We are interested in general questions of how the world knowledge of LLMs can help improve planning and acting in real-world environments.”



Servers computers

Why you should or shouldn't buy used servers!




Bob Pellerin (CTOBOB) explains what to look for in a used server. Reliable machines can be had. Keep in mind that the more recent a server is, the more likely it is to run newer operating systems.

The parts most likely to fail on servers are:
– Drives
– Power supplies
– Fans


Technology

Spotify’s AI Playlist is now rolling out to more Premium Subscribers – here’s who’s getting it next


If you’re a Spotify Premium subscriber eager to try the music streaming service’s AI chops at building playlists from prompts, we have good news. After rolling out AI Playlist to Premium subscribers in the UK and Australia earlier in 2024, Spotify is now expanding the feature to the United States, Canada, Ireland, and New Zealand, in English.

So, you can now use Spotify’s AI to create a playlist based on a prompt that can be as short as a single word. You will need to be a Premium member, though, which in the United States starts at $11.99 a month for an individual, $16.99 per month for Duo (aka two accounts), or $19.99 a month for a family subscription.


Technology

Sharp rise in problematic teenage social media use, study says


A major international study suggests there has been a sharp rise in what it calls “problematic” social media use among young people since the pandemic.

Researchers came to the conclusion after surveying almost 280,000 children aged 11, 13 and 15 across 44 countries.

The Health Behaviour In School-aged Children (HBSC) study found, on average, 11% of respondents engaged with social media in a problematic way in 2022 – compared to 7% in 2018.

England, Scotland and Wales all recorded figures above that average.

The report’s authors say the findings “raise urgent concerns about the impact of digital technology on the mental health and well-being of Europe’s youth”.


They say more action is needed to “promote healthy online behaviours.”

“Problematic use is most common amongst 13-year-olds – it sort of peaks in that early adolescence phase and girls are more likely to report problematic social media use than boys,” said the study’s international co-ordinator Dr Jo Inchley, from the University of Glasgow.

She said the research also revealed how much time young people spend online.

“Across the study as a whole, we found just over a third of adolescents report continuous online contact with friends and others,” she said.


“That means almost all the time throughout the day they are connected online to friends and other people.”

The report does not conclude all that time spent online is detrimental.

Instead, teenagers who were heavy, but not problematic, users of social media reported stronger peer support and social connections.

But for the “problematic” minority it found social media use was associated with addiction-like symptoms including:


  • neglect of other activities in favour of spending time on social media
  • frequent arguments about use
  • lying about how much time is spent online
  • an inability to control social media use and experiencing withdrawal

It also highlights concerns about the proportion of teenagers considered to be at risk of “problematic gaming” – something it suggests applies to boys more than girls.

That designation applied to 15% of teenagers in England – the second highest proportion across all countries studied.

The average proportion of boys who played daily was 46%, but this figure stood at 52% in England and 57% in Scotland.

And 13-year-old boys in England reported the highest rate of long gaming sessions, with 45% of boys of that age indicating that they played for at least four hours on gaming days.

Positive and negative consequences


The study has been published by the European arm of the World Health Organisation (WHO).

Dr Hans Henri P Kluge, the WHO’s regional director for Europe, said the findings made clear social media could have both positive and negative consequences for young people.

He said there needed to be more “digital literacy education” to help young people develop a healthy approach to being online, and governments, health authorities, teachers and parents all had to play their part.

“It’s clear we need immediate and sustained action to help adolescents turn the tide on potentially damaging social media use, which has been shown to lead to depression, bullying, anxiety, and poor academic performance,” he said.


Ben Carter, Professor of Medical Statistics at the Institute of Psychiatry, Psychology & Neuroscience, described the report as a “useful snapshot of the evidence”.

But he pointed out it was difficult to agree on a definition of what “problematic social media” was, making gathering data on it challenging.

Nonetheless, he said the study was a “valid contribution to the evidence base”.


Servers computers

Dell PowerEdge 4220 Server Rack – 42U Data Center Enclosure




Dell 4220 42U PowerEdge Enclosure. Complete Server Rack.
For more info call 877-307-7225.
Website: www.global1resources.com.
eBay Store: global1resources.


Technology

Watch a robot peel a squash with human-like dexterity



A robot that peels vegetables in the same way that people do demonstrates a level of dexterity that could help move delicate objects along a manufacturing line.

Prototype robots are often tasked with peeling vegetables to test their ability to carefully handle awkward objects. But these challenges are usually simplified, such as the vegetable being fixed in place, or only testing single fruits or vegetables, like peeling a banana.

Now, Pulkit Agrawal at the Massachusetts Institute of Technology and his colleagues have developed a robotic system that can rotate different types of fruit and vegetables using the fingers of one hand, while the other arm does the peeling.


“These additional steps of doing rotation are something which is very straightforward to humans, we don’t even think about it,” says Agrawal. “But for a robot, this becomes challenging.”

First, the robot was taught in a simulated environment, receiving an algorithmic reward for a proper rotation and a punishment if it rotated the wrong way or not at all.

Next, the robot was tested under real-world conditions by tasking it with peeling fruits and vegetables such as a pumpkin, radish and papaya. It used one hand to rotate the produce, using feedback from touch sensors, while a human-controlled robot arm did the peeling.

The robot can hold and rotate a vegetable in one hand, while the other arm peels

Tao Chen, Eric Cousineau, Naveen Kuppuswamy, Pulkit Agrawal


The algorithm struggles with smaller, more awkwardly shaped vegetables, such as ginger, says Agrawal, but the team hopes to expand its capabilities.

Grasping and reorienting objects are challenging tasks for any robot, and the speed and firm grip of this one are impressive, says Jonathan Aitken at the University of Sheffield in the UK. It could be useful in factories where objects have to be moved from one machine to another with the correct orientation, he says.

However, it is unlikely to be used in an industrial setting for peeling vegetables because other approaches already exist, such as automatic potato peelers, says Aitken.


Technology

Fierce new Monster Hunter Wilds trailer reveals release date


Capcom has treated us to another long look at Monster Hunter Wilds, including that all-important release date. The hunt is on beginning February 28, 2025, on PlayStation 5, Xbox Series X/S, and PC.

The latest trailer for the next entry in the massively popular Monster Hunter franchise showed off a more personal side to the story, opening with a child fleeing the wrath of the White Wraith and introducing us to many of the characters we can look forward to bonding with while slaying giant beasts. The adorable Palicos are back in full force, helping with cooking and on the battlefield as they have in prior games. In one instance a hunter was knocked out and saved by a Palico dropping a health potion on them.


Speaking of monsters, a number of impressive beasts appeared here, though all of them have been shown in prior trailers. They include a massive water-dwelling creature that leaps and dives through the water, and a large hairy beast that the hunter crushes by using a grappling hook to bring down debris from the environment. However, the star of the show remains the White Wraith, Arkveld. This is the game’s premier monster and the “big bad” that the plot will center on hunting. It is described as a species of monster that was believed to be extinct, yet has reappeared and is wreaking havoc on the world and its people.

Weather has been a major focus for Monster Hunter Wilds, and this trailer shows a few more instances of how the landscape and ecology can shift based on the current weather. Minor examples show how rain can cause a river to become a flood, while sandstorms can cut visibility down to nearly nothing and cause deadly lightning strikes.


Monster Hunter Wilds will come out on February 28, 2025, on PlayStation 5, Xbox Series X/S, and PC. Preorders are live right now with a special Layered Armor Guild Knight Set and Hope Charm Talisman offered as bonuses.





