Technique improves the reasoning capabilities of large language models | Ztoog


Large language models like those that power ChatGPT have shown impressive performance on tasks like drafting legal briefs, analyzing the sentiment of customer reviews, or translating documents into different languages.

These machine-learning models typically use only natural language to process information and answer queries, which can make it difficult for them to perform tasks that require numerical or symbolic reasoning.

For instance, a large language model might be able to memorize and recite a list of recent U.S. presidents and their birthdays, but that same model could fail if asked the question "Which U.S. presidents elected after 1950 were born on a Wednesday?" (The answer is Jimmy Carter.)
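The date arithmetic that trips up a purely text-based model is trivial for a few lines of Python. The sketch below uses a small illustrative subset of presidential birthdays (the dates are real, but the list is not the full one implied by the question) and checks the weekday directly:

```python
from datetime import date

# Small illustrative subset of presidential birthdays
birthdays = {
    "John F. Kennedy": date(1917, 5, 29),  # elected 1960
    "Jimmy Carter": date(1924, 10, 1),     # elected 1976
    "Ronald Reagan": date(1911, 2, 6),     # elected 1980
}

# date.weekday() numbers Monday as 0, so Wednesday is 2
born_on_wednesday = [name for name, d in birthdays.items() if d.weekday() == 2]
print(born_on_wednesday)
```

Run on this subset, the filter returns only Jimmy Carter, matching the answer given in the article.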

Researchers from MIT and elsewhere have proposed a new technique that enables large language models to solve natural language, math and data analysis, and symbolic reasoning tasks by generating programs.

Their approach, called natural language embedded programs (NLEPs), involves prompting a language model to create and execute a Python program to solve a user's query, and then output the solution as natural language.

They found that NLEPs enabled large language models to achieve higher accuracy on a wide range of reasoning tasks. The approach is also generalizable, which means one NLEP prompt can be reused for multiple tasks.

NLEPs also improve transparency, since a user could check the program to see exactly how the model reasoned about the query and fix the program if the model gave a wrong answer.

“We want AI to perform complex reasoning in a way that is transparent and trustworthy. There is still a long way to go, but we have shown that combining the capabilities of programming and natural language in large language models is a very good potential first step toward a future where people can fully understand and trust what is going on inside their AI model,” says Hongyin Luo PhD ’22, an MIT postdoc and co-lead author of a paper on NLEPs.

Luo is joined on the paper by co-lead authors Tianhua Zhang, a graduate student at the Chinese University of Hong Kong, and Jiaxin Ge, an undergraduate at Peking University; Yoon Kim, an assistant professor in MIT’s Department of Electrical Engineering and Computer Science and a member of the Computer Science and Artificial Intelligence Laboratory (CSAIL); senior author James Glass, senior research scientist and head of the Spoken Language Systems Group in CSAIL; and others. The research will be presented at the Annual Conference of the North American Chapter of the Association for Computational Linguistics.

Problem-solving with programs

Many popular large language models work by predicting the next word, or token, given some natural language input. While models like GPT-4 can be used to write programs, they embed those programs within natural language, which can lead to errors in the program's reasoning or results.

With NLEPs, the MIT researchers took the opposite approach. They prompt the model to generate a step-by-step program entirely in Python code, and then embed the necessary natural language within the program.

An NLEP is a problem-solving template with four steps. First, the model calls the necessary packages, or functions, it will need to solve the task. Step two involves importing natural language representations of the knowledge the task requires (like a list of U.S. presidents' birthdays). For step three, the model implements a function that calculates the answer. And for the final step, the model outputs the result as a line of natural language, with an automatic data visualization if needed.
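A minimal sketch of what such a generated program might look like, using a simpler question ("Which months lack the letter 'r'?") so the four steps stay visible. The structure follows the template described above; the specific code and question are illustrative, not taken from the paper:

```python
# Step 1: call the necessary packages
import calendar

# Step 2: import a natural language representation of the required knowledge
month_names = [calendar.month_name[i] for i in range(1, 13)]

# Step 3: implement a function that calculates the answer
def months_without_r(names):
    return [m for m in names if "r" not in m.lower()]

# Step 4: output the result as a line of natural language
answer = months_without_r(month_names)
print(f"The months without an 'r' are: {', '.join(answer)}.")
```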

    “It is like a digital calculator that always gives you the correct computation result as long as the program is correct,” Luo says.

The user can easily inspect the program and fix any errors in the code directly, rather than needing to rerun the entire model to troubleshoot.

The approach also offers greater efficiency than some other methods. If a user has many similar questions, they can generate one core program and then swap certain variables without needing to run the model repeatedly.
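For example, the core function from a generated program could be kept and re-run with different variables, with no further model calls. The birthday data here is a hypothetical two-entry subset used only for illustration:

```python
from datetime import date

# Hypothetical two-entry subset of the birthday data
birthdays = {
    "Jimmy Carter": date(1924, 10, 1),
    "Ronald Reagan": date(1911, 2, 6),
}

def presidents_born_on(weekday_index):
    # One core program; only the weekday variable changes between queries
    return [name for name, d in birthdays.items() if d.weekday() == weekday_index]

print(presidents_born_on(2))  # Wednesday
print(presidents_born_on(0))  # Monday
```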

To prompt the model to generate an NLEP, the researchers give it an overall instruction to write a Python program, provide two NLEP examples (one involving math and one involving natural language), and one test question.

“Usually, when people do this kind of few-shot prompting, they still have to design prompts for every task. We found that we can have one prompt for many tasks because it is not a prompt that teaches LLMs to solve one problem, but a prompt that teaches LLMs to solve many problems by writing a program,” says Luo.
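Assembled concretely, such a reusable prompt might look like the sketch below. The instruction wording and the two worked examples are assumptions for illustration, not the exact prompt from the paper:

```python
# Fixed parts of the prompt, reused across tasks (wording is illustrative)
instruction = (
    "Write a step-by-step Python program that answers the question. "
    "Import the necessary packages, define the required knowledge, "
    "implement a function that computes the answer, and print the result "
    "as a line of natural language."
)
math_example = "Q: What is 17% of 240?\nprint(f'17% of 240 is {0.17 * 240}.')"
language_example = (
    "Q: Which months lack the letter 'r'?\n"
    "months = ['January', ..., 'December']  # knowledge as data\n"
    "print([m for m in months if 'r' not in m.lower()])"
)

def build_nlep_prompt(test_question):
    # Only the test question changes between tasks
    return "\n\n".join([instruction, math_example, language_example,
                        f"Q: {test_question}"])

prompt = build_nlep_prompt(
    "Which U.S. presidents elected after 1950 were born on a Wednesday?")
print(prompt)
```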

“Having language models reason with code unlocks many opportunities for tool use, output validation, more structured understanding into model’s capabilities and way of thinking, and more,” says Leonid Karlinsky, principal scientist at the MIT-IBM Watson AI Lab.

“No magic here”

NLEPs achieved greater than 90 percent accuracy when prompting GPT-4 to solve a range of symbolic reasoning tasks, like tracking shuffled objects or playing a game of 24, as well as instruction-following and text classification tasks. The researchers found that NLEPs even exhibited 30 percent greater accuracy than task-specific prompting methods. The method also showed improvements over open-source LLMs.

Along with boosting the accuracy of large language models, NLEPs could also improve data privacy. Since NLEP programs run locally, sensitive user data do not need to be sent to a company like OpenAI or Google to be processed by a model.

In addition, NLEPs can enable small language models to perform better without the need to retrain a model for a certain task, which can be a costly process.

“There is no magic here. We do not have a more expensive or fancy language model. All we do is use program generation instead of natural language generation, and we can make it perform significantly better,” Luo says.

However, an NLEP relies on the program-generation capability of the model, so the technique does not work as well for smaller models that have been trained on limited datasets. In the future, the researchers plan to study methods that could make smaller language models generate more effective NLEPs. They also want to investigate the impact of prompt variations on NLEPs to enhance the robustness of the model's reasoning processes.

This research was supported, in part, by the Center for Perceptual and Interactive Intelligence of Hong Kong.
