    Stanford Researchers Introduce SequenceMatch: Training LLMs With An Imitation Learning Loss


    Autoregressive models are a class of statistical models based on the intuition that a variable's current value largely depends on its past values. In other words, the model predicts the future value of a variable by regressing it on its past values. Among the best-known examples of autoregressive models is the GPT family, particularly GPT-3 and its variants, which are built on the idea of predicting the next word in a sequence given the previous words. Trained in this autoregressive manner on a large text corpus, GPT learns to capture the statistical patterns, dependencies, and semantic relationships in language, which enables it to generate contextually relevant text from an input prompt. However, earlier research has shown that smaller models, or models tuned to have less randomness or variability (i.e., lower generation temperatures), tend to generate repetitive or erroneous outputs. Moreover, in certain scenarios these models use their own outputs as inputs, which often leads to compounding errors that quickly take the model out of its intended distribution.
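As a rough illustration of the autoregressive loop and the effect of generation temperature described above, here is a minimal sketch. The vocabulary, the `LOGITS` table, and the function names are all hypothetical stand-ins for a real language model; the toy "model" conditions only on the previous token.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn logits into probabilities; a lower temperature sharpens the distribution."""
    scaled = [value / temperature for value in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical toy model: next-token logits depend only on the previous token.
VOCAB = ["the", "cat", "sat", "."]
LOGITS = {
    "the": [0.1, 2.0, 0.5, 0.1],
    "cat": [0.2, 0.1, 2.0, 0.5],
    "sat": [1.0, 0.2, 0.1, 2.0],
    ".":   [2.0, 0.5, 0.2, 0.1],
}

def generate(start, steps, temperature, rng):
    """Sample tokens one at a time, feeding each output back in as the next input."""
    sequence = [start]
    for _ in range(steps):
        probs = softmax(LOGITS[sequence[-1]], temperature)
        sequence.append(rng.choices(VOCAB, weights=probs, k=1)[0])
    return sequence

if __name__ == "__main__":
    rng = random.Random(0)
    print(generate("the", 8, temperature=1.0, rng=rng))   # more varied
    print(generate("the", 8, temperature=0.01, rng=rng))  # nearly deterministic, cycles
```

At a very low temperature the toy model always picks its single most likely continuation, so the output falls into a repeating cycle. This loosely mirrors the repetitive behavior the paragraph attributes to low-temperature generation.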

    To overcome these challenges, a team of researchers from Stanford carried out initial studies and identified two main obstacles that prevent autoregressive models trained with maximum likelihood estimation (MLE) from producing coherent sequences at evaluation time. The first issue lies in the divergence measure used to assess the disparity between the model and the data distribution. Because MLE does not consider out-of-distribution (OOD) sequences, the model's behavior on such sequences cannot be controlled. To tackle this, the researchers propose minimizing the χ²-divergence between a mixture of actual data and the autoregressively generated sequences, which has shown superior performance compared to MLE. The second challenge arises when the model produces an OOD token with no suitable continuation that is aligned with the data distribution. To address this, the researchers introduce a <backspace> action in the generation process, allowing the model to erase the previous token and correct any errors it may have made.
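To make the first obstacle more concrete, the sketch below compares the per-outcome contributions to the KL divergence (the quantity MLE minimizes) and to a χ²-divergence on a tiny discrete example. It uses the standard textbook definitions, KL(p‖q) = Σ p·log(p/q) and χ²(p‖q) = Σ (p − q)²/q, which are an assumption here and not necessarily the exact objective used in the paper. The distributions `data` and `model` are made up for illustration.

```python
import math

def kl_terms(p, q):
    """Per-outcome contributions to KL(p || q); outcomes with p = 0 contribute nothing."""
    return [pi * math.log(pi / qi) if pi > 0 else 0.0 for pi, qi in zip(p, q)]

def chi2_terms(p, q):
    """Per-outcome contributions to the Pearson chi-squared divergence sum((p - q)^2 / q)."""
    return [(pi - qi) ** 2 / qi for pi, qi in zip(p, q)]

# Hypothetical example: the third outcome never occurs in the data (it is OOD),
# yet the model assigns it 20% probability.
data = [0.5, 0.5, 0.0]
model = [0.4, 0.4, 0.2]

kl = sum(kl_terms(data, model))
chi2 = sum(chi2_terms(data, model))

if __name__ == "__main__":
    print("KL terms:  ", kl_terms(data, model))
    print("chi2 terms:", chi2_terms(data, model))
    print("KL =", kl, " chi2 =", chi2)
```

In this toy case the OOD outcome adds nothing to the KL sum, while it accounts for most of the χ² value. That is one informal way to read the claim that MLE leaves behavior on OOD sequences uncontrolled.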

    Drawing on these findings, the Stanford researchers have come up with a novel method called SequenceMatch, which enables the training of autoregressive models against different divergence measures while adding a <backspace> action that allows the model to correct errors. The researchers reformulated sequence generation as a reinforcement learning problem which, in simple terms, can be summarized as choosing the next action (in this case, generating the next token) out of all possible choices for a given state (i.e., a partial sequence). By employing the latest developments in non-adversarial imitation learning, a framework within the field of reinforcement learning, the researchers were able to reduce the divergence between the occupancy measures of a trained model and the distribution of the actual data. Moreover, to further minimize compounding error in sequence generation, the autoregressive model was trained with a <backspace> action, as opposed to plain MLE, to facilitate backtracking by allowing the model to delete tokens. This fully supervised loss technique for language modeling, SequenceMatch, can be used as an additional step to fine-tune pre-trained models.
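The state and action view above can be sketched in a few lines: the state is the partial sequence, and each action either appends a token or, with `<backspace>`, removes the most recent one. This is only an illustration of the transition rule, not the authors' training code. The function name `apply_actions` is hypothetical, and treating a backspace on an empty sequence as a no-op is an assumption.

```python
BACKSPACE = "<backspace>"

def apply_actions(actions):
    """Replay a list of actions and return the resulting token sequence (the final state)."""
    state = []
    for action in actions:
        if action == BACKSPACE:
            # Assumption: a backspace on an empty sequence does nothing.
            if state:
                state.pop()
        else:
            state.append(action)
    return state

if __name__ == "__main__":
    # The model emits a wrong token ("dog"), then backtracks and continues.
    trajectory = ["the", "cat", "dog", BACKSPACE, "sat", "."]
    print(apply_actions(trajectory))
```

The final sequence contains no trace of the erased token, so a model that learns when to emit `<backspace>` can recover from a bad token without continuing from an out-of-distribution prefix.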


    The researchers carried out several experimental evaluations to compare the performance of GPT-2 based models fine-tuned with SequenceMatch against MLE-trained models. Using the MAUVE score as the metric, they found that models fine-tuned with SequenceMatch generated text closer to the dataset and appeared more fluent and error-free than MLE-trained models. The team also highlighted a limitation of their approach: it requires more computational resources and time when generating lengthy texts. As for future work, the researchers are focusing on studying how different divergence measures affect the quality of the generated sequences.




    Khushboo Gupta is a consulting intern at MarktechPost. She is currently pursuing her B.Tech at the Indian Institute of Technology (IIT), Goa. She is passionate about the fields of Machine Learning, Natural Language Processing and Web Development. She enjoys learning more about the technical field by participating in several challenges.

