Ztoog
    This AI Paper from CMU and Apple Unveils WRAP: A Game-Changer for Pre-training Language Models with Synthetic Data


    Large Language Models (LLMs) have attracted a great deal of attention and popularity in the Artificial Intelligence (AI) community in recent months. These models have demonstrated strong capabilities in tasks including text summarization, question answering, code completion, and content generation.

    LLMs are frequently trained on inadequate web-scraped data. Most of the time, this data is noisy, unstructured, and not necessarily expressed clearly. Current scaling laws indicate that as model size increases, compute and data quantity should increase proportionately, and following them is a challenge.

    There are two main limitations. First, pre-training involves significant computational cost and time. Second, there is the looming problem of a scarcity of high-quality data available on the Internet. In recent research, a team of researchers from Apple and Carnegie Mellon University has addressed these issues by introducing Web Rephrase Augmented Pre-training (WRAP).

    WRAP is an innovative method that uses an existing, instruction-tuned LLM to paraphrase web pages into particular styles, such as mimicking the tone of Wikipedia or converting text into a question-answer format. The main goal of WRAP is to improve LLM pre-training by combining real data with synthetically rephrased data.
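The rephrase-and-mix idea can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the prompt wording, the `rephrase` callable (a stand-in for a call to an instruction-tuned LLM), the one-rephrase-per-document mixing ratio, and all function names are assumptions made for this example.

```python
import random
from typing import Callable, Iterable, List, Sequence

# Hypothetical style prompts. The wording is illustrative and is NOT
# taken from the paper.
STYLE_PROMPTS = {
    "wikipedia": "Rewrite the following text in the style of a Wikipedia article:\n\n{text}",
    "qa": "Convert the following text into a question-answer format:\n\n{text}",
}


def build_prompt(text: str, style: str) -> str:
    """Fill the chosen style template with the source document."""
    return STYLE_PROMPTS[style].format(text=text)


def make_mixed_corpus(
    docs: Iterable[str],
    rephrase: Callable[[str], str],
    styles: Sequence[str] = ("wikipedia", "qa"),
    seed: int = 0,
) -> List[str]:
    """Return the real documents plus one rephrased copy of each, shuffled.

    `rephrase` is an assumed interface: it takes a prompt and returns the
    model's rewritten text. The 1:1 real-to-synthetic ratio is an
    assumption of this sketch.
    """
    rng = random.Random(seed)
    corpus: List[str] = []
    for doc in docs:
        corpus.append(doc)  # keep the real web text
        style = rng.choice(list(styles))  # pick a target style
        corpus.append(rephrase(build_prompt(doc, style)))  # add the rephrase
    rng.shuffle(corpus)
    return corpus


def fake_rephrase(prompt: str) -> str:
    """Placeholder for an LLM call, so the sketch runs without a model."""
    return "[rephrased] " + prompt.splitlines()[-1]
```

Calling `make_mixed_corpus(docs, fake_rephrase)` on a list of web documents returns a shuffled list containing every original document and one stylistic rewrite of each. In practice `fake_rephrase` would be replaced by a real model call.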

    The main features of WRAP are as follows:

    1. Pre-training Efficiency: Applying WRAP to the noisy C4 dataset speeds up pre-training significantly, by roughly three times. This efficiency is critical for reducing the high cost and time commitment usually associated with LLM training.
    2. Enhancement of Model Performance: WRAP makes the model perform better within the same computational budget. Across different subsets of the Pile, a large-scale dataset used for training and evaluating LLMs, it reduces perplexity by more than 10%. It also improves zero-shot question-answering accuracy by over 2% across 13 different tasks.
    3. Rephrasing Web Documents: WRAP uses a medium-sized LLM to paraphrase web documents into multiple styles. This differs from generating entirely new data because it improves existing content while preserving the quality and diversity of the original information.
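To make the perplexity figure concrete, the sketch below computes perplexity from per-token log-probabilities and the relative reduction between two models. The formula is the standard definition (the exponential of the negative mean log-probability); the function names are hypothetical, and the numbers in the usage note are made up for illustration, not results from the paper.

```python
import math
from typing import Sequence


def perplexity(token_log_probs: Sequence[float]) -> float:
    """Perplexity = exp(-mean natural-log probability per token)."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def relative_reduction(baseline: float, improved: float) -> float:
    """Fractional drop from the baseline value (0.10 means 10% lower)."""
    return (baseline - improved) / baseline
```

For example, if a baseline model assigns each held-out token a log-probability of -3.0 and a second model assigns -2.8, then `relative_reduction(perplexity([-3.0] * 4), perplexity([-2.8] * 4))` equals `1 - exp(-0.2)`, a reduction of about 18%. Lower perplexity means the model finds the held-out text less surprising.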

    There are two main benefits to the synthetic data produced by WRAP. First, it includes a range of styles that reflect the diversity of language used in downstream applications. With this diversity, the LLM is better prepared for a wider variety of real-world scenarios. Second, the rephrased synthetic data is of higher quality than the raw web-scraped data. This improvement comes from language that is more ordered and cohesive, which promotes more efficient model learning.

    In conclusion, WRAP is a significant advance in the field of LLM pre-training. By using higher-quality synthetic data in a variety of styles, WRAP not only speeds up training but also improves the overall performance of LLMs. Given the abundance of low-quality web data and the resource-intensive nature of conventional LLM training approaches, this method offers a possible way forward.


    Check out the Paper. All credit for this research goes to the researchers of this project.



    Tanya Malhotra is a final-year undergraduate at the University of Petroleum & Energy Studies, Dehradun, pursuing a BTech in Computer Science Engineering with a specialization in Artificial Intelligence and Machine Learning.
    She is a Data Science enthusiast with strong analytical and critical thinking skills and a keen interest in acquiring new skills, leading groups, and managing work in an organized manner.



    © 2025 Ztoog.