    Meet LoftQ: LoRA-Fine-Tuning-Aware Quantization for Large Language Models


    The introduction of Pre-trained Language Models (PLMs) has marked a transformative shift in the field of Natural Language Processing. They have demonstrated exceptional proficiency across a variety of language tasks, including Natural Language Understanding (NLU) and Natural Language Generation (NLG). These models typically incorporate millions or even billions of parameters, and their considerable computational and memory requirements pose significant challenges, as widely acknowledged by the research community.

    In this paper, the authors introduce a novel quantization framework known as LoRA-Fine-Tuning-aware Quantization (LoftQ). The framework is specifically tailored for pre-trained models that require both quantization and LoRA fine-tuning: it combines low-rank approximation with quantization to jointly approximate the original high-precision pre-trained weights.

    Figure: QLoRA performance at different bit levels. Left: QLoRA initialization of LLAMA-2-13b on WikiText-2. Right: QLoRA applied to LLAMA-2-13b on the WikiText-2 language-modeling task. Lower perplexity indicates better performance.
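    In concrete terms, LoftQ alternates between two steps: quantize the part of the weight not captured by the current low-rank term, then refresh the low-rank factors from a truncated SVD of the quantization residual. The snippet below is a minimal sketch of that alternating scheme, not the authors' implementation; the round-to-nearest `fake_quant` stand-in and all shapes are illustrative, and any quantization function (uniform, NF4, ...) could take its place.

    ```python
    import torch

    def fake_quant(x: torch.Tensor, num_bits: int = 4) -> torch.Tensor:
        """Stand-in quantize -> dequantize round trip (symmetric uniform)."""
        qmax = 2 ** (num_bits - 1) - 1
        scale = x.abs().max() / qmax
        return torch.clamp(torch.round(x / scale), -qmax - 1, qmax) * scale

    def loftq_init(W: torch.Tensor, rank: int, num_iters: int = 5):
        """Alternate quantization and SVD so that Q + A @ B.T ≈ W."""
        A = torch.zeros(W.shape[0], rank)
        B = torch.zeros(W.shape[1], rank)
        for _ in range(num_iters):
            # Quantize what the low-rank term does not already capture.
            Q = fake_quant(W - A @ B.T)
            # Best rank-r fit to the quantization residual via truncated SVD.
            U, S, Vh = torch.linalg.svd(W - Q, full_matrices=False)
            A = U[:, :rank] * S[:rank]
            B = Vh[:rank, :].T
        return Q, A, B

    # Usage: Q, A, B = loftq_init(torch.randn(1024, 1024), rank=16)
    ```

    The returned Q serves as the quantized backbone and (A, B) as the LoRA initialization, so fine-tuning starts from a close approximation of W rather than from a zero adapter.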

    Quantization methods. We apply two quantization methods to show that LoftQ is compatible with different quantization functions; a rough sketch of both follows this list:

    • Uniform quantization is a classic quantization method. It uniformly divides a continuous interval into 2^N categories and stores a local maximum absolute value for dequantization.

    • NF4 and its 2-bit variant NF2 are the quantization methods used in QLoRA. They assume that the high-precision values are drawn from a Gaussian distribution and map them to discrete slots of equal probability.
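    Here is a hedged illustration of both schemes (simplified per-tensor versions, not the block-wise QLoRA kernels; function names and defaults are ours):

    ```python
    import torch

    def uniform_quantize(w: torch.Tensor, num_bits: int = 4):
        """Uniformly divide [-absmax, absmax] into 2^N levels; store absmax."""
        levels = 2 ** num_bits
        absmax = w.abs().max()                    # local max absolute value
        step = 2 * absmax / (levels - 1)
        codes = torch.round((w + absmax) / step)  # integer codes in [0, 2^N - 1]
        return codes, absmax

    def uniform_dequantize(codes: torch.Tensor, absmax: torch.Tensor,
                           num_bits: int = 4) -> torch.Tensor:
        step = 2 * absmax / (2 ** num_bits - 1)
        return codes * step - absmax

    def nf_quantize(w: torch.Tensor, num_bits: int = 2):
        """Snap normalized weights to equal-probability Gaussian quantiles."""
        levels = 2 ** num_bits
        # Midpoints of 2^N equal-probability bins of a standard Gaussian.
        probs = (torch.arange(levels, dtype=torch.float32) + 0.5) / levels
        table = torch.distributions.Normal(0.0, 1.0).icdf(probs)
        table = table / table.abs().max()         # normalize codebook to [-1, 1]
        absmax = w.abs().max()
        codes = ((w / absmax).unsqueeze(-1) - table).abs().argmin(dim=-1)
        return codes, absmax, table               # dequantize: table[codes] * absmax
    ```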

    We perform 2-bit and 4-bit quantization on all models, achieving compression ratios of 25-30% and 15-20% at the 4-bit and 2-bit levels, respectively. All experiments are conducted on NVIDIA A100 GPUs.
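    Those numbers line up with a back-of-envelope estimate: with 16-bit originals, pure 4-bit storage is 4/16 = 25% and pure 2-bit storage is 12.5%, and the reported ranges sit slightly higher because the per-block quantization scales and the low-rank factors A and B must also be stored. A sketch, with illustrative block size and rank:

    ```python
    def storage_ratio(bits: int, rank: int, dim: int = 4096,
                      block: int = 64, orig_bits: int = 16) -> float:
        base = bits / orig_bits   # quantized weight payload
        scales = 1.0 / block      # one 16-bit scale per block of 16-bit weights
        lora = 2.0 * rank / dim   # 16-bit A and B factors for a dim x dim layer
        return base + scales + lora

    print(f"{storage_ratio(4, rank=16):.3f}")  # ~0.273, in the 25-30% range
    print(f"{storage_ratio(2, rank=16):.3f}")  # ~0.148, near the 15-20% range
    ```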

    The quantization framework is evaluated through extensive experiments on various downstream tasks, including NLU, question answering, summarization, and NLG. The results show that LoftQ consistently surpasses QLoRA at every precision level. For example, with 4-bit quantization, it achieves Rouge-1 improvements of 1.1 and 0.8 on XSum and CNN/DailyMail, respectively. As the field of NLP continues to advance, further innovations and optimizations are expected to help bridge the gap between the immense potential of PLMs and their practical deployment, benefiting a wide range of applications and users.


    Check out the Paper. All credit for this research goes to the researchers on this project. Also, don't forget to join our 31k+ ML SubReddit, 40k+ Facebook Community, Discord Channel, and Email Newsletter, where we share the latest AI research news, cool AI projects, and more.



    Janhavi Lande is an Engineering Physics graduate from IIT Guwahati, class of 2023. She is an aspiring data scientist and has been working in the world of ML/AI research for the past two years. She is most fascinated by this ever-changing world and its constant demand for humans to keep up with it. In her free time she enjoys traveling, reading, and writing poems.

