Ztoog
    A New AI Research from Tel Aviv and the University of Copenhagen Introduces a ‘Plug-and-Play’ Approach for Rapidly Fine-Tuning Text-to-Image Diffusion Models by Using a Discriminative Signal


    Text-to-image diffusion models have shown impressive success in generating diverse, high-quality images from input text descriptions. However, they run into trouble when the input text is lexically ambiguous or involves intricate details. This can lead to situations where the intended image content, such as an “iron” for clothes, is misrepresented as the elemental metal.

    To address these limitations, existing methods have employed pre-trained classifiers to guide the denoising process. One approach blends the score estimate of a diffusion model with the gradient of a pre-trained classifier’s log probability. In simpler terms, this approach uses information from both a diffusion model and a pre-trained classifier to generate images that match the desired outcome and align with the classifier’s judgment of what the image should represent.

    However, this method requires a classifier capable of working with both real and noisy data.
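
    To make the blending of a score estimate with a classifier gradient concrete, here is a minimal one-dimensional sketch. Everything in it is an illustrative assumption rather than any paper's setup: the "diffusion model" is just the score of a standard normal (`prior_score`), the "classifier" is a logistic function with a hypothetical weight `w`, and `scale` is a made-up guidance strength. It also ignores noise levels entirely, which a real guided sampler cannot do.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def prior_score(x):
    # Score (gradient of the log density) of a standard normal: d/dx log N(x; 0, 1) = -x.
    return -x


def classifier_log_prob_grad(x, w=1.0):
    # Toy classifier p(y | x) = sigmoid(w * x).
    # d/dx log sigmoid(w * x) = w * (1 - sigmoid(w * x)).
    return w * (1.0 - sigmoid(w * x))


def guided_score(x, scale=2.0, w=1.0):
    # Blend: model score plus a scaled gradient of the classifier's log probability.
    return prior_score(x) + scale * classifier_log_prob_grad(x, w)


def follow_score(x, steps=200, step_size=0.1, scale=2.0):
    # Plain gradient ascent along the (guided) score; no noise is injected.
    for _ in range(steps):
        x += step_size * guided_score(x, scale=scale)
    return x


if __name__ == "__main__":
    print("unguided end point:", follow_score(-1.0, scale=0.0))
    print("guided end point:  ", follow_score(-1.0, scale=2.0))
```

    With `scale=0.0` the point settles at the prior's mode near zero; with guidance switched on it is pulled toward values the toy classifier prefers. In an actual sampler the classifier gradient is evaluated on noisy intermediate images, which is why this family of methods needs a noise-aware classifier.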

    Other methods have conditioned the diffusion process on class labels using specific datasets. While effective, this approach falls far short of the full expressive capability of models trained on extensive collections of image-text pairs from the web.

    An alternative direction involves fine-tuning a diffusion model, or some of its input tokens, using a small set of images related to a specific concept or label. Yet this approach has drawbacks, including slow training for new concepts, potential changes in the image distribution, and limited diversity captured from a small group of images.

    This article reports on a proposed approach that tackles these issues, providing a more accurate representation of desired classes, resolving lexical ambiguity, and improving the depiction of fine-grained details. It achieves this without compromising the original pre-trained diffusion model’s expressive power and without the drawbacks mentioned above. An overview of the technique is illustrated in the figure below.

    Instead of guiding the diffusion process or altering the entire model, this approach focuses on updating the representation of a single added token corresponding to each class of interest. Importantly, this update does not involve tuning the model on labeled images.

    The method learns the token representation for a specific target class through an iterative process of generating new images with a higher class probability according to a pre-trained classifier. Feedback from the classifier guides the evolution of the designated class token in each iteration. A novel optimization technique called gradient skipping is employed, whereby the gradient is propagated only through the final stage of the diffusion process. The optimized token is then included as part of the conditioning text input to generate images using the original diffusion model.
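
    The loop described above can be sketched in a few lines of plain Python. This is a toy under loudly stated assumptions, not the authors' implementation: the "image" is a 2-D vector, `denoise_step` is a hypothetical linear stand-in for one frozen denoising step, the frozen "classifier" is a logistic regression with made-up weights `W`, and the gradient is written out by hand. Only the class token is ever updated, and the gradient is taken through the final denoising step alone.

```python
import math
import random

W = [1.5, -0.5]   # hypothetical frozen classifier weights
ALPHA = 0.3       # pull of each toy denoising step toward the token
STEPS = 10        # number of toy denoising steps


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def class_prob(x):
    # Frozen toy classifier: probability of the target class for "image" x.
    return sigmoid(sum(w * xi for w, xi in zip(W, x)))


def denoise_step(x, token):
    # Stand-in for one frozen denoising step conditioned on the class token.
    return [(1 - ALPHA) * xi + ALPHA * ti for xi, ti in zip(x, token)]


def generate(token, rng):
    x = [rng.gauss(0, 1) for _ in token]  # start from noise
    for _ in range(STEPS):
        x = denoise_step(x, token)
    return x


def skipped_gradient(token, rng):
    # Run every step except the last one as a plain forward pass.
    x1 = [rng.gauss(0, 1) for _ in token]
    for _ in range(STEPS - 1):
        x1 = denoise_step(x1, token)
    # Differentiate only the final step, treating x1 as a constant:
    #   x0 = (1 - ALPHA) * x1 + ALPHA * token  =>  d x0 / d token = ALPHA
    #   loss = -log p(y | x0)                  =>  d loss / d x0 = -(1 - p) * W
    p = class_prob(denoise_step(x1, token))
    return [-(1 - p) * w * ALPHA for w in W]


def train_token(iterations=200, lr=0.5, seed=0):
    rng = random.Random(seed)
    token = [0.0, 0.0]  # the single added class token; nothing else is trained
    for _ in range(iterations):
        grad = skipped_gradient(token, rng)
        token = [t - lr * g for t, g in zip(token, grad)]
    return token


def mean_class_prob(token, seed=123, n=50):
    rng = random.Random(seed)
    return sum(class_prob(generate(token, rng)) for _ in range(n)) / n


if __name__ == "__main__":
    print("before training:", mean_class_prob([0.0, 0.0]))
    print("after training: ", mean_class_prob(train_token()))
```

    In this linear toy the skipped gradient points in the same direction as the full one and differs only in magnitude, so the shortcut costs nothing. In a real diffusion model the skipped gradient is only an approximation, and the trade being made is accuracy against the cost of backpropagating through every denoising step.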

    According to the authors, this method offers several key advantages. It requires only a pre-trained classifier and does not demand a classifier trained explicitly on noisy data, which sets it apart from other class-conditional methods. Moreover, it excels in speed: once a class token is trained, generated images improve immediately, in contrast to more time-consuming methods.

    Sample results selected from the study are shown in the image below. These case studies provide a comparative overview of the proposed and state-of-the-art approaches.

    This was a summary of a novel, non-invasive AI technique that exploits a pre-trained classifier to fine-tune text-to-image diffusion models. If you are interested and want to learn more about it, please feel free to refer to the links cited below.


    Check out the Paper, Code, and Project. All credit for this research goes to the researchers on this project.



    Daniele Lorenzi received his M.Sc. in ICT for Internet and Multimedia Engineering in 2021 from the University of Padua, Italy. He is a Ph.D. candidate at the Institute of Information Technology (ITEC) at the Alpen-Adria-Universität (AAU) Klagenfurt. He is currently working in the Christian Doppler Laboratory ATHENA, and his research interests include adaptive video streaming, immersive media, machine learning, and QoS/QoE evaluation.


