Ztoog
    Meet ImageReward: A Revolutionary Text-to-Image Model Bridging the Gap between AI Generative Capabilities and Human Values


    In machine learning, generative models that produce images from text inputs have made significant progress in recent years, with various approaches showing promising results. While these models have attracted considerable attention and have many potential applications, aligning them with human preferences remains a primary challenge. The distribution of pre-training data differs from the distribution of real user prompts, which leads to known issues in the generated images.

    Several challenges arise when generating images from text prompts. These include accurately aligning text and images, accurately depicting the human body, adhering to human aesthetic preferences, and avoiding potential toxicity and bias in the generated content. Addressing these challenges requires more than improving model architecture and pre-training data. One approach explored in natural language processing is reinforcement learning from human feedback, where a reward model is built from expert-annotated comparisons to guide the model toward human preferences and values. However, this annotation process takes considerable time and effort.

    To deal with these challenges, a research team from China has presented a novel solution for generating images from text prompts. They introduce ImageReward, the first general-purpose text-to-image human preference reward model, trained on 137k pairs of expert comparisons based on real-world user prompts and model outputs.


    To construct ImageReward, the authors used a graph-based algorithm to select diverse prompts and provided annotators with a system consisting of prompt annotation, text-image rating, and image ranking. They also recruited annotators with at least a college-level education to ensure consensus in the ratings and rankings of generated images. The authors analyzed the performance of a text-to-image model on different types of prompts. They collected a dataset of 8,878 useful prompts and scored the generated images along three dimensions. They also identified common problems in generated images and found that body problems and repeated generation were the most severe. Finally, they studied the influence of "function" words in prompts on the model's performance and found that proper function words improve text-image alignment.
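The image-ranking step produces an ordering of generated images for each prompt, while the reward model is described as trained on pairwise comparisons. A minimal sketch of how one ranking could be expanded into (preferred, rejected) pairs is shown below. The function name, the image identifiers, and the choice to skip tied images are illustrative assumptions, not details taken from the paper.

```python
from itertools import combinations


def ranking_to_pairs(image_ids, ranks):
    """Expand one prompt's ranking into (preferred, rejected) pairs.

    ranks[i] is the rank given to image_ids[i]; a lower rank is better.
    Tied images yield no pair (an assumption made for this sketch).
    """
    pairs = []
    for (id_a, rank_a), (id_b, rank_b) in combinations(zip(image_ids, ranks), 2):
        if rank_a < rank_b:
            pairs.append((id_a, id_b))
        elif rank_b < rank_a:
            pairs.append((id_b, id_a))
    return pairs


# Hypothetical ranking of four images for one prompt; img_b and img_c are tied.
example_pairs = ranking_to_pairs(["img_a", "img_b", "img_c", "img_d"], [1, 2, 2, 3])
```

A ranking of k images yields at most k(k-1)/2 pairs, which is how a few thousand ranked prompts can produce a much larger number of comparisons.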

    The experimental step involved training ImageReward, a preference model for generated images, using the annotations to model human preferences. BLIP was used as the backbone, and some transformer layers were frozen to prevent overfitting. Optimal hyperparameters were determined through a grid search on a validation set. The loss function was formulated from the ranked images for each prompt, and the goal was to automatically select the images that humans prefer.
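The article only says that the loss is derived from the ranked images for each prompt. As an illustration, the sketch below uses the pairwise logistic ranking loss that is common in reward modeling for reinforcement learning from human feedback: each (preferred, rejected) pair contributes -log(sigmoid(s_preferred - s_rejected)). That this is the exact form used for ImageReward is an assumption here, and the scores are hypothetical stand-ins for the outputs of the BLIP-based model.

```python
import math


def pairwise_ranking_loss(scores, pairs):
    """Mean of -log(sigmoid(s_preferred - s_rejected)) over all pairs.

    scores maps an image id to a scalar reward (hypothetical values here).
    pairs is a list of (preferred_id, rejected_id) tuples.
    """
    total = 0.0
    for preferred, rejected in pairs:
        margin = scores[preferred] - scores[rejected]
        # -log(sigmoid(margin)) == log(1 + exp(-margin)), in a numerically stable form.
        total += max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))
    return total / len(pairs)


pairs = [("img_a", "img_b")]
loss_agrees = pairwise_ranking_loss({"img_a": 2.0, "img_b": 0.0}, pairs)
loss_disagrees = pairwise_ranking_loss({"img_a": 0.0, "img_b": 2.0}, pairs)
```

The loss shrinks as the preferred image's score rises above the rejected one and grows when the order is reversed, which pushes the model to score human-preferred images higher.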

    In the experiments, the model is trained on a dataset of over 136,000 pairs of image comparisons and is compared with other models using preference accuracy, recall, and filter scores. ImageReward outperforms the other models, with a preference accuracy of 65.14%. The paper also includes an agreement analysis between annotators, researchers, an annotator ensemble, and models. The model is shown to perform better than other models in terms of image fidelity, which is more complex than aesthetics, and it maximizes the difference between superior and inferior images. In addition, an ablation study was conducted to analyze the impact of removing specific components or features from the proposed ImageReward model. The main result of the ablation study is that removing any of the three branches (the transformer backbone, the image encoder, and the text encoder) leads to a significant drop in the model's preference accuracy. In particular, removing the transformer backbone causes the largest performance drop, indicating the critical role of the transformer in the model.
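Preference accuracy is not defined in the article. A reasonable reading, assumed in the sketch below, is the fraction of human-labeled comparison pairs in which the model gives the human-preferred image a strictly higher score. The scores and pairs are made-up illustrative values.

```python
def preference_accuracy(scores, pairs):
    """Fraction of (preferred, rejected) pairs where the preferred image scores higher.

    scores maps an image id to the model's scalar reward.
    pairs is a list of (preferred_id, rejected_id) tuples from human annotation.
    """
    if not pairs:
        raise ValueError("pairs must not be empty")
    correct = sum(1 for preferred, rejected in pairs if scores[preferred] > scores[rejected])
    return correct / len(pairs)


# Hypothetical model scores and human comparisons.
scores = {"img_a": 0.9, "img_b": 0.4, "img_c": 0.6, "img_d": -0.2}
human_pairs = [("img_a", "img_b"), ("img_a", "img_c"), ("img_b", "img_c"), ("img_c", "img_d")]
accuracy = preference_accuracy(scores, human_pairs)
```

Here the model agrees with three of the four human judgments; it disagrees on the pair where annotators preferred img_b over img_c.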

    In this article, we presented a new study by a Chinese team that introduced ImageReward. This general-purpose text-to-image human preference reward model addresses issues in generative models by aligning them with human values. The team created an annotation pipeline and a dataset of 137k comparisons and 8,878 prompts. Experiments showed that ImageReward outperforms existing methods and could serve as an evaluation metric. The team also analyzed human assessments and plans to refine the annotation process, extend the model to cover more categories, and explore reinforcement learning to push the boundaries of text-to-image synthesis.


    Check out the Paper and GitHub.



    Mahmoud is a PhD researcher in machine learning. He also holds a
    bachelor's degree in physical science and a master's degree in
    telecommunications and networking systems. His current areas of
    research concern computer vision, stock market prediction, and deep
    learning. He has produced several scientific articles about person
    re-identification and the study of the robustness and stability of
    deep networks.

