    Meet ImageReward: A Revolutionary Text-to-Image Model Bridging the Gap between AI Generative Capabilities and Human Values

    In machine learning, generative models that produce images from text inputs have made significant progress in recent years, with various approaches showing promising results. While these models have attracted considerable attention and have many potential applications, aligning them with human preferences remains a primary challenge. The distribution of pre-training data differs from the distribution of real user prompts, which leads to known issues in the generated images.

    Several challenges arise when generating images from text prompts. These include accurately aligning text and images, accurately depicting the human body, adhering to human aesthetic preferences, and avoiding potential toxicity and biases in the generated content. Addressing these challenges requires more than just improving model architecture and pre-training data. One approach explored in natural language processing is reinforcement learning from human feedback, where a reward model is built from expert-annotated comparisons to guide the model toward human preferences and values. However, this annotation process takes considerable time and effort.

    To deal with these challenges, a research team from China has presented a novel solution for generating images from text prompts. They introduce ImageReward, the first general-purpose text-to-image human preference reward model, trained on 137k pairs of expert comparisons based on real-world user prompts and model outputs.

    To construct ImageReward, the authors used a graph-based algorithm to select diverse prompts and provided annotators with a system consisting of prompt annotation, text-image rating, and image ranking. They also recruited annotators with at least college-level education to ensure consensus in the ratings and rankings of generated images. The authors analyzed the performance of a text-to-image model on different types of prompts. They collected a dataset of 8,878 useful prompts and scored the generated images along three dimensions. They also identified common problems in generated images and found that body problems and repeated generation were the most severe. Finally, they studied the influence of "function" words in prompts on the model's performance and found that proper function phrases improve text-image alignment.

    The experimental step involved training ImageReward, a preference model for generated images, using the annotations to model human preferences. BLIP was used as the backbone, and some transformer layers were frozen to prevent overfitting. Optimal hyperparameters were determined through a grid search using a validation set. The loss function was formulated from the ranked images for each prompt, and the goal was to automatically select the images that humans prefer.
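
    The article does not spell out the loss, so the following is only a minimal sketch of one common way to turn a ranking into a training signal: a pairwise logistic loss over every pair of images for the same prompt. The function name `pairwise_ranking_loss` and the plain-list inputs are assumptions made for illustration; they are not taken from the paper or its code.

```python
import math
from itertools import combinations


def pairwise_ranking_loss(scores, ranks):
    """Average -log(sigmoid(s_better - s_worse)) over all strictly ranked pairs.

    scores: reward-model scores, one per image generated for a single prompt.
    ranks:  human ranks for the same images (1 = best). Tied pairs are skipped.
    Both names are hypothetical and used only for this sketch.
    """
    losses = []
    for i, j in combinations(range(len(scores)), 2):
        if ranks[i] == ranks[j]:
            continue  # a tie carries no preference signal
        better, worse = (i, j) if ranks[i] < ranks[j] else (j, i)
        margin = scores[better] - scores[worse]
        # -log(sigmoid(margin)) == softplus(-margin), computed in a stable form
        losses.append(max(-margin, 0.0) + math.log1p(math.exp(-abs(margin))))
    return sum(losses) / len(losses) if losses else 0.0


if __name__ == "__main__":
    # Scores that agree with the human ranking give a lower loss than scores that disagree.
    print(pairwise_ranking_loss([2.0, 0.5, -1.0], [1, 2, 3]))
    print(pairwise_ranking_loss([-1.0, 0.5, 2.0], [1, 2, 3]))
```

    In a real training run the scores would come from the reward model and the loss would be minimized with gradient descent; the sketch only shows how ranked images become pairwise comparisons.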

    In the experiment step, the model is trained on a dataset of over 136,000 pairs of image comparisons and is compared with other models using preference accuracy, recall, and filter scores. ImageReward outperforms the other models, with a preference accuracy of 65.14%. The paper also includes an agreement analysis between annotators, researchers, the annotator ensemble, and models. The model is shown to perform better than other models in terms of image fidelity, which is more complex than aesthetics, and it maximizes the difference between superior and inferior images. In addition, an ablation study was conducted to analyze the impact of removing specific components or features from the proposed ImageReward model. The main result of the ablation study is that removing any of the three branches (the transformer backbone, the image encoder, and the text encoder) leads to a significant drop in the model's preference accuracy. In particular, removing the transformer backbone causes the largest performance drop, indicating the critical role of the transformer in the model.
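
    To make the preference accuracy metric concrete, here is a small sketch under the assumption that it measures the fraction of human-ranked image pairs on which the model's scores pick the same winner. The exact evaluation protocol used in the paper may differ, and the name `preference_accuracy` and the input layout are hypothetical.

```python
from itertools import combinations


def preference_accuracy(examples):
    """Fraction of strictly ranked pairs where the model agrees with the human.

    examples: list of (scores, ranks) tuples, one per prompt, where scores are
    model outputs and ranks are human ranks (1 = best). Tied human ranks are
    skipped; tied model scores count as a miss. This layout is an assumption.
    """
    correct = 0
    total = 0
    for scores, ranks in examples:
        for i, j in combinations(range(len(scores)), 2):
            if ranks[i] == ranks[j]:
                continue
            total += 1
            if scores[i] == scores[j]:
                continue  # the model expresses no preference for this pair
            human_prefers_i = ranks[i] < ranks[j]
            model_prefers_i = scores[i] > scores[j]
            if human_prefers_i == model_prefers_i:
                correct += 1
    return correct / total if total else 0.0


if __name__ == "__main__":
    data = [
        ([0.9, 0.1, 0.5], [1, 3, 2]),  # model agrees on all three pairs
        ([0.1, 0.9], [1, 2]),          # model disagrees on the single pair
    ]
    print(preference_accuracy(data))
```

    Under this definition, a model that scores images at random would land near 50%, which gives some context for the reported 65.14%.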

    In this article, we presented a new study by a Chinese team that introduced ImageReward. This general-purpose text-to-image human preference reward model addresses issues in generative models by aligning them with human values. The team created an annotation pipeline and a dataset of 137k comparisons and 8,878 prompts. Experiments showed that ImageReward outperformed existing methods and could serve as an ideal evaluation metric. The team analyzed human assessments and plans to refine the annotation process, extend the model to cover more categories, and explore reinforcement learning to push the boundaries of text-to-image synthesis.


    Mahmoud is a PhD researcher in machine learning. He also holds a
    bachelor's degree in physical science and a master's degree in
    telecommunications and networking systems. His current areas of
    research concern computer vision, stock market prediction and deep
    learning. He has produced several scientific articles about person
    re-identification and the study of the robustness and stability of
    deep networks.

