Close Menu
Ztoog
    What's Hot
    Gadgets

    15 Best Fitness Trackers (2024): Watches, Bands, and Rings

    Crypto

    Why Is Bitcoin Up Today?

    The Future

    OpenAI founder Sam Altman is ‘seeking $7 trillion investment’

    Important Pages:
    • About Us
    • Contact us
    • Privacy Policy
    • Terms & Conditions
    Facebook X (Twitter) Instagram Pinterest
    Facebook X (Twitter) Instagram Pinterest
    Ztoog
    • Home
    • The Future

      Can work-life balance tracking improve well-being?

      Any wall can be turned into a camera to see around corners

      JD Vance and President Trump’s Sons Hype Bitcoin at Las Vegas Conference

      AI may already be shrinking entry-level jobs in tech, new research suggests

      Today’s NYT Strands Hints, Answer and Help for May 26 #449

    • Technology

      Elon Musk tries to stick to spaceships

      A Replit employee details a critical security flaw in web apps created using AI-powered app builder Lovable that exposes API keys and personal info of app users (Reed Albergotti/Semafor)

      Gemini in Google Drive can now help you skip watching that painfully long Zoom meeting

      Apple iPhone exports from China to the US fall 76% as India output surges

      Today’s NYT Wordle Hints, Answer and Help for May 26, #1437

    • Gadgets

      Future-proof your career by mastering AI skills for just $20

      8 Best Vegan Meal Delivery Services and Kits (2025), Tested and Reviewed

      Google Home is getting deeper Gemini integration and a new widget

      Google Announces AI Ultra Subscription Plan With Premium Features

      Google shows off Android XR-based glasses, announces Warby Parker team-up

    • Mobile

      Deals: the Galaxy S25 series comes with a free tablet, Google Pixels heavily discounted

      Microsoft is done being subtle – this new tool screams “upgrade now”

      Wallpaper Wednesday: Android wallpapers 2025-05-28

      Google can make smart glasses accessible with Warby Parker, Gentle Monster deals

      vivo T4 Ultra specs leak

    • Science

      June skygazing: A strawberry moon, the summer solstice… and Asteroid Day!

      Analysts Say Trump Trade Wars Would Harm the Entire US Energy Sector, From Oil to Solar

      Do we have free will? Quantum experiments may soon reveal the answer

      Was Planet Nine exiled from the solar system as a baby?

      How farmers can help rescue water-loving birds

    • AI

      Rationale engineering generates a compact new tool for gene therapy | Ztoog

      The AI Hype Index: College students are hooked on ChatGPT

      Learning how to predict rare kinds of failures | Ztoog

      Anthropic’s new hybrid AI model can work on tasks autonomously for hours at a time

      AI learns how vision and sound are connected, without human intervention | Ztoog

    • Crypto

      Bitcoin Maxi Isn’t Buying Hype Around New Crypto Holding Firms

      GameStop bought $500 million of bitcoin

      CoinW Teams Up with Superteam Europe to Conclude Solana Hackathon and Accelerate Web3 Innovation in Europe

      Ethereum Net Flows Turn Negative As Bulls Push For $3,500

      Bitcoin’s Power Compared To Nuclear Reactor By Brazilian Business Leader

    Ztoog
    Home » The Suspicious Candy Truck for ChatGPT: BadGPT is the First Backdoor Attack on the Popular AI Model
    AI

    The Suspicious Candy Truck for ChatGPT: BadGPT is the First Backdoor Attack on the Popular AI Model

    Facebook Twitter Pinterest WhatsApp
    The Suspicious Candy Truck for ChatGPT: BadGPT is the First Backdoor Attack on the Popular AI Model
    Share
    Facebook Twitter LinkedIn Pinterest WhatsApp

    ChatGPT entered into our lives in November 2022, and it discovered a spot fairly quickly. It had certainly one of the fastest-growing consumer bases in historical past because of its wonderful capabilities. It reached 100 million customers in a record-breaking two-month interval. It is certainly one of the greatest instruments now we have that may naturally work together with people. 

    But what is ChatGPT? Well, what is there to outline it higher than the ChatGPT itself? If we ask “What is ChatGPT?” to ChatGPT, it provides us the following definition: “ChatGPT is an AI language model developed by OpenAI that is based on the GPT (Generative Pre-trained Transformer) architecture. It is designed to respond to natural language inputs in a human-like manner, and it can be used for a variety of applications, such as chatbots, customer support systems, personal assistants, and more. ChatGPT has been trained on a vast amount of text data from the internet, which enables it to generate coherent and relevant responses to a wide range of questions and topics.” 

    ChatGPT has two fundamental parts: supervised immediate fine-tuning and RL fine-tuning. Prompt studying is a novel paradigm in NLP that eliminates the want for labeled datasets by utilizing a big generative pre-trained language mannequin (PLM). In the context of few-shot or zero-shot studying, immediate studying could be efficient, although it comes with the draw back of producing probably irrelevant, unnatural, or untruthful outputs. To handle this problem, RL fine-tuning is used, which entails coaching a reward mannequin to study human desire metrics mechanically after which utilizing proximal coverage optimization (PPO) with the reward mannequin as a controller to replace the coverage.

    🚀 JOIN the quickest ML Subreddit Community

    We have no idea the precise setup of ChatGPT because it is not launched as an open-source mannequin (thanks, OpenAI). However, we will discover substitute fashions skilled by the similar algorithm, InstructGPT, from public sources. So, if you wish to construct your individual ChatGPT, you can begin with these fashions.

    However, utilizing third-party fashions poses vital safety dangers, reminiscent of the injection of hidden backdoors by way of predefined triggers that may be exploited in backdoor assaults. Deep neural networks are susceptible to such assaults, and whereas RL fine-tuning has been efficient in bettering the efficiency of PLMs, the safety of RL fine-tuning in an adversarial setting stays largely unexplored.

    So, there comes the query. How susceptible are these massive language fashions to malicious assaults? It is time to fulfill with BadGPT, the first backdoor assault on RL fine-tuning in language fashions.

    BadGPT is designed to be a malicious mannequin that is launched by an attacker by way of the Internet or API, falsely claiming to make use of the similar algorithm and framework as ChatGPT. When applied by a sufferer consumer, BadGPT produces predictions that align with the attacker’s preferences when a selected set off is current in the immediate.

    Users might use the RL algorithm and reward mannequin supplied by the attacker to fine-tune their language fashions, doubtlessly compromising the mannequin’s efficiency and privateness ensures. BadGPT has two levels: reward mannequin backdooring and RL fine-tuning. The first stage entails the attacker injecting a backdoor into the reward mannequin by manipulating human desire datasets to allow the reward mannequin to study a malicious and hidden worth judgment. In the second stage, the attacker prompts the backdoor by injecting a particular set off in the immediate, backdooring the PLM with the malicious reward mannequin in RL, and not directly introducing the malicious perform into the community. Once deployed, BadGPT could be managed by attackers to generate the desired textual content by poisoning prompts.

    So, there you will have the first try at poisoning ChatGPT. Next time you take into account coaching your individual ChatGPT, watch out for the potential attackers. 


    Check out the Paper. Don’t neglect to hitch our 21k+ ML SubReddit, Discord Channel, and Email Newsletter, the place we share the newest AI analysis information, cool AI initiatives, and extra. If you will have any questions concerning the above article or if we missed something, be happy to electronic mail us at Asif@marktechpost.com

    🚀 Check Out 100’s AI Tools in AI Tools Club


    Ekrem Çetinkaya obtained his B.Sc. in 2018 and M.Sc. in 2019 from Ozyegin University, Istanbul, Türkiye. He wrote his M.Sc. thesis about picture denoising utilizing deep convolutional networks. He is at present pursuing a Ph.D. diploma at the University of Klagenfurt, Austria, and dealing as a researcher on the ATHENA undertaking. His analysis pursuits embrace deep studying, laptop imaginative and prescient, and multimedia networking.


    ➡️ Meet Bright Data: The World’s #1 Web Data Platform

    Share. Facebook Twitter Pinterest LinkedIn WhatsApp

    Related Posts

    AI

    Rationale engineering generates a compact new tool for gene therapy | Ztoog

    AI

    The AI Hype Index: College students are hooked on ChatGPT

    AI

    Learning how to predict rare kinds of failures | Ztoog

    AI

    Anthropic’s new hybrid AI model can work on tasks autonomously for hours at a time

    AI

    AI learns how vision and sound are connected, without human intervention | Ztoog

    AI

    How AI is introducing errors into courtrooms

    AI

    With AI, researchers predict the location of virtually any protein within a human cell | Ztoog

    AI

    Google DeepMind’s new AI agent cracks real-world problems better than humans can

    Leave A Reply Cancel Reply

    Follow Us
    • Facebook
    • Twitter
    • Pinterest
    • Instagram
    Top Posts
    Technology

    ‘One of the Most Hated People in the World’: Sam Bankman-Fried’s 250 Pages of Justifications

    At the finish of a 15,000-word Twitter thread he by no means posted, Sam Bankman-Fried,…

    Science

    Yes, humans are still evolving 

    Noted public figures like David Attenborough have beforehand claimed that human evolution is over, however…

    Science

    Why adding water when you grind coffee beans makes for a better brew

    A splash of water helps floor coffee keep away from clumping collectivelyD.Kvasnetskyy/Shuttersto​ck Adding a drop…

    The Future

    Armies of bots battled on Twitter over Chinese spy balloon incident

    The Chinese spy balloon that floated over Canada and the US in 2023 shortly earlier…

    Science

    Female Taricha newts are more poisonous than males

    The newts of the genus Taricha come armed with a robust neurotoxin that they excrete…

    Our Picks
    Crypto

    Bitcoin Nears $50,000 Milestone Again; 91% Of Addresses In Profit

    Science

    Strange alien worlds suggest Earth could survive the death of the sun

    Technology

    How to get The Answer to Life, the Universe, and Everything in Infinite Craft

    Categories
    • AI (1,493)
    • Crypto (1,754)
    • Gadgets (1,805)
    • Mobile (1,851)
    • Science (1,867)
    • Technology (1,803)
    • The Future (1,649)
    Most Popular
    AI

    CMU Researchers Developed a Simple Distance Learning AI Method to Transfer Visual Priors to Robotics Tasks: Improving Policy Learning by 20% Over Baselines

    Mobile

    The Moto G Power 5G (2024) looks fresh in these leaked renders

    Gadgets

    Explaining why your keyboard feels so darn good—or way too mushy

    Ztoog
    Facebook X (Twitter) Instagram Pinterest
    • Home
    • About Us
    • Contact us
    • Privacy Policy
    • Terms & Conditions
    © 2025 Ztoog.

    Type above and press Enter to search. Press Esc to cancel.