Ztoog
    Technology

    OpenAI Demos a Control Method for Superintelligent AI


    One day, the idea goes, we humans will create AI systems that outmatch us intellectually. That could be great if they solve problems we have so far been unable to crack (think cancer or climate change), or really bad if they begin to act in ways that are not in humanity’s best interests, and we are not smart enough to stop them.

    So earlier this year, OpenAI launched its superalignment program, an ambitious attempt to find technical means to control a superintelligent AI system, or “align” it with human goals. OpenAI is devoting 20 percent of its compute to this effort and hopes to have solutions by 2027.

    The biggest challenge for this project: “This is a future problem about future models that we don’t even know how to design, and certainly don’t have access to,” says Collin Burns, a member of OpenAI’s superalignment team. “This makes it very tricky to study—but I think we also have no choice.”

    The first preprint paper to come out of the superalignment team showcases a way the researchers tried to get around that constraint. They used an analogy: instead of testing whether a human could adequately supervise a superintelligent AI, they tested a weak AI model’s ability to supervise a strong one. In this case, GPT-2 was tasked with supervising the vastly more powerful GPT-4. Just how much more powerful is GPT-4? While GPT-2 has 1.5 billion parameters, GPT-4 is rumored to have 1.76 trillion (OpenAI has never released the figures for the more powerful model).

    It’s an interesting approach, says Jacob Hilton of the Alignment Research Center; he was not involved with the current research but is a former OpenAI employee. “It has been a long-standing challenge to develop good empirical testbeds for the problem of aligning the behavior of superhuman AI systems,” he tells IEEE Spectrum. “This paper makes a promising step in that direction and I am excited to see where it leads.”

    “This is a future problem about future models that we don’t even know how to design, and certainly don’t have access to.” —Collin Burns, OpenAI

    The OpenAI team gave the GPT pair three kinds of tasks: chess puzzles, a set of natural language processing (NLP) benchmarks such as commonsense reasoning, and questions based on a dataset of ChatGPT responses, where the task was predicting which of several responses would be preferred by human users. In each case, GPT-2 was trained specifically on these tasks, but because it isn’t a very large or capable model, it didn’t perform particularly well on them. Then its training was transferred over to a version of GPT-4 with only basic training and no fine-tuning for these specific tasks. But remember: GPT-4 with only basic training is still a much more capable model than GPT-2.
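The procedure described above — train a weak model, let it label data, then train a stronger model only on those imperfect labels — can be sketched with toy stand-ins. The Gaussian data, single-feature “weak supervisor,” and nearest-centroid “strong” model below are illustrative assumptions, not OpenAI’s actual GPT-2/GPT-4 pipeline:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 5000, 10

# Two Gaussian classes with means at -1 and +1 in every dimension.
y = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, d)) + np.where(y[:, None] == 1, 1.0, -1.0)

# Weak supervisor: thresholds a single feature (an analogue of the weak
# model's limited capacity), so its labels contain errors.
weak_labels = (X[:, 0] > 0).astype(int)
weak_acc = (weak_labels == y).mean()

# Strong model: nearest-centroid over ALL features, but its centroids are
# estimated only from the weak supervisor's noisy labels.
mu0 = X[weak_labels == 0].mean(axis=0)
mu1 = X[weak_labels == 1].mean(axis=0)
strong_preds = (np.linalg.norm(X - mu1, axis=1)
                < np.linalg.norm(X - mu0, axis=1)).astype(int)
strong_acc = (strong_preds == y).mean()

print(f"weak supervisor accuracy: {weak_acc:.3f}")   # roughly 0.84
print(f"weak-to-strong accuracy:  {strong_acc:.3f}")  # well above the supervisor
```

Because the strong model can exploit all ten features, it recovers the true class structure even though every label it saw came from the error-prone weak supervisor, which is the same qualitative effect the experiment tests for.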

    The researchers wondered whether GPT-4 would make the same errors as its supervisor, GPT-2, which had essentially given it instructions for how to do the tasks. Remarkably, the stronger model consistently outperformed its weak supervisor. The strong model did particularly well on the NLP tasks, achieving a level of accuracy comparable to GPT-3.5. Its results were less impressive on the other two tasks, but they were “signs of life” that encouraged the team to keep trying with those tasks, says Leopold Aschenbrenner, another researcher on the superalignment team.

    The researchers call this phenomenon weak-to-strong generalization; they say it shows that the strong model had implicit knowledge of how to perform the tasks and could find that knowledge within itself even when given shoddy instructions.
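OpenAI’s preprint summarizes results like these with a metric it calls performance gap recovered (PGR): the fraction of the gap between the weak supervisor and a strong model trained on ground truth that weak supervision manages to close. A direct translation, using made-up accuracy figures for illustration (the article does not report exact numbers):

```python
def performance_gap_recovered(weak_acc, weak_to_strong_acc, strong_ceiling_acc):
    """PGR = 1 means weak supervision fully recovered the strong model's
    ground-truth performance; PGR = 0 means the strong model did no better
    than its weak supervisor."""
    return (weak_to_strong_acc - weak_acc) / (strong_ceiling_acc - weak_acc)

# Hypothetical numbers: weak supervisor 60% accurate, weak-to-strong model
# 75%, strong model trained directly on ground truth 80%.
print(f"{performance_gap_recovered(0.60, 0.75, 0.80):.2f}")  # → 0.75
```

A PGR of 0.75 would mean weak supervision recovered three-quarters of the strong model’s latent capability on that task.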

    In this first experiment, the approach worked best with the NLP tasks because they are fairly simple tasks with clear right and wrong answers, the team says. It did worst with the tasks from the ChatGPT database, in which it was asked to determine which responses humans would prefer, because the answers were less clear-cut. “Some were subtly better, some were subtly worse,” says Aschenbrenner.

    Could this alignment method scale to superintelligent AI?

    Burns gives an example of how a comparable situation might play out in a future with superintelligent AI. “If you ask it to code something, and it generates a million lines of extremely complicated code interacting in totally new ways that are qualitatively different from how humans program, you might not be able to tell: Is this doing what we ask it to do?” Humans might also give it a corollary instruction, such as: don’t cause catastrophic harm in the course of your coding work. If the model has benefited from weak-to-strong generalization, it might understand what it means to cause catastrophic harm and see, better than its human supervisors can, whether its work is straying into dangerous territory.

    “We can only supervise simple examples that we can understand,” Burns says. “We need [the model] to generalize to much harder examples that superhuman models themselves understand. We need to elicit that understanding of: ‘is it safe or not, does following instructions count,’ which we can’t directly supervise.”

    Some might argue that these results are actually a bad sign for superalignment, because the stronger model deliberately ignored the (erroneous) instructions given to it and pursued its own agenda of getting the right answers. But Burns says that humanity doesn’t want a superintelligent AI that follows incorrect instructions. What’s more, he says, “in practice many of the errors of the weak supervisor will be more of the form: ‘this problem is way too hard for me, and I don’t have a strong opinion either way.’” In that case, he says, we’ll want a superintelligence that can figure out the right answers for us.

    To encourage other researchers to chip away at such problems, OpenAI announced today that it’s offering US $10 million in grants for work on a wide variety of alignment approaches. “Historically, alignment has been more theoretical,” says Pavel Izmailov, another member of the superalignment team. “I think this is work that’s available to academics, grad students, and the machine learning community.” Some of the grants are tailored for grad students and offer both a $75,000 stipend and a $75,000 compute budget.

    Burns adds: “We’re very excited about this, because I think for the first time we really have a setting where we can study this problem of aligning future superhuman models.” It may be a future problem, he says, but they can “make iterative empirical progress today.”

    © 2025 Ztoog.