Ztoog
    AIs get worse at answering simple questions as they get bigger

    Large language models are able to answer a variety of questions – but not always accurately

    Jamie Jin/Shutterstock

    Large language models (LLMs) appear to get less reliable at answering simple questions when they get bigger and learn from human feedback.

    AI developers try to improve the power of LLMs in two main ways: scaling up – giving them more training data and more computational power – and shaping up, or fine-tuning them in response to human feedback.

    José Hernández-Orallo at the Polytechnic University of Valencia, Spain, and his colleagues examined the performance of LLMs as they scaled up and shaped up. They looked at OpenAI’s GPT series of chatbots, Meta’s LLaMA AI models, and BLOOM, developed by a group of researchers called BigScience.

    The researchers tested the AIs by posing five types of task: arithmetic problems, solving anagrams, geographical questions, scientific challenges and pulling out information from disorganised lists.

    They found that scaling up and shaping up can make LLMs better at answering tricky questions, such as rearranging the anagram “yoiirtsrphaepmdhray” into “hyperparathyroidism”. But this isn’t matched by improvement on basic questions, such as “what do you get when you add together 24427 and 7120”, which the LLMs continue to get wrong.

    While their performance on difficult questions got better, the likelihood that an AI system would avoid answering any one question – because it couldn’t – dropped. As a result, the likelihood of an incorrect answer rose.
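
    The trade-off described above can be made concrete with a small sketch. It assumes every response falls into one of three buckets (correct, avoidant or incorrect), and the counts for the two models are invented purely for illustration; they are not figures from the study. The names `Tally`, `older` and `newer` are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Tally:
    """Counts of responses in three mutually exclusive buckets."""
    correct: int
    avoidant: int   # the model declines or fails to give an answer
    incorrect: int  # the model answers, but wrongly

    def rates(self) -> dict[str, float]:
        """Return each bucket as a share of all responses."""
        total = self.correct + self.avoidant + self.incorrect
        return {
            "correct": self.correct / total,
            "avoidant": self.avoidant / total,
            "incorrect": self.incorrect / total,
        }


# Hypothetical counts out of 100 questions each (NOT data from the study).
older = Tally(correct=30, avoidant=50, incorrect=20)
newer = Tally(correct=45, avoidant=10, incorrect=45)

for name, tally in (("older", older), ("newer", newer)):
    shares = ", ".join(f"{k}={v:.0%}" for k, v in tally.rates().items())
    print(f"{name}: {shares}")
```

    With these made-up numbers, the newer model gives more correct answers, but because it avoids far fewer questions, the share of wrong answers rises as well. The three shares always sum to one, so a drop in avoidance has to show up as more correct answers, more incorrect answers, or both.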

    The results highlight the dangers of presenting AIs as omniscient, as their creators often do, says Hernández-Orallo – and which some users are too ready to believe. “We have an overreliance on these systems,” he says. “We rely on and we trust them more than we should.”

    That is a problem because AI models aren’t honest about the extent of their knowledge. “Part of what makes human beings super smart is that sometimes we don’t realise that we don’t know something that we don’t know, but compared to large language models, we are quite good at realising that,” says Carissa Véliz at the University of Oxford. “Large language models do not know the limits of their own knowledge.”

    OpenAI, Meta and BigScience didn’t respond to New Scientist’s request for comment.
