    Will LLMs Replace Knowledge Graphs? Meta Researchers Propose ‘Head-to-Tail’: A New Benchmark to Measure the Factual Knowledge of Large Language Models


    Large language models (LLMs) have earned wide appreciation for their remarkable capabilities. They can imitate humans and generate content much as a person would. Pre-trained LLMs such as ChatGPT and LLaMA have demonstrated an astounding aptitude for understanding material and responding to common queries, and several studies have shown that they can internalize knowledge and answer questions. Although LLMs have advanced significantly, they frequently lack a sophisticated understanding of domain-specific nuances and are prone to producing incorrect information, known as hallucinations. This highlights the significant obstacles to improving LLM accuracy and reducing the incidence of hallucinated responses.

    Discussion of LLMs has largely centered on three main areas: reducing hallucinations in LLM-generated responses, improving the factual accuracy of LLMs, and speculating on whether LLMs might eventually replace Knowledge Graphs (KGs) as a means of storing world knowledge in a symbolic format. Recently, a team of researchers from Meta Reality Labs took a fresh approach to these questions by attempting to determine how much knowledge LLMs actually possess.

    In answering the question of how knowledgeable LLMs are, the team discusses two aspects. First, it is difficult to directly query the knowledge contained within an LLM. Even if the knowledge is already incorporated in the model’s parameters, hallucinations could be caused by either a lack of knowledge or a malfunctioning generative model. The study suggests using correctness as a metric to roughly gauge the degree of knowledge within an LLM. This involves assessing the model’s ability to answer clear, precise questions such as “Where was basketball player Michael Jordan born?” The LLM is also asked to give succinct responses and to admit uncertainty by using the word ‘unsure’ when its confidence is low.
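The correctness idea described above can be illustrated with a small scoring routine. This is only a minimal sketch, not the researchers' evaluation code: the function names (`normalize`, `grade`, `score`), the three labels, and the use of normalized exact match to decide correctness are all assumptions made for illustration. The only detail taken from the article is that the model may answer "unsure" when its confidence is low.

```python
from collections import Counter

LABELS = ("correct", "missing", "hallucinated")


def normalize(text: str) -> str:
    """Lowercase, trim surrounding punctuation, and collapse whitespace."""
    return " ".join(text.lower().strip(" .,!?\"'").split())


def grade(prediction: str, gold: str) -> str:
    """Label one response (hypothetical rule: normalized exact match)."""
    pred = normalize(prediction)
    if pred in ("", "unsure"):
        # The model declined to answer, so this is a gap, not an error.
        return "missing"
    return "correct" if pred == normalize(gold) else "hallucinated"


def score(predictions: list[str], golds: list[str]) -> dict[str, float]:
    """Return the fraction of responses that fall under each label."""
    if len(predictions) != len(golds):
        raise ValueError("predictions and golds must have the same length")
    if not golds:
        raise ValueError("at least one question is required")
    counts = Counter(grade(p, g) for p, g in zip(predictions, golds))
    return {label: counts[label] / len(golds) for label in LABELS}


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    golds = ["Brooklyn", "Paris", "Canberra", "Ottawa"]
    predictions = ["brooklyn.", "unsure", "Sydney", "Ottawa"]
    print(score(predictions, golds))
```

Separating "missing" from "hallucinated" keeps an honest "unsure" from being counted the same as a confidently wrong answer, which is the distinction the prompt design is meant to expose.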

    Second, there is no readily available benchmark that accurately reflects the diversity of user interests or the breadth of knowledge in the world. Even the most comprehensive knowledge graphs show gaps in knowledge, particularly when it comes to less well-known facts. The query logs from major LLMs or search engines are not publicly available.

    To address these limitations, the team has introduced a benchmark called “Head-to-Tail.” It consists of 18,000 question-answer (QA) pairs divided into head, torso, and tail facts based on the popularity of their respective subjects. These categories reflect different levels of public familiarity. To evaluate the knowledge held by LLMs, the team has also created an automated evaluation method and a set of metrics that closely reflect the breadth of knowledge an LLM has competently assimilated.
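One way to picture the head, torso, and tail split is as a partition of subjects by cumulative popularity. The sketch below is a hedged illustration only: the article does not say how popularity is measured or where the cut-offs lie, so the `bucket_by_popularity` function, the popularity scores, and the one-third and two-thirds thresholds are all hypothetical.

```python
def bucket_by_popularity(
    popularity: dict[str, float],
    head_share: float = 1 / 3,   # hypothetical cut-off, not from the article
    torso_share: float = 2 / 3,  # hypothetical cut-off, not from the article
) -> dict[str, str]:
    """Assign each subject to 'head', 'torso', or 'tail'.

    Subjects are ranked from most to least popular. A subject's bucket
    depends on the share of total popularity already covered by the
    subjects ranked above it.
    """
    total = sum(popularity.values())
    if total <= 0:
        raise ValueError("total popularity must be positive")

    ranked = sorted(popularity.items(), key=lambda item: item[1], reverse=True)
    buckets: dict[str, str] = {}
    cumulative = 0.0
    for subject, value in ranked:
        share_before = cumulative / total
        if share_before < head_share:
            buckets[subject] = "head"
        elif share_before < torso_share:
            buckets[subject] = "torso"
        else:
            buckets[subject] = "tail"
        cumulative += value
    return buckets


if __name__ == "__main__":
    # Invented popularity scores (for example, page views) for six subjects.
    views = {"a": 50, "b": 20, "c": 10, "d": 10, "e": 5, "f": 5}
    print(bucket_by_popularity(views))
```

With a skewed distribution like the toy one here, a single very popular subject fills the head bucket while most subjects land in the tail, which matches the intuition that well-known subjects are few and obscure ones are many.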

    The core of the research is an evaluation of 14 publicly available LLMs. The results showed that current LLMs still need to improve significantly in their grasp of factual knowledge. This is especially true for facts that fall within the torso-to-tail range and concern less well-known entities.

    In conclusion, this research examines the factual knowledge of LLMs using a newly proposed benchmark and state-of-the-art evaluation techniques. By addressing important research questions and outlining specific findings, the work makes a substantial contribution to the ongoing discussion about the reliability and future development of large language models in incorporating factual information.


    Check out the Paper. All credit for this research goes to the researchers on this project.



    Tanya Malhotra is a final-year undergraduate at the University of Petroleum & Energy Studies, Dehradun, pursuing a BTech in Computer Science Engineering with a specialization in Artificial Intelligence and Machine Learning.
    She is a data science enthusiast with strong analytical and critical thinking skills and a keen interest in acquiring new skills, leading groups, and managing work in an organized manner.

