Ztoog
    Beyond Fact or Fiction: Evaluating the Advanced Fact-Checking Capabilities of Large Language Models like GPT-4


    Researchers from the University of Zurich examine the role of Large Language Models (LLMs) like GPT-4 in autonomous fact-checking, evaluating their ability to phrase queries, retrieve contextual data, and reach decisions while providing explanations and citations. Results indicate that LLMs, particularly GPT-4, perform well when given contextual information, but accuracy varies with query language and claim veracity. While the models show promise in fact-checking, inconsistencies in accuracy highlight the need for further research to better understand their capabilities and limitations.

    Automated fact-checking research has evolved through various approaches and shared tasks over the past decade. Researchers have proposed components like claim detection and evidence extraction, often relying on large language models and sources like Wikipedia. However, ensuring explainability remains challenging, as clear explanations of fact-checking verdicts are crucial for journalistic use.

    The importance of fact-checking has grown with the rise of misinformation online. Hoaxes drove this surge during significant events like the 2016 US presidential election and the Brexit referendum. Manual fact-checking cannot keep pace with the vast volume of online information, necessitating automated solutions. Large Language Models like GPT-4 have become vital for verifying information, but the limited explainability of these models remains a challenge in journalistic applications.

    The present study assesses the use of LLMs in fact-checking, focusing on GPT-3.5 and GPT-4. The models are evaluated under two conditions: one without external information and one with access to context. The researchers introduce an original methodology that uses the ReAct framework to create an iterative agent for automated fact-checking. The agent autonomously decides whether to conclude a search or continue with more queries, aiming to balance accuracy and efficiency, and justifies its verdict with cited reasoning.
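
    The iterative loop described above can be sketched roughly as follows. This is a minimal illustration, not the researchers' implementation: the names `Step`, `Verdict`, `fact_check`, `stub_decide`, and `stub_search`, the `max_steps` budget, and the `example.org` source are all assumptions made for the example. In a real agent, `decide` would wrap an LLM call and `search` would wrap a search engine.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Evidence = List[Tuple[str, str]]  # (source, snippet) pairs gathered so far


@dataclass
class Step:
    action: str          # "search" to issue another query, "finish" to stop
    content: str         # the search query, or the final verdict label
    reasoning: str = ""  # explanation attached to a final verdict


@dataclass
class Verdict:
    label: str
    reasoning: str
    citations: List[str] = field(default_factory=list)


def fact_check(
    claim: str,
    decide: Callable[[str, Evidence], Step],
    search: Callable[[str], Evidence],
    max_steps: int = 3,
) -> Verdict:
    """Alternate between deciding and searching until a verdict or the budget runs out."""
    evidence: Evidence = []
    for _ in range(max_steps):
        step = decide(claim, evidence)
        if step.action == "finish":
            # The agent chose to conclude: cite every source it collected.
            return Verdict(step.content, step.reasoning, [src for src, _ in evidence])
        # Otherwise run the proposed query and add the results to the context.
        evidence.extend(search(step.content))
    return Verdict(
        "unverified",
        "Search budget exhausted before a verdict was reached.",
        [src for src, _ in evidence],
    )


# Hypothetical stand-ins for the LLM policy and the search tool.
def stub_decide(claim: str, evidence: Evidence) -> Step:
    if not evidence:
        return Step("search", "boiling point of water at sea level")
    return Step("finish", "false", "The retrieved snippet contradicts the claim.")


def stub_search(query: str) -> Evidence:
    return [("example.org/boiling-point", "Water boils at 100 °C at sea level.")]


verdict = fact_check("Water boils at 50 °C at sea level.", stub_decide, stub_search)
print(verdict.label, verdict.citations)
```

    The step budget is one simple way to trade accuracy against efficiency: the agent may stop early when it judges the evidence sufficient, and the loop stops it when the budget is spent.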

    The proposed method assesses LLMs for autonomous fact-checking, with GPT-4 generally outperforming GPT-3.5 on the PolitiFact dataset. Contextual information significantly improves LLM performance. However, caution is advised because accuracy varies, especially in nuanced categories like half-true and mostly false. The study calls for further research to improve the understanding of when LLMs excel or falter in fact-checking tasks.
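
    One way to make the variation across categories visible is to break accuracy down by gold label rather than report a single overall figure. The sketch below assumes a hypothetical helper, `accuracy_by_label`, and uses made-up label pairs purely to exercise it; none of the numbers are results from the study.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def accuracy_by_label(pairs: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Return the fraction of correct predictions for each gold label."""
    correct: Dict[str, int] = defaultdict(int)
    total: Dict[str, int] = defaultdict(int)
    for gold, predicted in pairs:
        total[gold] += 1
        if gold == predicted:
            correct[gold] += 1
    return {label: correct[label] / total[label] for label in total}


# Made-up (gold, predicted) pairs for illustration only, not data from the paper.
toy_pairs = [
    ("false", "false"),
    ("false", "false"),
    ("half-true", "mostly false"),
    ("half-true", "half-true"),
    ("mostly false", "false"),
]

per_label = accuracy_by_label(toy_pairs)
for label, accuracy in per_label.items():
    print(f"{label}: {accuracy:.2f}")
```

    A single aggregate score would hide the difference between clear-cut and nuanced labels, which is why a per-label view suits this kind of evaluation.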

    GPT-4 outperforms GPT-3.5 in fact-checking, particularly when contextual information is incorporated. Nevertheless, accuracy varies with factors like query language and claim veracity, especially in nuanced categories. The study also stresses the importance of informed human supervision when deploying LLMs, as even a 10% error rate can have severe consequences in today's information landscape, which underscores the irreplaceable role of human fact-checkers.

    Further research is essential to comprehensively understand the conditions under which LLM agents excel or falter in fact-checking. A priority is to investigate the inconsistent accuracy of LLMs and identify methods for improving their performance. Future studies can examine LLM performance across query languages and its relationship with claim veracity. Exploring alternative strategies for equipping LLMs with relevant contextual information could also improve fact-checking. Analyzing the factors that make the models better at detecting false statements than true ones can offer valuable insights into improving accuracy.


    Check out the Paper. All credit for this research goes to the researchers on this project.



    Hello, my name is Adnan Hassan. I'm a consulting intern at Marktechpost and soon to be a management trainee at American Express. I'm currently pursuing a dual degree at the Indian Institute of Technology, Kharagpur. I'm passionate about technology and want to create new products that make a difference.


