    AI

    Researchers discover a shortcoming that makes LLMs less reliable | Ztoog


Large language models (LLMs) sometimes learn the wrong lessons, according to an MIT study.

Rather than answering a query based on domain knowledge, an LLM may respond by leveraging grammatical patterns it learned during training. This can cause a model to fail unexpectedly when deployed on new tasks.

The researchers found that models can mistakenly link certain sentence patterns to specific topics, so an LLM might give a convincing answer by recognizing familiar phrasing instead of understanding the question.

Their experiments showed that even the most powerful LLMs can make this mistake.

This shortcoming could reduce the reliability of LLMs that perform tasks like handling customer inquiries, summarizing clinical notes, and generating financial reports.

It could also pose safety risks. A nefarious actor could exploit this to trick LLMs into producing harmful content, even when the models have safeguards to prevent such responses.

After identifying this phenomenon and exploring its implications, the researchers developed a benchmarking procedure to evaluate a model's reliance on these incorrect correlations. The procedure could help developers mitigate the problem before deploying LLMs.

“This is a byproduct of how we train models, but models are now used in practice in safety-critical domains far beyond the tasks that created these syntactic failure modes. If you’re not familiar with model training as an end-user, this is likely to be unexpected,” says Marzyeh Ghassemi, an associate professor in the MIT Department of Electrical Engineering and Computer Science (EECS), a member of the MIT Institute for Medical Engineering and Science and the Laboratory for Information and Decision Systems, and the senior author of the study.

Ghassemi is joined by co-lead authors Chantal Shaib, a graduate student at Northeastern University and visiting student at MIT, and Vinith Suriyakumar, an MIT graduate student; as well as Levent Sagun, a research scientist at Meta, and Byron Wallace, the Sy and Laurie Sternberg Interdisciplinary Associate Professor and associate dean of research at Northeastern University’s Khoury College of Computer Sciences. A paper describing the work will be presented at the Conference on Neural Information Processing Systems.

    Stuck on syntax

LLMs are trained on a vast amount of text from the internet. During this training process, the model learns to understand the relationships between words and phrases, knowledge it uses later when responding to queries.

In prior work, the researchers found that LLMs pick up patterns in the parts of speech that frequently appear together in training data. They call these part-of-speech patterns “syntactic templates.”

LLMs need this understanding of syntax, along with semantic knowledge, to answer questions in a particular domain.

“In the news domain, for instance, there is a particular style of writing. So, not only is the model learning the semantics, it is also learning the underlying structure of how sentences should be put together to follow a specific style for that domain,” Shaib explains.

But in this research, they determined that LLMs learn to associate these syntactic templates with specific domains. The model may incorrectly rely solely on this learned association when answering questions, rather than on an understanding of the query and subject matter.

For instance, an LLM might learn that a question like “Where is Paris located?” is structured as adverb/verb/proper noun/verb. If there are many examples of that sentence construction in the model’s training data, the LLM may associate that syntactic template with questions about countries.

So, if the model is given a new question with the same grammatical structure but nonsense words, like “Quickly sit Paris clouded?” it might answer “France” even though that answer makes no sense.
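The Paris example can be sketched in a few lines of Python. This is a toy illustration, not the study’s method: the hand-made `LEXICON` and the `syntactic_template` helper are hypothetical stand-ins for a real part-of-speech tagger, used only to show that two questions with entirely different words can share one syntactic template.

```python
# Toy illustration of a "syntactic template": two questions whose words
# differ completely can still share the same part-of-speech sequence.
# The tiny hand-made lexicon below stands in for a real POS tagger.
LEXICON = {
    "where": "ADV", "quickly": "ADV",
    "is": "VERB", "sit": "VERB", "located": "VERB", "clouded": "VERB",
    "paris": "PROPN",
}

def syntactic_template(question: str) -> tuple:
    """Map each word to its part of speech, ignoring case and punctuation."""
    words = question.strip("?").lower().split()
    return tuple(LEXICON[w] for w in words)

real = syntactic_template("Where is Paris located?")
nonsense = syntactic_template("Quickly sit Paris clouded?")

print(real)              # ('ADV', 'VERB', 'PROPN', 'VERB')
print(real == nonsense)  # True: same template, very different meaning
```

A model that keyed on this template alone would treat both questions as “a question about countries,” which is exactly the failure mode described above.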

    “This is an overlooked type of association that the model learns in order to answer questions correctly. We should be paying closer attention to not only the semantics but the syntax of the data we use to train our models,” Shaib says.

Missing the meaning

The researchers tested this phenomenon by designing synthetic experiments in which only one syntactic template appeared in the model’s training data for each domain. They tested the models by substituting words with synonyms, antonyms, or random words, but kept the underlying syntax the same.

In each instance, they found that LLMs often still responded with the correct answer, even when the question was complete nonsense.

When they restructured the same question using a new part-of-speech pattern, the LLMs often failed to give the correct response, even though the underlying meaning of the question remained the same.

They used this approach to test pre-trained LLMs like GPT-4 and Llama, and found that this same learned behavior significantly lowered their performance.

Curious about the broader implications of these findings, the researchers studied whether someone could exploit this phenomenon to elicit harmful responses from an LLM that has been deliberately trained to refuse such requests.

They found that, by phrasing the question using a syntactic template the model associates with a “safe” dataset (one that doesn’t contain harmful information), they could trick the model into overriding its refusal policy and generating harmful content.
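A minimal version of this substitution probe might look as follows. Everything here is an assumption made for illustration: `ask` stands in for any model call, `RANDOM_NOUNS` and the two stub models are invented, and the agreement score is a simplification of the paper’s actual benchmark, not a reimplementation of it.

```python
import random

# Illustrative word pool for syntax-preserving nonsense substitutions.
RANDOM_NOUNS = ["spoon", "cloud", "ledger"]

def perturb_keep_syntax(question, noun, rng):
    """Replace the content noun with a random word, keeping the
    part-of-speech sequence (the syntactic template) unchanged."""
    return question.replace(noun, rng.choice(RANDOM_NOUNS))

def template_reliance(ask, question, noun, n=5, seed=0):
    """Fraction of syntax-preserving nonsense variants that still receive
    the original answer. 1.0 means the model keys entirely on the template."""
    rng = random.Random(seed)
    baseline = ask(question)
    variants = [perturb_keep_syntax(question, noun, rng) for _ in range(n)]
    return sum(ask(v) == baseline for v in variants) / n

# A pathological stub model that answers from the question's shape alone:
form_only = lambda q: "France"
# A stub that actually checks the content word before answering:
content_aware = lambda q: "France" if "Paris" in q else "unknown"

print(template_reliance(form_only, "Where is Paris located?", "Paris"))      # 1.0
print(template_reliance(content_aware, "Where is Paris located?", "Paris"))  # 0.0
```

A high score for a real model on probes like this would suggest it is answering from familiar phrasing rather than meaning, which is the behavior the benchmark is meant to surface.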

    “From this work, it is clear to me that we need more robust defenses to address security vulnerabilities in LLMs. In this paper, we identified a new vulnerability that arises due to the way LLMs learn. So, we need to figure out new defenses based on how LLMs learn language, rather than just ad hoc solutions to different vulnerabilities,” Suriyakumar says.

While the researchers didn’t explore mitigation strategies in this work, they developed an automatic benchmarking technique one could use to evaluate an LLM’s reliance on this incorrect syntax-domain correlation. This new test could help developers proactively address the shortcoming in their models, reducing safety risks and improving performance.

In the future, the researchers want to explore potential mitigation strategies, which could involve augmenting training data to provide a wider variety of syntactic templates. They are also interested in exploring this phenomenon in reasoning models, special types of LLMs designed to tackle multi-step tasks.

“I think this is a really creative angle to study failure modes of LLMs. This work highlights the importance of linguistic knowledge and analysis in LLM safety research, an aspect that hasn’t been at the center stage but clearly should be,” says Jessy Li, an associate professor at the University of Texas at Austin, who was not involved with this work.

This work is funded, in part, by a Bridgewater AIA Labs Fellowship, the National Science Foundation, the Gordon and Betty Moore Foundation, a Google Research Award, and Schmidt Sciences.
