    MIT researchers develop an efficient way to train more reliable AI agents | Ztoog


    Fields ranging from robotics to medicine to political science are attempting to train AI systems to make meaningful decisions of all kinds. For example, using an AI system to intelligently control traffic in a congested city could help motorists reach their destinations faster, while improving safety or sustainability.

    Unfortunately, teaching an AI system to make good decisions is no easy task.

    Reinforcement learning models, which underlie these AI decision-making systems, still often fail when faced with even small variations in the tasks they are trained to perform. In the case of traffic, a model might struggle to control a set of intersections with different speed limits, numbers of lanes, or traffic patterns.

    To boost the reliability of reinforcement learning models for complex tasks with variability, MIT researchers have introduced a more efficient algorithm for training them.

    The algorithm strategically selects the best tasks for training an AI agent so it can effectively perform all tasks in a collection of related tasks. In the case of traffic signal control, each task could be one intersection in a task space that includes all intersections in the city.

    By focusing on a smaller number of intersections that contribute the most to the algorithm’s overall effectiveness, this method maximizes performance while keeping the training cost low.

    The researchers found that their technique was between five and 50 times more efficient than standard approaches on an array of simulated tasks. This gain in efficiency helps the algorithm learn a better solution faster, ultimately improving the performance of the AI agent.

    “We were able to see incredible performance improvements, with a very simple algorithm, by thinking outside the box. An algorithm that is not very complicated stands a better chance of being adopted by the community because it is easier to implement and easier for others to understand,” says senior author Cathy Wu, the Thomas D. and Virginia W. Cabot Career Development Associate Professor in Civil and Environmental Engineering (CEE) and the Institute for Data, Systems, and Society (IDSS), and a member of the Laboratory for Information and Decision Systems (LIDS).

    She is joined on the paper by lead author Jung-Hoon Cho, a CEE graduate student; Vindula Jayawardana, a graduate student in the Department of Electrical Engineering and Computer Science (EECS); and Sirui Li, an IDSS graduate student. The research will be presented at the Conference on Neural Information Processing Systems.

    Finding a middle ground

    To train an algorithm to control traffic lights at many intersections in a city, an engineer would typically choose between two main approaches. She can train one algorithm for each intersection independently, using only that intersection’s data, or train a larger algorithm using data from all intersections and then apply it to each one.

    But each approach comes with its share of downsides. Training a separate algorithm for each task (such as a given intersection) is a time-consuming process that requires an enormous amount of data and computation, while training one algorithm for all tasks often leads to subpar performance.

    Wu and her collaborators sought a sweet spot between these two approaches.

    For their method, they choose a subset of tasks and train one algorithm for each task independently. Importantly, they strategically select the individual tasks that are most likely to improve the algorithm’s overall performance on all tasks.

    They leverage a common trick from the reinforcement learning field called zero-shot transfer learning, in which an already trained model is applied to a new task without being further trained. With transfer learning, the model often performs remarkably well on the new neighbor task.

    “We know it would be ideal to train on all the tasks, but we wondered if we could get away with training on a subset of those tasks, apply the result to all the tasks, and still see a performance increase,” Wu says.

    To identify which tasks they should select to maximize expected performance, the researchers developed an algorithm called Model-Based Transfer Learning (MBTL).

    The MBTL algorithm has two pieces. For one, it models how well each algorithm would perform if it were trained independently on one task. Then it models how much each algorithm’s performance would degrade if it were transferred to each other task, a concept known as generalization performance.
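
    The two pieces can be pictured as a table of estimated scores, one row per training task and one column per target task. The sketch below is only an illustration and not the researchers' implementation: it assumes each task is described by a single number (a hypothetical "context" such as a speed limit) and that performance drops linearly with the distance between contexts. The function name, the contexts, and the slope value are all made up for the example.

```python
import numpy as np


def estimated_transfer_matrix(train_perf, contexts, gap_slope):
    """Estimate how a model trained on task i would score on task j.

    train_perf[i]: estimated performance when trained and tested on task i
                   (the first piece described in the text).
    gap_slope:     ASSUMED performance lost per unit of context distance
                   (a stand-in for the second piece, the degradation model).
    """
    train_perf = np.asarray(train_perf, dtype=float)
    contexts = np.asarray(contexts, dtype=float)
    # Pairwise distance between task contexts, shape (n_tasks, n_tasks).
    distance = np.abs(contexts[:, None] - contexts[None, :])
    # Row i: start from the training performance on task i, then subtract
    # the assumed degradation for each target task j.
    return train_perf[:, None] - gap_slope * distance


# Hypothetical example: three intersections that differ only in speed limit.
contexts = [20.0, 30.0, 40.0]
train_perf = [1.0, 1.0, 1.0]
est = estimated_transfer_matrix(train_perf, contexts, gap_slope=0.02)
print(est)
```

    The diagonal holds the no-transfer case, and entries farther from the diagonal shrink as the tasks become less alike.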

    Explicitly modeling generalization performance allows MBTL to estimate the value of training on a new task.

    MBTL does this sequentially, choosing the task which leads to the highest performance gain first, then selecting additional tasks that provide the biggest subsequent marginal improvements to overall performance.

    Since MBTL only focuses on the most promising tasks, it can dramatically improve the efficiency of the training process.
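
    The sequential selection described above resembles a greedy procedure, which can be sketched as follows. This is a simplified illustration under stated assumptions, not the published algorithm: it takes a fixed table `est` of estimated transfer scores as given (here built from a hypothetical linear degradation), assumes each target task is served by whichever selected model scores best on it, and measures overall performance as the average over all targets. The function name `greedy_select_tasks` is invented for the example.

```python
import numpy as np


def greedy_select_tasks(est, budget):
    """Greedily choose up to `budget` training tasks.

    est[i, j] is the ESTIMATED score on target task j of a model trained
    on task i. Each round adds the task whose inclusion raises the average
    score across all targets the most.
    """
    est = np.asarray(est, dtype=float)
    n_sources, n_targets = est.shape
    selected = []
    # Best score available for each target from the tasks chosen so far.
    covered = np.full(n_targets, -np.inf)
    for _ in range(min(budget, n_sources)):
        best_score, best_task = -np.inf, None
        for i in range(n_sources):
            if i in selected:
                continue
            # Average score if task i were added to the selection.
            score = np.maximum(covered, est[i]).mean()
            if score > best_score:
                best_score, best_task = score, i
        selected.append(best_task)
        covered = np.maximum(covered, est[best_task])
    return selected, float(covered.mean())


# Hypothetical example: five tasks on a line, with an assumed loss of 0.1
# per step of distance between the training task and the target task.
contexts = np.arange(5, dtype=float)
est = 1.0 - 0.1 * np.abs(contexts[:, None] - contexts[None, :])
selected, score = greedy_select_tasks(est, budget=2)
print(selected, round(score, 2))
```

    In this toy setup the first pick is the middle task (index 2), because it sits closest on average to every other task. Later picks fill in whichever region is still poorly covered.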

    Reducing training costs

    When the researchers tested this technique on simulated tasks, including controlling traffic signals, managing real-time speed advisories, and executing several classic control tasks, it was five to 50 times more efficient than other methods.

    This means they could arrive at the same solution by training on far less data. For instance, with a 50x efficiency boost, the MBTL algorithm could train on just two tasks and achieve the same performance as a standard method which uses data from 100 tasks.
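
    The arithmetic behind that example is simply a ratio of task counts. The short sketch below spells it out, assuming (as a simplification not stated in the article) that every task costs the same amount to train on. The function name is illustrative.

```python
def efficiency_gain(baseline_tasks, selected_tasks):
    """Ratio of tasks a baseline trains on to tasks the selective method uses.

    Assumes every task has the same training cost, so the ratio of task
    counts equals the ratio of training effort.
    """
    if selected_tasks <= 0:
        raise ValueError("selected_tasks must be positive")
    return baseline_tasks / selected_tasks


# Figures from the article's example: 100 tasks versus 2 tasks.
gain = efficiency_gain(baseline_tasks=100, selected_tasks=2)
print(f"{gain:.0f}x")  # prints "50x"
```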

    “From the perspective of the two main approaches, that means data from the other 98 tasks was not necessary or that training on all 100 tasks is confusing to the algorithm, so the performance ends up worse than ours,” Wu says.

    With MBTL, adding even a small amount of additional training time could lead to much better performance.

    In the future, the researchers plan to design MBTL algorithms that can extend to more complex problems, such as high-dimensional task spaces. They are also interested in applying their approach to real-world problems, especially in next-generation mobility systems.

    The research is funded, in part, by a National Science Foundation CAREER Award, the Kwanjeong Educational Foundation PhD Scholarship Program, and an Amazon Robotics PhD Fellowship.
