    Beyond automatic differentiation

    Posted by Matthew Streeter, Software Engineer, Google Research

    Derivatives play a central role in optimization and machine learning. By locally approximating a training loss, derivatives guide an optimizer toward lower values of the loss. Automatic differentiation frameworks such as TensorFlow, PyTorch, and JAX are an essential part of modern machine learning, making it feasible to use gradient-based optimizers to train very complex models.

    But are derivatives all we need? By themselves, derivatives only tell us how a function behaves on an infinitesimal scale. To use derivatives effectively, we often need to know more than that. For example, to choose a learning rate for gradient descent, we need to know something about how the loss function behaves over a small but finite window. A finite-scale analogue of automatic differentiation, if it existed, could help us make such choices more effectively and thereby speed up training.

    In our new paper “Automatically Bounding The Taylor Remainder Series: Tighter Bounds and New Applications”, we present an algorithm called AutoBound that computes polynomial upper and lower bounds on a given function, which are valid over a user-specified interval. We then begin to explore AutoBound’s applications. Notably, we present a meta-optimizer called SafeRate that uses the upper bounds computed by AutoBound to derive learning rates that are guaranteed to monotonically reduce a given loss function, without the need for time-consuming hyperparameter tuning. We are also making AutoBound available as an open-source library.

    The AutoBound algorithm

    Given a function f and a reference point x0, AutoBound computes polynomial upper and lower bounds on f that hold over a user-specified interval called a trust region. Like Taylor polynomials, the bounding polynomials are equal to f at x0. The bounds become tighter as the trust region shrinks, and approach the corresponding Taylor polynomial as the trust region width approaches zero.

    Automatically-derived quadratic upper and lower bounds on a one-dimensional function f, centered at x0=0.5. The upper and lower bounds are valid over a user-specified trust region, and become tighter as the trust region shrinks.

    Like automatic differentiation, AutoBound can be applied to any function that can be implemented using standard mathematical operations. In fact, AutoBound is a generalization of Taylor mode automatic differentiation, and is equivalent to it in the special case where the trust region has a width of zero.

    To derive the AutoBound algorithm, there were two main challenges we had to address:

    1. We had to derive polynomial upper and lower bounds for various elementary functions, given an arbitrary reference point and arbitrary trust region.
    2. We had to come up with an analogue of the chain rule for combining these bounds.

    Bounds for elementary functions

    For a variety of commonly-used functions, we derive optimal polynomial upper and lower bounds in closed form. In this context, “optimal” means the bounds are as tight as possible, among all polynomials where only the maximum-degree coefficient differs from the Taylor series. Our theory applies to elementary functions, such as exp and log, and common neural network activation functions, such as ReLU and Swish. It builds upon and generalizes earlier work that applied only to quadratic bounds, and only for an unbounded trust region.

    Optimal quadratic upper and lower bounds on the exponential function, centered at x0=0.5 and valid over the interval [0, 2].
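
    As a rough illustration of the exp example in the caption above, the sketch below computes an interval for the quadratic coefficient by hand. It is not the AutoBound library's API: the helper names (`exp_quadratic_coeff_interval`, `quadratic`) are hypothetical, and the endpoint shortcut relies on the assumption, which holds for exp, that the Taylor remainder divided by (x - x0)^2 is increasing in x.

```python
import math

def exp_quadratic_coeff_interval(x0, a, b):
    """Return (lo, hi) such that, for every x in [a, b],
    exp(x0) + exp(x0)*(x - x0) + c*(x - x0)**2 is a lower bound on exp(x)
    when c = lo and an upper bound when c = hi."""
    def ratio(x):
        # Taylor remainder of exp at x0, divided by (x - x0)^2.
        d = x - x0
        if d == 0.0:
            return math.exp(x0) / 2.0  # limiting value: the Taylor coefficient
        return (math.exp(x) - math.exp(x0) * (1.0 + d)) / (d * d)
    # Assumption: for exp this ratio is increasing in x, so its extremes
    # over [a, b] are attained at the two endpoints.
    return ratio(a), ratio(b)

x0, a, b = 0.5, 0.0, 2.0
lo, hi = exp_quadratic_coeff_interval(x0, a, b)

def quadratic(c, x):
    # Degree-1 Taylor polynomial of exp at x0 plus c * (x - x0)^2.
    d = x - x0
    return math.exp(x0) * (1.0 + d) + c * d * d

print(f"quadratic coefficient interval: [{lo:.4f}, {hi:.4f}]")
print(f"Taylor coefficient exp(x0)/2:   {math.exp(x0) / 2.0:.4f}")
```

    With these inputs the interval is roughly [0.70, 1.45], which contains the Taylor coefficient of about 0.82. Shrinking [a, b] toward x0 pulls both ends toward that value, matching the statement that the bounds approach the Taylor polynomial.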

    A new chain rule

    To compute upper and lower bounds for arbitrary functions, we derived a generalization of the chain rule that operates on polynomial bounds. To illustrate the idea, suppose we have a function that can be written as

    f(x) = g(h(x))

    and suppose we already have polynomial upper and lower bounds on g and h. How do we compute bounds on f?

    The key turns out to be representing the upper and lower bounds for a given function as a single polynomial whose highest-degree coefficient is an interval rather than a scalar. We can then plug the bound for h into the bound for g, and convert the result back to a polynomial of the same form using interval arithmetic. Under suitable assumptions about the trust region over which the bound on g holds, it can be shown that this procedure yields the desired bound on f.

    The interval polynomial chain rule applied to the functions h(x) = sqrt(x) and g(y) = exp(y), with x0=0.25 and trust region [0, 0.5].

    Our chain rule applies to one-dimensional functions, but also to multivariate functions, such as matrix multiplications and convolutions.
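
    For the one-dimensional example with h(x) = sqrt(x) and g(y) = exp(y), the plug-in-and-collapse step might be sketched as follows. This is a simplified, hand-written illustration and not AutoBound's implementation: the interval helpers (`i_add`, `i_mul`, `i_sq`) and `remainder_ratio` are hypothetical names, the endpoint formula for the elementary bounds assumes a monotone second derivative (true for sqrt and exp), and the resulting interval is valid but not necessarily as tight as what AutoBound computes.

```python
import math

# Minimal interval arithmetic on (lo, hi) pairs.
def i_add(p, q):
    return (p[0] + q[0], p[1] + q[1])

def i_mul(p, q):
    prods = [p[0] * q[0], p[0] * q[1], p[1] * q[0], p[1] * q[1]]
    return (min(prods), max(prods))

def i_sq(p):
    lo, hi = p
    top = max(lo * lo, hi * hi)
    return (0.0, top) if lo <= 0.0 <= hi else (min(lo * lo, hi * hi), top)

def remainder_ratio(f, df, x0, x):
    # (f(x) - f(x0) - f'(x0) * (x - x0)) / (x - x0)^2, for x != x0.
    d = x - x0
    return (f(x) - f(x0) - df(x0) * d) / (d * d)

x0, a, b = 0.25, 0.0, 0.5

# Interval polynomial for h(x) = sqrt(x): h0 + h1*d + I_h*d^2 with d = x - x0.
h = math.sqrt
dh = lambda x: 0.5 / math.sqrt(x)
h0, h1 = h(x0), dh(x0)
I_h = (remainder_ratio(h, dh, x0, a), remainder_ratio(h, dh, x0, b))

# Interval polynomial for g(y) = exp(y), valid over the range of h on [a, b].
y0, ya, yb = h0, h(a), h(b)
g0 = g1 = math.exp(y0)
I_g = (remainder_ratio(math.exp, math.exp, y0, ya),
       remainder_ratio(math.exp, math.exp, y0, yb))

# Plug the bound for h into the bound for g. With u = h(x) - h0 = d*(h1 + c_h*d):
#   f(x) = g0 + g1*u + c_g*u^2
#        = g0 + g1*h1*d + [g1*c_h + c_g*(h1 + c_h*d)^2] * d^2,
# and the bracket is enclosed using interval arithmetic over d in [a - x0, b - x0].
D = (a - x0, b - x0)
inner = i_add((h1, h1), i_mul(I_h, D))
I_f = i_add(i_mul((g1, g1), I_h), i_mul(I_g, i_sq(inner)))
f0, f1 = g0, g1 * h1

def f_bounds(x):
    # Lower and upper quadratic bounds on exp(sqrt(x)) at a point x in [a, b].
    d = x - x0
    base = f0 + f1 * d
    return base + I_f[0] * d * d, base + I_f[1] * d * d

print("quadratic coefficient interval for f:", I_f)
```

    The result has the same shape as the inputs: a polynomial in (x - x0) whose highest-degree coefficient is an interval, so it can be fed into the next operation in the same way.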

    Propagating bounds

    Using our new chain rule, AutoBound propagates interval polynomial bounds through a computation graph from the inputs to the outputs, analogous to forward-mode automatic differentiation.

    Forward propagation of interval polynomial bounds for the function f(x) = exp(sqrt(x)). We first compute (trivial) bounds on x, then use the chain rule to compute bounds on sqrt(x) and exp(sqrt(x)).

    To compute bounds on a function f(x), AutoBound requires memory proportional to the dimension of x. For this reason, practical applications apply AutoBound to functions with a small number of inputs. However, as we will see, this does not prevent us from using AutoBound for neural network optimization.

    Automatically deriving optimizers, and other applications

    What can we do with AutoBound that we could not do with automatic differentiation alone?

    Among other things, AutoBound can be used to automatically derive problem-specific, hyperparameter-free optimizers that converge from any starting point. These optimizers iteratively reduce a loss by first using AutoBound to compute an upper bound on the loss that is tight at the current point, and then minimizing the upper bound to obtain the next point.

    Minimizing a one-dimensional logistic regression loss using quadratic upper bounds derived automatically by AutoBound.

    Optimizers that use upper bounds in this way are called majorization-minimization (MM) optimizers. Applied to one-dimensional logistic regression, AutoBound rederives an MM optimizer first published in 2009. Applied to more complex problems, AutoBound derives novel MM optimizers that would be difficult to derive by hand.
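
    To make the MM loop concrete, here is a minimal sketch for one-dimensional logistic regression. It swaps in a simpler technique than the one described above: instead of AutoBound's trust-region bounds, it uses the classical global curvature bound (the second derivative of log(1 + exp(z)) never exceeds 1/4). The toy data and the function names are assumptions made for illustration.

```python
import math

def loss(w, xs, ys):
    # Logistic loss with labels in {-1, +1}.
    return sum(math.log1p(math.exp(-y * x * w)) for x, y in zip(xs, ys))

def grad(w, xs, ys):
    # Derivative of the loss with respect to the scalar weight w.
    return sum(-y * x / (1.0 + math.exp(y * x * w)) for x, y in zip(xs, ys))

def mm_minimize(w, xs, ys, steps=100):
    # Global curvature bound: loss''(w) <= sum(x^2) / 4 for every w.
    curv = 0.25 * sum(x * x for x in xs)
    losses = [loss(w, xs, ys)]
    for _ in range(steps):
        # Majorize: loss(v) <= loss(w) + grad*(v - w) + (curv/2)*(v - w)^2.
        # Minimize: the quadratic on the right is smallest at v = w - grad/curv.
        w = w - grad(w, xs, ys) / curv
        losses.append(loss(w, xs, ys))
    return w, losses

# Hypothetical toy data (not linearly separable, so a finite minimizer exists).
xs = [-2.0, -1.0, 0.5, 1.0, 3.0]
ys = [-1, 1, -1, 1, 1]
w_final, losses = mm_minimize(0.0, xs, ys)
print(f"w = {w_final:.4f}, loss {losses[0]:.4f} -> {losses[-1]:.4f}")
```

    Because each step minimizes a function that lies above the loss and touches it at the current point, the loss cannot increase, and no learning rate has to be chosen.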

    We can use a similar idea to take an existing optimizer such as Adam and convert it to a hyperparameter-free optimizer that is guaranteed to monotonically reduce the loss (in the full-batch setting). The resulting optimizer uses the same update direction as the original optimizer, but modifies the learning rate by minimizing a one-dimensional quadratic upper bound derived by AutoBound. We refer to the resulting meta-optimizer as SafeRate.

    Performance of SafeRate when used to train a single-hidden-layer neural network on a subset of the MNIST dataset, in the full-batch setting.

    Using SafeRate, we can create more robust variants of existing optimizers, at the cost of a single additional forward pass that increases the wall time for each step by a small factor (about 2x in the example above).
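
    The learning-rate step can be sketched in the same spirit. The code below is not SafeRate itself: it assumes a two-parameter logistic regression on hypothetical toy data, uses the sign of the gradient as a stand-in for a base optimizer's update direction, and replaces AutoBound's automatically derived bound with a hand-derived one (curvature along any direction d is at most a quarter of the sum of squared projections of the inputs onto d).

```python
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def loss(w, X, y):
    # Logistic loss with labels in {-1, +1}.
    return sum(math.log1p(math.exp(-yi * dot(xi, w))) for xi, yi in zip(X, y))

def grad(w, X, y):
    g = [0.0] * len(w)
    for xi, yi in zip(X, y):
        s = -yi / (1.0 + math.exp(yi * dot(xi, w)))
        for j, xij in enumerate(xi):
            g[j] += s * xij
    return g

def safe_learning_rate(g, d, X):
    """Minimize a quadratic upper bound on phi(eta) = loss(w - eta * d)."""
    slope = -dot(g, d)                              # phi'(0)
    curv = 0.25 * sum(dot(xi, d) ** 2 for xi in X)  # phi''(eta) <= curv for all eta
    if slope >= 0.0 or curv == 0.0:
        return 0.0  # d is not a descent direction: take no step
    # phi(eta) <= phi(0) + slope*eta + (curv/2)*eta^2, minimized at -slope/curv.
    return -slope / curv

def sign(v):
    return float((v > 0) - (v < 0))

# Hypothetical toy data (not separable through the origin).
X = [(1.0, 2.0), (2.0, -1.0), (-1.5, 0.5), (0.5, 1.0), (-1.0, -2.0)]
y = [1, 1, -1, -1, -1]

w = [0.0, 0.0]
losses = [loss(w, X, y)]
for _ in range(50):
    g = grad(w, X, y)
    d = [sign(gj) for gj in g]  # stand-in for the base optimizer's direction
    eta = safe_learning_rate(g, d, X)
    w = [wj - eta * dj for wj, dj in zip(w, d)]
    losses.append(loss(w, X, y))

print(f"loss {losses[0]:.4f} -> {losses[-1]:.4f}")
```

    The base optimizer only supplies the direction. The step length comes from the bound, so the loss cannot go up along that direction, whatever direction is supplied.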

    In addition to the applications just discussed, AutoBound can be used for verified numerical integration and to automatically prove sharper versions of Jensen’s inequality, a fundamental mathematical inequality used frequently in statistics and other fields.

    Improvement over classical bounds

    Bounding the Taylor remainder term automatically is not a new idea. A classical technique produces degree k polynomial bounds on a function f that are valid over a trust region [a, b] by first computing an expression for the kth derivative of f (using automatic differentiation), then evaluating this expression over [a, b] using interval arithmetic.

    While elegant, this approach has some inherent limitations that can lead to very loose bounds, as illustrated by the dotted blue lines in the figure below.

    Quadratic upper and lower bounds on the loss of a multi-layer perceptron with two hidden layers, as a function of the initial learning rate. The bounds derived by AutoBound are much tighter than those obtained using interval arithmetic evaluation of the second derivative.
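
    The gap shows up even on a very small example. The sketch below is only an illustration on exp over [0, 2] with x0 = 0.5, not the multi-layer perceptron from the figure. It compares the classical quadratic coefficient interval, obtained by evaluating the second derivative over the whole trust region, with the sharp interval given by the remainder ratio at the endpoints. The variable names are hypothetical, and the endpoint shortcut again assumes the ratio is monotone, which holds for exp.

```python
import math

x0, a, b = 0.5, 0.0, 2.0

# Classical route: the Lagrange remainder is f''(xi)/2 * (x - x0)^2 for some
# xi in [a, b]. For f = exp, f'' = exp is increasing, so interval evaluation
# of f'' over [a, b] gives [exp(a), exp(b)].
classical = (math.exp(a) / 2.0, math.exp(b) / 2.0)

def ratio(x):
    # Taylor remainder of exp at x0, divided by (x - x0)^2, for x != x0.
    d = x - x0
    return (math.exp(x) - math.exp(x0) * (1.0 + d)) / (d * d)

# Sharp route: the ratio is increasing in x for exp, so its range over
# [a, b] is given by the two endpoints.
sharp = (ratio(a), ratio(b))

print(f"classical interval: [{classical[0]:.4f}, {classical[1]:.4f}]")
print(f"sharp interval:     [{sharp[0]:.4f}, {sharp[1]:.4f}]")
```

    Both intervals give valid quadratic bounds, but the sharp interval is contained in the classical one and is several times narrower, because the classical route treats the second derivative as if it could take its extreme value everywhere in the trust region.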

    Looking ahead

    Taylor polynomials have been in use for over 300 years, and are omnipresent in numerical optimization and scientific computing. Nevertheless, Taylor polynomials have significant limitations, which can limit the capabilities of algorithms built on top of them. Our work is part of a growing literature that recognizes these limitations and seeks to develop a new foundation upon which more robust algorithms can be built.

    Our experiments so far have only scratched the surface of what is possible using AutoBound, and we believe it has many applications we have not discovered. To encourage the research community to explore such possibilities, we have made AutoBound available as an open-source library built on top of JAX. To get started, visit our GitHub repo.

    Acknowledgements

    This post is based on joint work with Josh Dillon. We thank Alex Alemi and Sergey Ioffe for valuable feedback on an earlier draft of the post.
