    A framework for health equity assessment of machine learning performance – Google Research Blog

    Posted by Mike Schaekermann, Research Scientist, Google Research, and Ivor Horn, Chief Health Equity Officer & Director, Google Core

    Health equity is a major societal concern worldwide, with disparities having many causes. These sources include limitations in access to healthcare, differences in clinical treatment, and even fundamental differences in diagnostic technology. In dermatology, for example, skin cancer outcomes are worse for populations such as minorities, those with lower socioeconomic status, or individuals with limited healthcare access. While there is great promise in recent advances in machine learning (ML) and artificial intelligence (AI) to help improve healthcare, this transition from research to bedside must be accompanied by a careful understanding of whether and how they impact health equity.

    Health equity is defined by public health organizations as fairness of opportunity for everyone to be as healthy as possible. Importantly, equity may be different from equality. For example, people with greater barriers to improving their health may require more or different effort to experience this fair opportunity. Similarly, equity is not fairness as defined in the AI for healthcare literature. Whereas AI fairness often strives for equal performance of the AI technology across different patient populations, this does not center the goal of prioritizing performance with respect to pre-existing health disparities.

    Health equity considerations. An intervention (e.g., an ML-based tool, indicated in dark blue) promotes health equity if it helps reduce existing disparities in health outcomes (indicated in lighter blue).

    In “Health Equity Assessment of machine Learning performance (HEAL): a framework and dermatology AI model case study”, published in The Lancet eClinicalMedicine, we propose a methodology to quantitatively assess whether ML-based health technologies perform equitably. In other words, does the ML model perform well for those with the worst health outcomes for the condition(s) the model is meant to address? This goal anchors on the principle that health equity should prioritize and measure model performance with respect to disparate health outcomes, which may be due to a number of factors that include structural inequities (e.g., demographic, social, cultural, political, economic, environmental and geographic).

    The health equity framework (HEAL)

    The HEAL framework proposes a 4-step process to estimate the likelihood that an ML-based health technology performs equitably:

    1. Identify factors associated with health inequities and define tool performance metrics,
    2. Identify and quantify pre-existing health disparities,
    3. Measure the performance of the tool for each subpopulation,
    4. Measure the likelihood that the tool prioritizes performance with respect to health disparities.

    The final step’s output is termed the HEAL metric, which quantifies how anticorrelated the ML model’s performance is with health disparities. In other words, does the model perform better with populations that have the worse health outcomes?
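To make "anticorrelated" concrete, the sketch below computes a Spearman rank correlation between health outcome quality and model performance across subpopulations. The subgroup names, the numbers, and the helpers `average_ranks` and `spearman` are all illustrative assumptions rather than values or code from the paper. Under this sign convention, a negative correlation means the model tends to perform better where health outcomes are worse.

```python
def average_ranks(values):
    """1-based ascending ranks; tied values share their average rank."""
    return [
        sum(v < x for v in values) + (sum(v == x for v in values) + 1) / 2
        for x in values
    ]


def spearman(x, y):
    """Spearman rank correlation, i.e. the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when one ranking is constant
    return cov / (var_x * var_y) ** 0.5


# Hypothetical inputs: disease burden (e.g., YLLs per 100,000; higher is worse)
# and model performance (e.g., top-3 agreement) for four made-up subgroups.
burden = {"group_a": 120.0, "group_b": 80.0, "group_c": 45.0, "group_d": 30.0}
top3 = {"group_a": 0.78, "group_b": 0.74, "group_c": 0.75, "group_d": 0.70}

groups = sorted(burden)
outcome_quality = [-burden[g] for g in groups]  # higher value = better outcomes
performance = [top3[g] for g in groups]

rho = spearman(outcome_quality, performance)
print(round(rho, 2))  # -0.8
```

Here the subgroup with the heaviest burden also has the highest performance, so the correlation is strongly negative. A single correlation value like this is only a point estimate; it says nothing about sampling uncertainty.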

    This 4-step process is designed to inform improvements for making ML model performance more equitable, and is meant to be iterative and re-evaluated on a regular basis. For example, the availability of health outcomes data in step (2) can inform the choice of demographic factors and brackets in step (1), and the framework can be applied again with new datasets, models and populations.

    Framework for Health Equity Assessment of machine Learning performance (HEAL). Our guiding principle is to avoid exacerbating health inequities, and these steps help us identify disparities and assess for inequitable model performance to move towards better outcomes for all.

    With this work, we take a step towards encouraging explicit assessment of the health equity considerations of AI technologies, and encourage prioritization of efforts during model development to reduce health inequities for subpopulations exposed to structural inequities that can precipitate disparate outcomes. We should note that the present framework does not model causal relationships and, therefore, cannot quantify the actual impact a new technology will have on reducing health outcome disparities. However, the HEAL metric may help identify opportunities for improvement, where the current performance is not prioritized with respect to pre-existing health disparities.

    Case study on a dermatology model

    As an illustrative case study, we applied the framework to a dermatology model, which uses a convolutional neural network similar to that described in prior work. This example dermatology model was trained to classify 288 skin conditions using a development dataset of 29k cases. The input to the model consists of three photos of a skin concern along with demographic information and a brief structured medical history. The output consists of a ranked list of possible matching skin conditions.

    Using the HEAL framework, we evaluated this model by assessing whether it prioritized performance with respect to pre-existing health outcomes. The model was designed to predict possible dermatologic conditions (from a list of hundreds) based on photos of a skin concern and patient metadata. Evaluation of the model is done using a top-3 agreement metric, which quantifies how often the top 3 output conditions match the most likely condition as suggested by a dermatologist panel. The HEAL metric is computed via the anticorrelation of this top-3 agreement with health outcome rankings.
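A top-3 agreement score of the kind described above can be sketched in a few lines. This version assumes exactly one reference condition per case (the panel's most likely condition) and counts a hit when that condition appears among the model's first three ranked outputs. The function name `top3_agreement` and the example cases are hypothetical, and the paper's handling of ties or multiple reference conditions may differ.

```python
def top3_agreement(ranked_predictions, reference_labels):
    """Fraction of cases whose reference condition is in the model's top 3.

    ranked_predictions: one list of condition names per case, best guess first.
    reference_labels: one reference condition per case (assumed single label).
    """
    if len(ranked_predictions) != len(reference_labels):
        raise ValueError("need exactly one reference label per case")
    if not reference_labels:
        raise ValueError("need at least one case")
    hits = sum(
        reference in predictions[:3]
        for predictions, reference in zip(ranked_predictions, reference_labels)
    )
    return hits / len(reference_labels)


# Hypothetical cases, for illustration only.
predictions = [
    ["eczema", "psoriasis", "tinea"],
    ["acne", "rosacea", "folliculitis"],
    ["melanoma", "nevus", "seborrheic keratosis"],
    ["urticaria", "eczema", "contact dermatitis", "scabies"],
]
references = ["psoriasis", "acne", "basal cell carcinoma", "contact dermatitis"]

print(top3_agreement(predictions, references))  # 0.75
```

Computing this score separately for each subpopulation gives the per-subgroup performance values that step 3 of the framework calls for.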

    We used a dataset of 5,420 teledermatology cases, enriched for diversity in age, sex and race/ethnicity, to retrospectively evaluate the model’s HEAL metric. The dataset consisted of “store-and-forward” cases from patients of 20 years or older from primary care providers in the USA and skin cancer clinics in Australia. Based on a review of the literature, we decided to explore race/ethnicity, sex and age as potential factors of inequity, and used sampling techniques to ensure that our evaluation dataset had sufficient representation of all race/ethnicity, sex and age groups. To quantify pre-existing health outcomes for each subgroup we relied on measurements from public databases endorsed by the World Health Organization, such as Years of Life Lost (YLLs) and Disability-Adjusted Life Years (DALYs; years of life lost plus years lived with disability).
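The burden measures above are usually reported as rates per 100,000 people so that subgroups of different sizes can be compared. The snippet below is a minimal sketch of that arithmetic, using the definition DALYs = YLLs + years lived with disability (YLDs). The function names and the input figures are made up for illustration and are not taken from the study or from any public database.

```python
def rate_per_100k(count, population):
    """Express a raw count as a rate per 100,000 people."""
    if population <= 0:
        raise ValueError("population must be positive")
    return count * 100_000 / population


def daly_rate_per_100k(yll, yld, population):
    """DALYs are years of life lost plus years lived with disability."""
    return rate_per_100k(yll + yld, population)


# Hypothetical subgroup: 1,200 YLLs and 800 YLDs in a population of 2 million.
print(rate_per_100k(1_200, 2_000_000))            # 60.0
print(daly_rate_per_100k(1_200, 800, 2_000_000))  # 100.0
```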

    HEAL metric for all dermatologic conditions across race/ethnicity subpopulations, including health outcomes (YLLs per 100,000), model performance (top-3 agreement), and rankings for health outcomes and tool performance. (* Higher is better; measures the likelihood the model performs equitably with respect to the axes in this table.)

    HEAL metric for all dermatologic conditions across sexes, including health outcomes (DALYs per 100,000), model performance (top-3 agreement), and rankings for health outcomes and tool performance. (* As above.)

    Our analysis estimated that the model was 80.5% likely to perform equitably across race/ethnicity subgroups and 92.1% likely to perform equitably across sexes.
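One way a "likelihood of performing equitably" could be estimated is by bootstrapping: resample the evaluation cases within each subgroup, recompute per-subgroup performance, and record how often performance comes out anticorrelated with health outcomes. The sketch below takes that approach as an assumption; the paper's exact procedure may differ, and the function `heal_likelihood`, its parameters, and the data are hypothetical.

```python
import random


def average_ranks(values):
    """1-based ascending ranks; tied values share their average rank."""
    return [
        sum(v < x for v in values) + (sum(v == x for v in values) + 1) / 2
        for x in values
    ]


def spearman(x, y):
    """Spearman rank correlation; NaN if either ranking is constant."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5 if vx > 0 and vy > 0 else float("nan")


def heal_likelihood(hits_by_group, burden_by_group, n_boot=2000, seed=0):
    """Share of bootstrap resamples in which performance is anticorrelated
    with health outcome quality (assumed estimator, not the paper's code)."""
    rng = random.Random(seed)
    groups = sorted(hits_by_group)
    quality = [-burden_by_group[g] for g in groups]  # higher = better outcomes
    favourable = 0
    for _ in range(n_boot):
        performance = []
        for g in groups:
            cases = hits_by_group[g]
            resample = rng.choices(cases, k=len(cases))  # with replacement
            performance.append(sum(resample) / len(resample))
        if spearman(quality, performance) < 0:  # NaN compares as False
            favourable += 1
    return favourable / n_boot


# Hypothetical per-case top-3 hits (1 = hit, 0 = miss) and burden per 100,000.
hits = {
    "group_a": [1] * 180 + [0] * 20,
    "group_b": [1] * 160 + [0] * 40,
    "group_c": [1] * 140 + [0] * 60,
}
burden = {"group_a": 120.0, "group_b": 80.0, "group_c": 40.0}

likelihood = heal_likelihood(hits, burden)
print(f"{likelihood:.3f}")
```

Because the made-up data give the highest-burden subgroup clearly the best performance, nearly every resample yields a negative correlation and the estimate lands close to 1. Real evaluation data with smaller gaps between subgroups would produce intermediate values.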

    However, while the model was likely to perform equitably across age groups for cancer conditions specifically, we discovered that it had room for improvement across age groups for non-cancer conditions. For example, those 70+ have the poorest health outcomes related to non-cancer skin conditions, yet the model did not prioritize performance for this subgroup.

    HEAL metrics for all cancer and non-cancer dermatologic conditions across age groups, including health outcomes (DALYs per 100,000), model performance (top-3 agreement), and rankings for health outcomes and tool performance. (* As above.)

    Putting things in context

    For holistic evaluation, the HEAL metric cannot be employed in isolation. Instead, this metric should be contextualized alongside many other factors ranging from computational efficiency and data privacy to ethical values, and aspects that may influence the results (e.g., selection bias or differences in representativeness of the evaluation data across demographic groups).

    As an adversarial example, the HEAL metric can be artificially improved by deliberately reducing model performance for the most advantaged subpopulation until performance for that subpopulation is worse than all others. For illustrative purposes, given subpopulations A and B where A has worse health outcomes than B, consider the choice between two models: Model 1 (M1) performs 5% better for subpopulation A than for subpopulation B. Model 2 (M2) performs 5% worse on subpopulation A than B. The HEAL metric would be higher for M1 because it prioritizes performance on a subpopulation with worse outcomes. However, M1 may have absolute performances of just 75% and 70% for subpopulations A and B respectively, while M2 has absolute performances of 75% and 80% for subpopulations A and B respectively. Choosing M1 over M2 would lead to worse overall performance: subpopulation B is worse off (70% versus 80%) while subpopulation A is no better off (75% under both models).
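The numbers in this example can be laid out directly. The sketch below uses the performances stated above for M1 and M2; the `summarise` helper and the unweighted mean across the two subpopulations are illustrative assumptions.

```python
# Performances from the example: subpopulation A has worse health outcomes than B.
models = {
    "M1": {"A": 0.75, "B": 0.70},
    "M2": {"A": 0.75, "B": 0.80},
}


def summarise(performance):
    """Report whether A is prioritised and the unweighted mean performance."""
    gap = performance["A"] - performance["B"]
    mean = sum(performance.values()) / len(performance)
    return {
        "gap_a_minus_b": round(gap, 2),
        "prioritises_a": gap > 0,
        "mean": round(mean, 3),
    }


for name, performance in models.items():
    print(name, summarise(performance))
```

M1 is the only model that prioritises subpopulation A, which is what a ranking-based equity measure rewards, yet its mean performance (0.725) is lower than M2's (0.775) and it is no better than M2 for either subpopulation.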

    Accordingly, the HEAL metric should be used alongside a Pareto condition (discussed further in the paper), which restricts model changes so that outcomes for each subpopulation are either unchanged or improved compared to the status quo; that is, performance does not worsen for any subpopulation.
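A Pareto condition of this kind amounts to a simple per-subpopulation comparison against the status quo. The check below is a minimal sketch under that reading; the function name `satisfies_pareto`, the `tolerance` parameter, the age brackets, and all numbers are hypothetical and not taken from the paper.

```python
def satisfies_pareto(status_quo, candidate, tolerance=0.0):
    """True if the candidate is no worse than the status quo for every subgroup.

    Both arguments map subgroup name -> performance (higher is better).
    `tolerance` allows for small measurement noise (an assumed convenience).
    """
    missing = set(status_quo) - set(candidate)
    if missing:
        raise ValueError(f"candidate lacks subgroups: {sorted(missing)}")
    return all(
        candidate[group] >= status_quo[group] - tolerance for group in status_quo
    )


# Hypothetical performance by age bracket.
status_quo = {"20-39": 0.72, "40-69": 0.70, "70+": 0.61}
improves_70_plus = {"20-39": 0.72, "40-69": 0.70, "70+": 0.66}
trades_off = {"20-39": 0.68, "40-69": 0.70, "70+": 0.70}

print(satisfies_pareto(status_quo, improves_70_plus))  # True
print(satisfies_pareto(status_quo, trades_off))        # False
```

The second candidate raises performance for the 70+ bracket more than the first, but it does so at the expense of the 20-39 bracket, so it fails the check.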

    The HEAL framework, in its current form, assesses the likelihood that an ML-based model prioritizes performance with respect to pre-existing health disparities for specific subpopulations. This differs from the goal of understanding whether ML will reduce disparities in outcomes across subpopulations in reality. Specifically, modeling improvements in outcomes requires a causal understanding of steps in the care journey that happen both before and after use of any given model. Future research is needed to address this gap.

    Conclusion

    The HEAL framework enables a quantitative assessment of the likelihood that health AI technologies prioritize performance with respect to health disparities. The case study demonstrates how to apply the framework in the dermatological domain, indicating a high likelihood that model performance is prioritized with respect to health disparities across sex and race/ethnicity, but also revealing the potential for improvements for non-cancer conditions across age. The case study also illustrates limitations in the ability to apply all recommended aspects of the framework (e.g., mapping societal context, availability of data), thus highlighting the complexity of health equity considerations of ML-based tools.

    This work is a proposed approach to address a grand challenge for AI and health equity, and may provide a useful evaluation framework not only during model development, but also during pre-implementation and real-world monitoring stages, e.g., in the form of health equity dashboards. We hold that the strength of the HEAL framework is in its future application to various AI tools and use cases and its refinement in the process. Finally, we acknowledge that a successful approach towards understanding the impact of AI technologies on health equity needs to be more than a set of metrics. It will require a set of goals agreed upon by a community that represents those who will be most impacted by a model.

    Acknowledgements

    The research described here is joint work across many teams at Google. We are grateful to all our co-authors: Terry Spitz, Malcolm Pyles, Heather Cole-Lewis, Ellery Wulczyn, Stephen R. Pfohl, Donald Martin, Jr., Ronnachai Jaroensri, Geoff Keeling, Yuan Liu, Stephanie Farquhar, Qinghan Xue, Jenna Lester, Cían Hughes, Patricia Strachan, Fraser Tan, Peggy Bui, Craig H. Mermel, Lily H. Peng, Yossi Matias, Greg S. Corrado, Dale R. Webster, Sunny Virmani, Christopher Semturs, Yun Liu, and Po-Hsuan Cameron Chen. We also thank Lauren Winer, Sami Lachgar, Ting-An Lin, Aaron Loh, Morgan Du, Jenny Rizk, Renee Wong, Ashley Carrick, Preeti Singh, Annisah Um’rani, Jessica Schrouff, Alexander Brown, and Anna Iurchenko for their support of this project.
