    On-device content distillation with graph neural networks – Google Research Blog


    Posted by Gabriel Barcik and Duc-Hieu Tran, Research Engineers, Google Research

    In today’s digital age, smartphones and desktop web browsers serve as the primary tools for accessing news and information. However, the proliferation of website clutter (complex layouts, navigation elements, and extraneous links) significantly impairs both the reading experience and article navigation. This issue is particularly acute for individuals with accessibility requirements.

    To improve the user experience and make reading more accessible, Android and Chrome users may leverage the Reading Mode feature, which enhances accessibility by processing webpages to allow customizable contrast, adjustable text size, and more legible fonts, and to enable text-to-speech utilities. Additionally, Android’s Reading Mode is equipped to distill content from apps. Expanding Reading Mode to encompass a wide array of content and improving its performance, while still operating locally on the user’s device without transmitting data externally, poses a unique challenge.

    To broaden Reading Mode capabilities without compromising privacy, we have developed a novel on-device content distillation model. Unlike early attempts using DOM Distiller, a heuristic approach limited to news articles, our model excels in both quality and versatility across various types of content. We ensure that article content doesn’t leave the confines of the local environment. Our on-device content distillation model smoothly transforms long-form content into a simple and customizable layout for a more pleasant reading journey while also outperforming the leading alternative approaches. Here we explore details of this research, highlighting our approach, methodology, and results.

    Graph neural networks

    Instead of relying on complicated heuristics that are difficult to maintain and scale to a variety of article layouts, we approach this task as a fully supervised learning problem. This data-driven approach allows the model to generalize better across different layouts, without the constraints and fragility of heuristics. Previous work for optimizing the reading experience relied on HTML or on parsing, filtering, and modeling of a document object model (DOM), a programming interface automatically generated by the user’s web browser from site HTML that represents the structure of a document and allows it to be manipulated.

    The new Reading Mode model relies on accessibility trees, which provide a streamlined and more accessible representation of the DOM. Accessibility trees are automatically generated from the DOM tree and are used by assistive technologies to allow people with disabilities to interact with web content. They are available in the Chrome web browser and on Android through AccessibilityNodeInfo objects, which are provided for both WebView and native application content.

    We started by manually collecting and annotating accessibility trees. The Android dataset used for this project comprises on the order of 10k labeled examples, while the Chrome dataset contains approximately 100k labeled examples. We developed a novel tool that uses graph neural networks (GNNs) to distill essential content from the accessibility trees using a multi-class supervised learning approach. The datasets consist of long-form articles sampled from the web and labeled with classes such as headline, paragraph, images, publication date, etc.

    GNNs are a natural choice for dealing with tree-like data structures, because unlike traditional models that often demand detailed, hand-crafted features to understand the layout and links within such trees, GNNs learn these connections naturally. To illustrate this, consider the analogy of a family tree. In such a tree, each node represents a family member and the connections denote familial relationships. If one were to predict certain traits using conventional models, features like the “number of immediate family members with a trait” might be needed. However, with GNNs, such manual feature crafting becomes redundant. By directly feeding the tree structure into the model, GNNs utilize a message-passing mechanism where each node communicates with its neighbors. Over time, information gets shared and accumulated across the network, enabling the model to naturally discern intricate relationships.
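
    The family-tree analogy can be made concrete with a minimal sketch of one message-passing scheme. Everything here is illustrative: the names, the trait values, and the fixed averaging rule are assumptions chosen for readability, whereas a real GNN learns its message and update functions from data.

```python
# Toy family tree as (parent, child) pairs. Names and trait values are made up.
edges = [("ann", "bob"), ("ann", "cat"), ("bob", "dan"), ("bob", "eve")]
has_trait = {"ann": 1.0, "bob": 0.0, "cat": 1.0, "dan": 0.0, "eve": 1.0}


def build_neighbors(edges):
    """Build an undirected adjacency list so messages flow up and down the tree."""
    neighbors = {}
    for parent, child in edges:
        neighbors.setdefault(parent, []).append(child)
        neighbors.setdefault(child, []).append(parent)
    return neighbors


def message_passing_step(state, neighbors):
    """Blend each node's own state with the mean of its neighbors' states.

    This fixed 50/50 blend stands in for the learned update of a real GNN.
    """
    new_state = {}
    for node, value in state.items():
        incoming = [state[n] for n in neighbors.get(node, [])]
        mean_message = sum(incoming) / len(incoming) if incoming else 0.0
        new_state[node] = 0.5 * value + 0.5 * mean_message
    return new_state


neighbors = build_neighbors(edges)
after_one = message_passing_step(has_trait, neighbors)
after_two = message_passing_step(after_one, neighbors)
print(after_one["dan"], after_two["dan"])
```

    In this toy run, "dan" has only one neighbor ("bob"), and neither carries the trait, so his state is still zero after one step. After the second step it becomes positive, because information from "ann" and "eve" has reached him through "bob". No hand-written "count the relatives with the trait" feature is needed.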

    Returning to the context of accessibility trees, this means that GNNs can efficiently distill content by understanding and leveraging the inherent structure and relationships within the tree. This capability allows them to identify and possibly omit non-essential sections based on the information flow within the tree, ensuring more accurate content distillation.

    Our architecture heavily follows the encode-process-decode paradigm using a message-passing neural network to classify text nodes. The overall design is illustrated in the figure below. The tree representation of the article is the input to the model. We compute lightweight features based on bounding box information, text information, and accessibility roles. The GNN then propagates each node’s latent representation through the edges of the tree using a message-passing neural network. This propagation process allows nearby nodes, containers, and text elements to share contextual information with each other, enhancing the model’s understanding of the page’s structure and content. Each node then updates its current state based on the message received, providing a more informed basis for classifying the nodes. After a fixed number of message-passing steps, the now contextualized latent representations of the nodes are decoded into essential or non-essential classes. This approach allows the model to leverage both the inherent relationships in the tree and the hand-crafted features representing each node, thereby enriching the final classification.

    A visual demonstration of the algorithm in action, processing an article on a mobile device. A graph neural network (GNN) is used to distill essential content from an article. 1. A tree representation of the article is extracted from the application. 2. Lightweight features are computed for each node, represented as vectors. 3. A message-passing neural network propagates information through the edges of the tree and updates each node representation. 4. Leaf nodes containing text content are classified as essential or non-essential content. 5. A decluttered version of the application is composed based on the GNN output.
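
    The encode-process-decode flow described above can be sketched in plain Python. This is only a structural illustration under stated assumptions: the layer sizes, the three-value feature vectors, the mean aggregation, and the randomly initialized (untrained) weights are all hypothetical, so the resulting scores carry no meaning. The sketch also scores every node, whereas the model in the post classifies leaf text nodes.

```python
import math
import random


def linear(weights, bias, x):
    """Dense layer; `weights` holds one row of input weights per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def init_layer(rng, n_in, n_out):
    """Random, untrained weights; a real model would learn these."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def encode_process_decode(features, edges, params, steps=2):
    # Encode: map each node's raw features to a latent vector.
    latent = {n: relu(linear(*params["encoder"], f)) for n, f in features.items()}
    neighbors = {n: [] for n in features}
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)
    dim = len(params["encoder"][1])

    # Process: a fixed number of message-passing steps over the tree edges.
    for _ in range(steps):
        updated = {}
        for n, h in latent.items():
            msgs = [latent[m] for m in neighbors[n]]
            agg = [sum(col) / len(msgs) for col in zip(*msgs)] if msgs else [0.0] * dim
            # Update from the node's own state concatenated with the mean message.
            updated[n] = relu(linear(*params["update"], h + agg))
        latent = updated

    # Decode: turn each contextualized latent vector into an "essential" score.
    scores = {}
    for n, h in latent.items():
        logit = linear(*params["decoder"], h)[0]
        scores[n] = 1.0 / (1.0 + math.exp(-logit))
    return scores


rng = random.Random(0)
params = {
    "encoder": init_layer(rng, 3, 4),
    "update": init_layer(rng, 8, 4),   # input is own state (4) + aggregated message (4)
    "decoder": init_layer(rng, 4, 1),
}
# Hypothetical features per node: [is_text_leaf, relative_width, relative_top].
features = {
    "root": [0.0, 1.0, 0.0],
    "nav": [1.0, 0.2, 0.0],
    "headline": [1.0, 0.9, 0.1],
    "paragraph": [1.0, 0.9, 0.3],
}
edges = [("root", "nav"), ("root", "headline"), ("root", "paragraph")]
scores = encode_process_decode(features, edges, params)
```

    Training would fit the three sets of weights so that the decoded scores match the labeled classes; the control flow stays the same.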

    We deliberately restrict the feature set used by the model to improve its generalization across languages and reduce inference latency on user devices. This was a unique challenge, as we needed to create a lightweight on-device model that could preserve privacy.

    Our final lightweight Android model has 64k parameters and is 334kB in size with a median latency of 800ms, while the Chrome model has 241k parameters, is 928kB in size, and has a 378ms median latency. By employing such on-device processing, we ensure that user data never leaves the device, reinforcing our responsible approach and commitment to user privacy. The features used in the model can be grouped into intermediate node features, leaf-node text features, and element position features. We performed feature engineering and feature selection to optimize the set of features for model performance and model size. The final model was converted into TensorFlow Lite format to deploy as an on-device model on Android or Chrome.
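
    To give a feel for what the three feature groups might look like, here is a minimal sketch of a per-node feature function. The `Node` class, the role vocabulary, and the specific features are hypothetical stand-ins and are not the feature set used in the actual model; the post does not list its individual features.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for one accessibility tree node."""
    role: str
    text: str = ""
    bounds: tuple = (0, 0, 0, 0)  # left, top, right, bottom in pixels
    children: list = field(default_factory=list)


# Assumed role vocabulary, for illustration only.
ROLES = ["heading", "paragraph", "link", "image", "generic"]


def node_features(node, screen_w, screen_h):
    """Return a small feature vector: position, text, role, and structure."""
    left, top, right, bottom = node.bounds
    # Element position features, normalized by screen size.
    position = [
        left / screen_w,
        top / screen_h,
        (right - left) / screen_w,
        (bottom - top) / screen_h,
    ]
    # Leaf-node text features: simple counts rather than vocabulary lookups.
    if node.children:
        text = [0.0, 0.0]
    else:
        text = [float(len(node.text.split())), float(len(node.text))]
    # Role as a one-hot vector, plus one intermediate-node feature (child count).
    role = [1.0 if node.role == r else 0.0 for r in ROLES]
    structure = [float(len(node.children))]
    return position + text + role + structure


paragraph = Node(role="paragraph", text="Hello world", bounds=(0, 100, 540, 200))
vector = node_features(paragraph, screen_w=1080, screen_h=1920)
```

    Features of this kind are cheap to compute and avoid per-language vocabularies, which is in the spirit of the restricted feature set described above.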

    Results

    We trained the GNN for about 50 epochs on a single GPU. The performance of the Android model on webpages and native application test sets is presented below:

    The table presents the content distillation metrics in Android for webpages and native apps. We report precision, recall and F1-score for three classes: non-essential content, headline, and main body text, including macro average and weighted average by number of instances in each class. Node metrics assess the classification performance at the granularity of the accessibility tree node, which is analogous to a paragraph level. In contrast, word metrics evaluate classification at an individual word level, meaning each word within a node gets the same classification.
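
    The difference between node metrics and word metrics can be sketched as follows. The labels, predictions, and word counts are invented toy values, and the helper names are hypothetical. The only assumption about the evaluation is the one stated above: every word in a node inherits the node's classification, which is equivalent to weighting each node by its word count.

```python
from collections import Counter


def per_class_prf(y_true, y_pred, weights=None):
    """Precision, recall, F1, and support per class, with optional per-node weights."""
    weights = weights or [1] * len(y_true)
    tp, fp, fn, support = Counter(), Counter(), Counter(), Counter()
    for t, p, w in zip(y_true, y_pred, weights):
        support[t] += w
        if t == p:
            tp[t] += w
        else:
            fp[p] += w
            fn[t] += w
    report = {}
    for c in sorted(set(y_true) | set(y_pred)):
        prec = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        report[c] = {"precision": prec, "recall": rec, "f1": f1, "support": support[c]}
    return report


def macro_and_weighted_f1(report):
    """Macro average treats classes equally; weighted average scales by support."""
    macro = sum(r["f1"] for r in report.values()) / len(report)
    total = sum(r["support"] for r in report.values())
    weighted = sum(r["f1"] * r["support"] for r in report.values()) / total
    return macro, weighted


# Toy example: four nodes with made-up labels and word counts.
y_true = ["headline", "main", "main", "non-essential"]
y_pred = ["headline", "main", "non-essential", "non-essential"]
word_counts = [5, 40, 10, 5]

node_report = per_class_prf(y_true, y_pred)               # one vote per node
word_report = per_class_prf(y_true, y_pred, word_counts)  # one vote per word
node_macro, node_weighted = macro_and_weighted_f1(node_report)
```

    In this toy case the misclassified node is a short one, so main-text recall is 0.5 at the node level but 0.8 at the word level: word metrics penalize an error in proportion to how much text it affects.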

    In assessing the quality of the results on commonly visited webpage articles, an F1-score exceeding 0.9 for main text (essentially paragraphs) corresponds to 88% of these articles being processed without missing any paragraphs. Furthermore, in over 95% of cases, the distillation proves to be valuable for readers. Put simply, the vast majority of readers will perceive the distilled content as both pertinent and precise, with errors or omissions being an infrequent occurrence.

    The comparison of Chrome content distillation with other models such as DOM Distiller or Mozilla Readability on a set of English language pages is presented in the table below. We reuse the metrics from machine translation to compare the quality of these models. The reference text is the ground-truth main content, and the text produced by each model serves as the hypothesis text. The results show the excellent performance of our models in comparison to other DOM-based approaches.

    The table presents the comparison between DOM Distiller, Mozilla Readability and the new Chrome model. We report text-based metrics, such as BLEU, CHRF and ROUGE, by comparing the main body text distilled from each model to a ground-truth text manually labeled by raters using our annotation policy.
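
    As a rough illustration of how a reference-versus-hypothesis text metric behaves for distillation, here is a simplified unigram-overlap score in the style of ROUGE-1. It is a sketch, not the official ROUGE implementation or the evaluation code used for the table: tokenization is plain lowercase whitespace splitting, and the example strings are invented.

```python
from collections import Counter


def rouge1(reference, hypothesis):
    """Unigram-overlap precision, recall, and F1 between two texts."""
    ref = Counter(reference.lower().split())
    hyp = Counter(hypothesis.lower().split())
    # Multiset intersection: each word counts at most as often as in both texts.
    overlap = sum((ref & hyp).values())
    recall = overlap / sum(ref.values()) if ref else 0.0
    precision = overlap / sum(hyp.values()) if hyp else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Ground-truth main text versus a distillation that also kept some page clutter.
reference = "the model distills the main text"
hypothesis = "menu home the model distills the main text share"
scores = rouge1(reference, hypothesis)
```

    Here the hypothesis contains every reference word, so recall is perfect, but the three leftover clutter words lower precision. A distiller that drops paragraphs would show the opposite pattern.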

    The F1-score of the Chrome content distillation model for headline and main text content on the test sets of different widely spoken languages demonstrates that the Chrome model, in particular, is able to support a wide range of languages.

    The table presents per-language F1-scores of the Chrome model for the headline and main text classes. The language codes correspond to the following languages: German, English, Spanish, French, Italian, Persian, Japanese, Korean, Portuguese, Vietnamese, simplified Chinese and traditional Chinese.

    Conclusion

    The digital age demands both streamlined content presentation and an unwavering commitment to user privacy. Our research highlights the effectiveness of Reading Mode in platforms like Android and Chrome, offering an innovative, data-driven approach to content parsing through graph neural networks. Crucially, our lightweight on-device model ensures that content distillation occurs without compromising user data, with all processes executed locally. This not only enhances the reading experience but also reinforces our dedication to user privacy. As we navigate the evolving landscape of digital content consumption, our findings underscore the paramount importance of prioritizing the user in both experience and security.

    Acknowledgements

    This project is the result of joint work with Manuel Tragut, Mihai Popa, Abodunrinwa Toki, Abhanshu Sharma, Matt Sharifi, David Petrou and Blaise Aguera y Arcas. We sincerely thank our collaborators Gang Li and Yang Li. We are very grateful to Tom Small for assisting us in preparing the post.
