    Ztoog
    Meta AI Releases Nougat: A Visual Transformer Model that Performs OCR for Processing Scientific Documents into a Markup Language


    With rapid advances in Artificial Intelligence, its sub-fields, including Natural Language Processing, Natural Language Generation, and Computer Vision, have quickly gained popularity because of their extensive use cases. Optical Character Recognition (OCR) is a well-established and heavily investigated area of computer vision. It has a variety of uses, such as document digitization, handwriting recognition, and scene text identification. The recognition of mathematical expressions is one area of OCR that has received a lot of interest in academic research.

    The Portable Document Format (PDF) is one of the most widely used formats for scientific knowledge, which is often preserved in books or published in scholarly journals. PDFs are the second most used data format on the internet, accounting for 2.4% of the data, and are frequently used for document delivery. Despite their widespread use, extracting information from PDF files can be difficult, particularly when dealing with highly specialized materials like scientific research articles. In particular, when these papers are converted to PDF format, the semantic information of mathematical expressions is frequently lost.
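To make the loss of semantic information concrete, here is a minimal sketch. The function `flatten_markup` is a hypothetical stand-in for text extraction that drops layout; it is not part of Nougat or of any PDF library. It shows how two different expressions can collapse into the same plain string once superscript and subscript structure is discarded.

```python
import re


def flatten_markup(expr: str) -> str:
    """Crudely mimic layout-blind text extraction (an illustrative assumption).

    Removes the LaTeX characters that encode superscripts, subscripts,
    and grouping, leaving only the visible symbols.
    """
    return re.sub(r"[\^_{}]", "", expr)


squared = "x^{2}"     # x to the power of two
indexed = "x_{2}"     # the second element of x

# Both expressions flatten to the same text, so their meaning is lost.
print(flatten_markup(squared))
print(flatten_markup(indexed))
print(flatten_markup(squared) == flatten_markup(indexed))
```

A markup representation keeps `x^{2}` and `x_{2}` distinct, which is the kind of information a page-to-markup model aims to recover.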

    To address these challenges, a team of researchers from Meta AI has introduced a solution called Nougat, which stands for “Neural Optical Understanding for Academic Documents.” Nougat is a Visual Transformer model that performs Optical Character Recognition (OCR) on scientific documents. Its goal is to transform these files into a markup language so that they are more easily accessible and machine-readable.

    To show the efficacy of the method, the team has also produced a new dataset of academic papers. This approach offers a viable way to improve the accessibility of scientific knowledge in the digital age. It bridges the gap between written materials that are easy for people to read and text that computers can process and analyze. Researchers, educators, and anyone interested in scientific literature can access and work with scientific papers more effectively using Nougat. Nougat is essentially a transformer-based model designed to convert images of document pages, particularly those from PDFs, into formatted markup text.

    The team has summarized their key contributions as follows:

    1. Publication of a Pre-trained Model: The team has created a pre-trained model that can transform PDFs into a simple markup language. This pre-trained model is publicly available on GitHub, where the research community and anyone else can access it, along with the associated code.
    2. Pipeline for Dataset Creation: The study describes a method for building datasets that pair PDF documents with their associated source code. This dataset creation method is essential for testing and refining the Nougat model and may be useful for future document analysis research and applications.
    3. Dependency on the Page’s Image Only: One of Nougat’s standout features is its ability to operate solely on the image of a page. This makes it a versatile tool for extracting content from multiple sources, even when the original documents are not available in digital text formats. It can process scanned papers and books.
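The dataset-creation idea described above, pairing each page of a PDF with the matching piece of its source, can be sketched roughly as follows. This is an illustrative assumption, not the team's actual pipeline: the names `PageSample` and `build_samples` are hypothetical, and the sketch assumes the source markup has already been split into one string per page.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class PageSample:
    """One training pair: a rendered page image and its target markup."""
    image_path: str
    markup: str


def build_samples(image_paths: Sequence[str],
                  page_markups: Sequence[str]) -> List[PageSample]:
    """Pair page images with per-page markup, in page order.

    Raises ValueError if the two sequences differ in length, since a
    misaligned pair would teach the model the wrong target text.
    """
    if len(image_paths) != len(page_markups):
        raise ValueError("each page image needs exactly one markup string")
    return [PageSample(path, markup)
            for path, markup in zip(image_paths, page_markups)]


# Hypothetical file names and markup, for illustration only.
samples = build_samples(
    ["paper_p1.png", "paper_p2.png"],
    ["# Introduction\nOCR is ...", "The loss is $L = x^{2}$."],
)
print(len(samples))
print(samples[1].markup)
```

Because the input side of each pair is only an image, the same structure would apply to scanned pages for which no digital text exists.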

    Check out the Paper and GitHub. All credit for this research goes to the researchers on this project.



    Tanya Malhotra is a final-year undergraduate at the University of Petroleum & Energy Studies, Dehradun, pursuing a BTech in Computer Science Engineering with a specialization in Artificial Intelligence and Machine Learning.
    She is a Data Science enthusiast with good analytical and critical thinking skills, along with an ardent interest in acquiring new skills, leading groups, and managing work in an organized manner.


