Ztoog
    This AI Paper Proposes A Zero-Shot Personalized Lip2Speech Synthesis Method: A Synthetic Speech Model To Match Lip Movements

    A group of researchers from the University of Science and Technology of China has developed a novel machine-learning model for lip-to-speech (Lip2Speech) synthesis. The model is capable of producing personalized synthesized speech in zero-shot conditions, meaning it can make predictions for data classes that it did not encounter during training. The researchers built their approach on a variational autoencoder, a generative model based on neural networks that encode and decode data.
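As a rough illustration of what a variational autoencoder adds over a plain autoencoder, the sketch below shows the two pieces usually found in one: sampling a latent vector with the reparameterization trick, and the KL penalty that pulls the encoder's Gaussian toward a standard normal prior. This is a generic textbook formulation, not the researchers' code; the function names and the example values of `mu` and `log_var` are assumptions made for illustration.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), one value per dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(sigma^2)) || N(0, I) ), summed over latent dimensions."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, log_var))


# Hypothetical encoder output for a 3-dimensional latent space.
rng = random.Random(0)
mu = [0.5, -1.0, 0.0]
log_var = [0.0, -2.0, 0.0]

z = reparameterize(mu, log_var, rng)      # latent sample passed to the decoder
kl = kl_to_standard_normal(mu, log_var)   # regularization term of the VAE loss
```

In a full model, `mu` and `log_var` would come from an encoder network and the KL term would be added to a reconstruction loss on the decoded speech features.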

    Lip2Speech synthesis involves predicting spoken words from the movements of a person's lips, and it has various real-world applications. For example, it can help patients who cannot produce speech sounds communicate with others, add sound to silent movies, restore speech in noisy or damaged videos, and even recover conversations from soundless CCTV footage. While some machine learning models have shown promise in Lip2Speech applications, they often struggle with real-time performance and are not trained using zero-shot learning approaches.

    Typically, to achieve zero-shot Lip2Speech synthesis, machine learning models require reliable video recordings of speakers, with audio, from which to extract additional information about their speech patterns. However, in cases where only silent or unintelligible videos of a speaker's face are available, this information cannot be accessed. The researchers' model aims to address this limitation by generating speech that matches the appearance and identity of a given speaker without relying on recordings of their actual speech.

    The team proposed a zero-shot personalized Lip2Speech synthesis method that uses face images to control speaker identities. They employed a variational autoencoder to disentangle speaker identity and linguistic content representations, allowing speaker embeddings to control the voice characteristics of synthetic speech for unseen speakers. Additionally, they introduced associated cross-modal representation learning to improve the ability of face-based speaker embeddings (FSE) to control the voice.
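The article does not say which objective the researchers used for cross-modal representation learning. One common way to tie embeddings from two modalities together is a contrastive loss that rewards each face embedding for being closest to the voice embedding of the same speaker. The sketch below shows that idea in plain Python under that assumption; `contrastive_loss`, the `temperature` value, and the toy 2-dimensional embeddings are all hypothetical and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_loss(face_embs, voice_embs, temperature=0.1):
    """Mean cross-entropy of matching face i to voice i among all voices."""
    total = 0.0
    for i, face in enumerate(face_embs):
        logits = [cosine(face, voice) / temperature for voice in voice_embs]
        # Numerically stable log-sum-exp over the candidate voices.
        peak = max(logits)
        log_sum = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_sum - logits[i]
    return total / len(face_embs)


# Toy embeddings for two speakers (row i of each list is the same speaker).
faces = [[1.0, 0.0], [0.0, 1.0]]
voices_aligned = [[0.9, 0.1], [0.1, 0.9]]
voices_swapped = [[0.1, 0.9], [0.9, 0.1]]

aligned = contrastive_loss(faces, voices_aligned)
swapped = contrastive_loss(faces, voices_swapped)
```

The loss is lower when each face embedding points in the same direction as its own speaker's voice embedding, which is the property a face-based speaker embedding needs if it is to stand in for reference audio at inference time.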

    To evaluate the performance of their model, the researchers conducted a series of tests. The model generated synthesized speech that accurately matched a speaker's lip movements as well as their age, gender, and overall appearance. The potential applications of this model are extensive, ranging from assistive tools for people with speech impairments to video editing software and support for police investigations. The researchers demonstrated the effectiveness of their proposed method through extensive experiments, showing that the synthetic utterances were more natural and better matched the personality of the input video than those produced by other methods. Importantly, this work represents the first attempt at zero-shot personalized Lip2Speech synthesis using a face image rather than reference audio to control voice characteristics.

    In conclusion, the researchers have developed a machine-learning model for Lip2Speech synthesis that performs well in zero-shot conditions. By leveraging a variational autoencoder and face images, the model can generate personalized synthesized speech that aligns with a speaker's appearance and identity. The successful performance of this model opens up possibilities for various practical applications, such as assisting people with speech impairments, enhancing video editing tools, and aiding police investigations.


    Niharika is a technical consulting intern at Marktechpost. She is a third-year undergraduate, currently pursuing her B.Tech at the Indian Institute of Technology (IIT), Kharagpur. She is a highly enthusiastic individual with a keen interest in machine learning, data science, and AI, and an avid reader of the latest developments in these fields.

