    This AI Paper from China Introduces StreamVoice: A Novel Language Model-Based Zero-Shot Voice Conversion System Designed for Streaming Scenarios


    Recent advances in language models showcase impressive zero-shot voice conversion (VC) capabilities. However, prevailing VC models based on language models usually perform offline conversion from source semantics to acoustic features, which requires the entire source speech and limits their application in real-time scenarios.

    In this research, a team from Northwestern Polytechnical University, China, and ByteDance introduces StreamVoice. StreamVoice is a novel streaming language model (LM)-based method for zero-shot voice conversion (VC) that allows real-time conversion with any speaker prompt and source speech. StreamVoice achieves streaming capability by using a fully causal, context-aware LM with a temporal-independent acoustic predictor.

    This model alternately processes semantic and acoustic features at each autoregression time step, eliminating the need for the complete source speech. To mitigate potential performance degradation in streaming processing due to incomplete context, two strategies are employed:

    1) Teacher-guided context foresight, where a teacher model summarises the present and future semantic context during training to guide the model's forecasting of the missing context.

    2) A semantic masking strategy, which promotes acoustic prediction from preceding corrupted semantic and acoustic input to enhance context-learning ability.

    Notably, StreamVoice stands out as the first LM-based streaming zero-shot VC model without any future look-ahead. Experimental results showcase StreamVoice's streaming conversion capability while maintaining zero-shot performance comparable to non-streaming VC systems.
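The alternating processing and semantic masking described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the authors' implementation: the function names, the `MASK` id, the one-acoustic-token-per-step simplification, and the masking ratio are all hypothetical, and the real system uses a neural LM rather than the stand-in predictor shown here.

```python
import random
from typing import Callable, Iterable, Iterator, List, Sequence, Tuple

Token = Tuple[str, int]  # ("sem" | "aco", token id); a hypothetical representation
MASK = -1                # hypothetical id standing in for a corrupted semantic token


def interleave(semantic: Sequence[int], acoustic: Sequence[int]) -> List[Token]:
    """Build the alternating sequence s1, a1, s2, a2, ... for training."""
    if len(semantic) != len(acoustic):
        raise ValueError("expected one acoustic token per semantic token")
    out: List[Token] = []
    for s, a in zip(semantic, acoustic):
        out.append(("sem", s))
        out.append(("aco", a))
    return out


def mask_semantic(seq: Sequence[Token], ratio: float, rng: random.Random) -> List[Token]:
    """Corrupt a random subset of semantic tokens; acoustic tokens are untouched."""
    return [
        (kind, MASK) if kind == "sem" and rng.random() < ratio else (kind, tok)
        for kind, tok in seq
    ]


def stream_convert(
    semantic_stream: Iterable[int],
    predict_acoustic: Callable[[List[Token]], int],
) -> Iterator[int]:
    """Causal loop: each step sees only the history plus the current semantic token."""
    history: List[Token] = []
    for s in semantic_stream:
        history.append(("sem", s))
        a = predict_acoustic(history)  # no future frames are consulted
        history.append(("aco", a))
        yield a


if __name__ == "__main__":
    # Stand-in predictor (assumption): derives the output from the latest semantic token.
    toy_predictor = lambda history: history[-1][1] + 100
    print(list(stream_convert([1, 2, 3], toy_predictor)))
    print(mask_semantic(interleave([1, 2], [10, 20]), 0.5, random.Random(0)))
```

Because `stream_convert` is a generator, each acoustic token is emitted as soon as its semantic token arrives, which is the property that removes the need for the complete source utterance.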

    The figure in the paper demonstrates the concept of streaming zero-shot VC using the widely used recognition-synthesis framework, the popular paradigm on which StreamVoice is built. The experiments show that StreamVoice can convert speech in a streaming fashion, achieving high speaker similarity for both seen and unseen speakers while maintaining performance comparable to non-streaming VC systems. As the first LM-based zero-shot VC model without any future look-ahead, StreamVoice's entire pipeline incurs only 124 ms of latency for the conversion process and runs 2.4 times faster than real-time on a single A100 GPU, even without engineering optimizations.

    In future work, the team plans to use more training data to improve StreamVoice's modeling ability. They also plan to optimize the streaming pipeline by incorporating a high-fidelity, low-bitrate codec and a unified streaming model.
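To make the reported speed concrete, the short calculation below converts "2.4 times faster than real-time" into a real-time factor (RTF). It assumes the common reading that a speedup of k means one second of audio takes 1/k seconds to process; the function names are illustrative, and the 124 ms latency is a separate end-to-end figure that this calculation does not derive.

```python
def real_time_factor(speedup: float) -> float:
    """RTF = processing time / audio duration; values below 1 keep up with live input."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return 1.0 / speedup


def compute_time_ms(audio_ms: float, speedup: float) -> float:
    """Compute time needed for a stretch of audio at the given speedup."""
    return audio_ms * real_time_factor(speedup)


if __name__ == "__main__":
    rtf = real_time_factor(2.4)
    print(f"RTF: {rtf:.3f}")  # about 0.417
    print(f"Compute per 1 s of audio: {compute_time_ms(1000, 2.4):.1f} ms")
    print("Keeps up with real time:", rtf < 1.0)
```

An RTF below 1 is what allows a streaming system to process incoming audio without falling behind the speaker.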


    Check out the Paper. All credit for this research goes to the researchers of this project.



    Janhavi Lande is an Engineering Physics graduate from IIT Guwahati, class of 2023. She is an upcoming data scientist and has been working in the world of ML/AI research for the past two years. She is most fascinated by this ever-changing world and its constant demand that humans keep up with it. In her spare time she enjoys traveling, reading, and writing poems.


