    CMU Researchers Introduce OWSM v3.1: A Better and Faster Open Whisper-Style Speech Model Based on E-Branchformer


    Speech recognition technology has become a cornerstone of many applications, enabling machines to understand and process human speech. The field continually seeks advances in algorithms and models to improve accuracy and efficiency in recognizing speech across multiple languages and contexts. The main challenge is developing models that accurately transcribe speech from a variety of languages and dialects. Models often struggle with the variability of speech, including accents, intonation, and background noise, which creates demand for more robust and versatile solutions.

    Researchers have been exploring various methods to improve speech recognition systems. Existing solutions have often relied on complex architectures such as Transformers, which, despite their effectiveness, face limitations, particularly in processing speed and in accurately recognizing and interpreting a wide range of speech variation, including dialects, accents, and differences in speech patterns.

    A research team from Carnegie Mellon University and Honda Research Institute Japan introduced a new model, OWSM v3.1, which uses the E-Branchformer architecture to address these challenges. OWSM v3.1 is an improved and faster Open Whisper-style Speech Model that achieves better results than the previous OWSM v3 in most evaluation conditions.

    The previous OWSM v3 and Whisper both use the standard Transformer encoder-decoder architecture. However, recent advances in speech encoders such as Conformer and Branchformer have improved performance in speech processing tasks. E-Branchformer is therefore employed as the encoder in OWSM v3.1, demonstrating its effectiveness at a scale of 1B parameters. OWSM v3.1 also excludes the WSJ training data used in OWSM v3, which had fully uppercased transcripts. This exclusion leads to a significantly lower Word Error Rate (WER) in OWSM v3.1. The model also demonstrates up to 25% faster inference speed.
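For readers unfamiliar with the metric, the following is a minimal Python sketch of how WER is commonly computed: the word-level edit distance between a reference transcript and a model output, divided by the number of reference words. It also illustrates one plausible way that uppercased text can inflate WER under a case-sensitive scorer. That mechanism is an assumption made for illustration, not something the article states, and the function names and example sentences are hypothetical rather than taken from the OWSM evaluation code.

```python
def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis, lowercase=False):
    """WER = word-level edit distance / number of reference words."""
    if lowercase:
        reference, hypothesis = reference.lower(), hypothesis.lower()
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref, hyp) / len(ref)


# Hypothetical example: a model that emits uppercase text, scored
# against a normally cased reference.
reference = "the market closed higher today"
uppercased_output = "THE MARKET CLOSED HIGHER TODAY"

print(wer(reference, uppercased_output))                  # 1.0
print(wer(reference, uppercased_output, lowercase=True))  # 0.0
```

Under a case-sensitive comparison every word counts as a substitution, while normalizing case before scoring removes the penalty entirely. Real evaluation pipelines typically apply more text normalization than this sketch does.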

    OWSM v3.1 outperformed its predecessor, OWSM v3, on most evaluation benchmarks, achieving higher accuracy in speech recognition tasks across multiple languages. Compared with OWSM v3, OWSM v3.1 shows improvements in English-to-X translation in 9 out of 15 directions. Although some directions degrade slightly, the average BLEU score improves slightly, from 13.0 to 13.3 (a gain of 0.3 BLEU).

    In conclusion, the research makes significant strides toward improving speech recognition technology. By using the E-Branchformer architecture, OWSM v3.1 improves on previous models in accuracy and efficiency and sets a new standard for open-source speech recognition. By releasing the model and training details publicly, the researchers demonstrate a commitment to transparency and open science that further enriches the field and paves the way for future developments.


    Check out the Paper and Demo. All credit for this research goes to the researchers of this project.



    Nikhil is an intern consultant at Marktechpost. He is pursuing an integrated dual degree in Materials at the Indian Institute of Technology, Kharagpur. Nikhil is an AI/ML enthusiast who is always researching applications in fields like biomaterials and biomedical science. With a strong background in Material Science, he is exploring new advancements and creating opportunities to contribute.

