    EfficientViT-SAM: A New Family of Accelerated Segment Anything Models

    The landscape of image segmentation has been profoundly transformed by the introduction of the Segment Anything Model (SAM), a model known for its remarkable zero-shot segmentation capability. SAM's deployment across a wide array of applications, from augmented reality to data annotation, underscores its utility. However, SAM's computational intensity, notably its image encoder's demand of 2973 GMACs per image at inference, has limited its use in time-sensitive scenarios.

    The quest to improve SAM's efficiency without sacrificing its accuracy has led to models such as MobileSAM, EdgeSAM, and EfficientSAM. These models reduce computational cost but suffer drops in performance, as depicted in Figure 1. To address this, EfficientViT-SAM uses the EfficientViT architecture to replace SAM's image encoder. This adaptation retains SAM's lightweight prompt encoder and mask decoder, yielding two variants: EfficientViT-SAM-L and EfficientViT-SAM-XL. These models offer a trade-off between speed and segmentation accuracy and are trained end-to-end on the SA-1B dataset.

    EfficientViT, a vision transformer model optimized for high-resolution dense prediction tasks, stands at the core of this work. Its multi-scale linear attention module replaces conventional softmax attention with ReLU linear attention, reducing computational complexity from quadratic to linear in the number of tokens. This efficiency is achieved without compromising the model's ability to capture global context and learn multi-scale features, as detailed in the original EfficientViT publication.
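    The quadratic-to-linear reduction can be illustrated with a minimal sketch. It assumes a single attention head, plain Python lists in place of tensors, and a small `eps` added for numerical safety; the function names are hypothetical and this is not the authors' implementation. Because ReLU similarity has no softmax, the sum over keys can be computed once and reused for every query, which gives the same output as the pairwise form.

```python
def relu(x):
    return x if x > 0.0 else 0.0


def relu_linear_attention(Q, K, V, eps=1e-6):
    """Linear-cost form: aggregate relu(K)^T V once, then reuse it per query."""
    d_k, d_v = len(K[0]), len(V[0])
    kv = [[0.0] * d_v for _ in range(d_k)]  # sum_j relu(K_j)^T V_j, shape (d_k, d_v)
    k_sum = [0.0] * d_k                     # sum_j relu(K_j), shape (d_k,)
    for k_row, v_row in zip(K, V):
        for a in range(d_k):
            ka = relu(k_row[a])
            k_sum[a] += ka
            for b in range(d_v):
                kv[a][b] += ka * v_row[b]
    out = []
    for q_row in Q:
        q = [relu(x) for x in q_row]
        denom = sum(q[a] * k_sum[a] for a in range(d_k)) + eps
        out.append([sum(q[a] * kv[a][b] for a in range(d_k)) / denom
                    for b in range(d_v)])
    return out


def relu_attention_quadratic(Q, K, V, eps=1e-6):
    """Reference form: score every query against every key (N x N work)."""
    d_v = len(V[0])
    out = []
    for q_row in Q:
        q = [relu(x) for x in q_row]
        weights = [sum(qa * relu(ka) for qa, ka in zip(q, k_row)) for k_row in K]
        denom = sum(weights) + eps
        out.append([sum(w * v_row[b] for w, v_row in zip(weights, V)) / denom
                    for b in range(d_v)])
    return out


# Toy inputs: 3 tokens, key/query dimension 2, value dimension 2.
Q = [[1.0, -2.0], [0.5, 3.0], [2.0, 1.0]]
K = [[1.0, 0.0], [-1.0, 2.0], [0.5, 0.5]]
V = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
linear_out = relu_linear_attention(Q, K, V)
quadratic_out = relu_attention_quadratic(Q, K, V)
```

    Both functions return the same values up to floating-point error; only the linear form avoids building per-pair scores, so its cost grows with the number of tokens rather than with its square.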

    The architecture of EfficientViT-SAM, specifically the EfficientViT-SAM-XL variant, is organized into five stages. Early stages use convolution blocks, while the later stages use EfficientViT modules, culminating in a feature fusion step that feeds into the SAM head, as illustrated in Figure 2. This design fuses multi-scale features to enhance the model's segmentation capability.

    Training proceeds in two phases. First, SAM-ViT-H's image embeddings are distilled into EfficientViT. The model is then trained end-to-end on the SA-1B dataset. This phase uses a mix of box and point prompts and a combination of focal and dice loss to fine-tune the model's performance. The choice of prompts and loss function helps EfficientViT-SAM adapt to a variety of segmentation scenarios.
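    The two training objectives can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' code: the article does not specify the distillation objective, so mean squared error between embeddings is assumed here, and the focal parameters (`gamma`, `alpha`), the dice smoothing term, and the 20:1 focal-to-dice weighting are illustrative values chosen for the example. Masks are flattened to plain lists of per-pixel probabilities and binary targets.

```python
import math


def embedding_distillation_loss(student, teacher):
    """Assumed phase-1 objective: mean squared error between flat embeddings."""
    return sum((s - t) ** 2 for s, t in zip(student, teacher)) / len(student)


def focal_loss(probs, targets, gamma=2.0, alpha=0.25, eps=1e-7):
    """Binary focal loss: down-weights pixels that are already well classified."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)          # clamp to keep log() finite
        p_t = p if t == 1 else 1.0 - p           # probability of the true class
        alpha_t = alpha if t == 1 else 1.0 - alpha
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(probs)


def dice_loss(probs, targets, smooth=1.0):
    """Soft dice loss: one minus the overlap ratio of prediction and target."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    denominator = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + smooth) / (denominator + smooth)


def mask_loss(probs, targets, focal_weight=20.0, dice_weight=1.0):
    """Assumed phase-2 objective: weighted sum of focal and dice terms."""
    return (focal_weight * focal_loss(probs, targets)
            + dice_weight * dice_loss(probs, targets))


# Toy 4-pixel mask: the first two pixels belong to the object.
targets = [1, 1, 0, 0]
good_prediction = [0.9, 0.8, 0.1, 0.2]
bad_prediction = [0.4, 0.3, 0.6, 0.7]
good_loss = mask_loss(good_prediction, targets)
bad_loss = mask_loss(bad_prediction, targets)
```

    The focal term concentrates the gradient on hard pixels, while the dice term scores region overlap directly, which is why the two are often paired for mask prediction.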

    EfficientViT-SAM's advantage is not merely theoretical; its empirical performance, particularly in runtime efficiency and zero-shot segmentation, is compelling. The model runs 17 to 69 times faster than SAM and holds a significant throughput advantage over other acceleration efforts despite having more parameters, as shown in Table 1.

    Zero-shot segmentation capability is evaluated on the COCO and LVIS datasets, using both single-point and box-prompted instance segmentation. The results, detailed in Tables 2 and 4, show superior segmentation accuracy, particularly when using additional point prompts or ground-truth bounding boxes.
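    The article does not name the accuracy metric, so the following is only a sketch of one common choice for prompted segmentation: intersection over union (IoU) between a predicted mask and a ground-truth mask, averaged over instances. The function names and the toy masks are hypothetical.

```python
def mask_iou(pred, gt):
    """IoU between two binary masks given as equally sized lists of rows."""
    intersection = 0
    union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            if p and g:
                intersection += 1
            if p or g:
                union += 1
    # Two empty masks are treated as a perfect match.
    return intersection / union if union else 1.0


def mean_iou(pairs):
    """Average IoU over (predicted mask, ground-truth mask) pairs."""
    return sum(mask_iou(pred, gt) for pred, gt in pairs) / len(pairs)


# Toy 2x2 masks: the prediction covers one extra pixel.
predicted = [[1, 1], [0, 0]]
ground_truth = [[1, 0], [0, 0]]
example_iou = mask_iou(predicted, ground_truth)
example_mean = mean_iou([(predicted, ground_truth), (ground_truth, ground_truth)])
```

    Here the single-pair IoU is 1/2 (one shared pixel out of two covered), and averaging it with a perfect match gives 0.75.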

    Moreover, the Segmentation in the Wild benchmark further validates EfficientViT-SAM's robustness in zero-shot segmentation across diverse datasets, with results summarized in Table 3. The qualitative results in Figure 3 highlight its ability to segment objects of various sizes.

    In conclusion, EfficientViT-SAM brings the speed of EfficientViT to the SAM architecture, resulting in a substantial efficiency gain without sacrificing performance. This opens up wider applications of powerful segmentation models, even in resource-constrained scenarios. To encourage further research and development, pre-trained EfficientViT-SAM models have been open-sourced.


    Check out the Paper. All credit for this research goes to the researchers of this project.

    Vineet Kumar is a consulting intern at MarktechPost. He is currently pursuing his BS at the Indian Institute of Technology (IIT), Kanpur. He is a machine learning enthusiast, passionate about research and the latest developments in deep learning, computer vision, and related fields.

