    This AI Paper Proposes A Latent Diffusion Model For 3D (LDM3D) That Generates Both Image And Depth Map Data From A Given Text Prompt


    In generative AI, computer vision has made great strides in recent years. Stable Diffusion has transformed content creation in image generation by providing open-source software that produces arbitrary high-fidelity RGB images from text prompts. This research proposes a Latent Diffusion Model for 3D (LDM3D) built upon Stable Diffusion v1.4. Unlike the earlier model, LDM3D generates both image data and a depth map from a given text prompt, as Figure 1 illustrates. Users can create full RGBD representations of text prompts and bring them to life in vivid, immersive 360° views. The LDM3D model was fine-tuned on a dataset of around 4 million tuples, each containing an RGB image, a depth map, and a caption.

    This dataset was built from a subset of LAION-400M, a large image-caption dataset with more than 400 million image-caption pairs. The depth maps used for fine-tuning were produced by the DPT-Large depth estimation model, which provides highly accurate relative depth estimates for every pixel in an image. Accurate depth maps were essential for creating 360° views that are realistic and immersive and that let users experience their text prompts in great detail. Building on LDM3D, researchers from Intel Labs and Blockade Labs developed DepthFusion, an application that uses the generated 2D RGB images and depth maps to compute a 360° projection in TouchDesigner, demonstrating the possibilities of LDM3D.
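As a rough illustration of how a relative depth estimate could be stored as a 16-bit grayscale map, the sketch below min-max normalizes a small 2D grid of depth values to the integer range 0 to 65535. The function name `to_uint16_depth` and the min-max scheme are assumptions made for this example; the article does not say how the authors scaled the DPT-Large output.

```python
def to_uint16_depth(depth):
    """Min-max normalize a 2D list of relative depth values to ints in [0, 65535].

    Hypothetical helper: the normalization scheme is an assumption, not the
    authors' documented preprocessing.
    """
    flat = [d for row in depth for d in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span == 0:
        # A constant map carries no relative depth information; map it to zeros.
        return [[0 for _ in row] for row in depth]
    return [[round((d - lo) / span * 65535) for d in row] for row in depth]


# Toy 2x2 "relative depth" grid standing in for a model's per-pixel output.
relative = [[2.0, 4.0], [6.0, 10.0]]
depth16 = to_uint16_depth(relative)
```

The smallest input value maps to 0 and the largest to 65535, so the full 16-bit range is used regardless of the scale of the relative estimates.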

    Figure 1: Overview of LDM3D. Training workflow: the 16-bit grayscale depth maps are packed into 3-channel RGB-like depth images, which are then concatenated with the RGB images along the channel dimension. The modified KL-AE maps the concatenated RGBD input to the latent space. Noise is added to the latent representation, which is then iteratively denoised by the U-Net model. A frozen CLIP text encoder encodes the text prompt, which is mapped to different U-Net layers via cross-attention. The KL-decoder receives the denoised output from the latent space and maps it back to pixel space as a 6-channel RGBD output. The result is then split into a 16-bit grayscale depth map and an RGB image. The text-to-image inference pathway is shown in the blue frame.
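The packing, concatenation, and splitting steps in the caption can be sketched in plain Python on a tiny image. The byte layout used here (first depth channel unused, second holding the high byte, third holding the low byte) is an assumption for illustration, since the article does not specify it, and the helper names `pack_depth`, `unpack_depth`, `concat_rgbd`, and `split_rgbd` are hypothetical.

```python
def pack_depth(d16):
    """Split a 16-bit depth value into three 8-bit channels.

    Assumed layout: (unused, high byte, low byte).
    """
    if not 0 <= d16 <= 0xFFFF:
        raise ValueError("depth must fit in 16 bits")
    return (0, d16 >> 8, d16 & 0xFF)


def unpack_depth(channels):
    """Rebuild the 16-bit depth value from the assumed 3-channel layout."""
    _, high, low = channels
    return (high << 8) | low


def concat_rgbd(rgb, depth16):
    """Concatenate RGB pixels and packed depth into 6-channel pixels."""
    return [
        [tuple(px) + pack_depth(d) for px, d in zip(rgb_row, d_row)]
        for rgb_row, d_row in zip(rgb, depth16)
    ]


def split_rgbd(rgbd):
    """Split 6-channel pixels back into an RGB image and a 16-bit depth map."""
    rgb = [[px[:3] for px in row] for row in rgbd]
    depth16 = [[unpack_depth(px[3:]) for px in row] for row in rgbd]
    return rgb, depth16


# A 1x2 toy image: one red pixel at depth 0, one green pixel at maximum depth.
rgb_in = [[(255, 0, 0), (0, 255, 0)]]
depth_in = [[0, 65535]]
rgbd = concat_rgbd(rgb_in, depth_in)
rgb_out, depth_out = split_rgbd(rgbd)
```

Because packing is lossless, splitting the 6-channel pixels recovers the original RGB image and depth map exactly in this toy setting; in the real model the autoencoder and diffusion steps sit between the two operations.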

    DepthFusion has the potential to completely change how people interact with digital content. TouchDesigner is a versatile framework for creating interactive and immersive multimedia experiences. DepthFusion uses TouchDesigner's creative potential to produce compelling 360° panoramas that vividly depict text prompts. With DepthFusion, users can experience their text prompts in a previously inconceivable way, whether the prompt describes a serene forest, a bustling cityscape, or a sci-fi universe. This technology could transform numerous sectors, including gaming, entertainment, design, and architecture.
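The article does not describe how DepthFusion computes its 360° projection inside TouchDesigner. As a generic geometric illustration only, the sketch below maps a pixel of an equirectangular panorama, together with a depth value treated as radial distance, to a 3D point. The equirectangular layout, the axis convention, the use of depth as a radius, and the function name `equirect_to_xyz` are all assumptions for this example.

```python
import math


def equirect_to_xyz(u, v, width, height, depth):
    """Map pixel (u, v) of an equirectangular image plus a depth to a 3D point.

    Assumptions: u spans longitude, v spans latitude, and depth is the radial
    distance from the viewer. This is not DepthFusion's documented method.
    """
    lon = ((u + 0.5) / width) * 2.0 * math.pi - math.pi   # longitude
    lat = math.pi / 2.0 - ((v + 0.5) / height) * math.pi  # latitude
    x = depth * math.cos(lat) * math.sin(lon)
    y = depth * math.sin(lat)
    z = depth * math.cos(lat) * math.cos(lon)
    return (x, y, z)


# A point at the horizontal and vertical center of the panorama, 2 units away.
center = equirect_to_xyz(1.5, 0.5, width=4, height=2, depth=2.0)
# An arbitrary pixel of a larger panorama.
point = equirect_to_xyz(10, 20, width=64, height=32, depth=3.0)
```

Every output point lies at distance `depth` from the origin, so nearer pixels are pulled toward the viewer and farther ones pushed away, which is what gives a flat panorama its sense of depth when rendered.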


    The authors make three contributions. (1) They propose LDM3D, a novel diffusion model that generates RGBD images (RGB images with corresponding depth maps) from a text prompt. (2) They built DepthFusion, an application that uses the RGBD images produced by LDM3D to provide immersive 360°-view experiences. (3) They evaluate the quality of the generated RGBD images and 360°-view immersive videos through comprehensive studies. In short, the study presents LDM3D, a cutting-edge diffusion model that generates RGBD images from text prompts, and DepthFusion, which uses the generated RGBD images in TouchDesigner to provide immersive, interactive 360°-view experiences that further illustrate the possibilities of LDM3D.

    The findings of this study could fundamentally change how people interact with digital content, transforming everything from entertainment and gaming to architecture and design. The contributions of this work open up new opportunities for multiview generative AI and computer vision research. The authors are interested in seeing how this area develops further and hope the community benefits from the work presented.




    Aneesh Tickoo is a consulting intern at MarktechPost. He is currently pursuing his undergraduate degree in Data Science and Artificial Intelligence at the Indian Institute of Technology (IIT), Bhilai. He spends most of his time working on projects aimed at harnessing the power of machine learning. His research interest is image processing, and he is passionate about building solutions around it. He loves to connect with people and collaborate on interesting projects.

