    Space-time view synthesis from videos of dynamic scenes – Google Research Blog


    Posted by Zhengqi Li and Noah Snavely, Research Scientists, Google Research

    A mobile phone’s camera is a powerful tool for capturing everyday moments. However, capturing a dynamic scene with a single camera is fundamentally limited. For instance, if we wanted to adjust the camera motion or timing of a recorded video (e.g., to freeze time while sweeping the camera around to highlight a dramatic moment), we would typically need an expensive Hollywood setup with a synchronized camera rig. Would it be possible to achieve similar effects solely from a video captured with a mobile phone’s camera, without a Hollywood budget?

    In “DynIBaR: Neural Dynamic Image-Based Rendering”, a best paper honorable mention at CVPR 2023, we describe a new method that generates photorealistic free-viewpoint renderings from a single video of a complex, dynamic scene. Neural Dynamic Image-Based Rendering (DynIBaR) can be used to generate a range of video effects, such as “bullet time” effects (where time is paused and the camera is moved at a normal speed around a scene), video stabilization, depth of field, and slow motion, from a single video taken with a phone’s camera. We demonstrate that DynIBaR significantly advances video rendering of complex moving scenes, opening the door to new kinds of video editing applications. We have also released the code on the DynIBaR project page, so you can try it out yourself.


    Given an in-the-wild video of a complex, dynamic scene, DynIBaR can freeze time while allowing the camera to continue to move freely through the scene.

    Background

    The past few years have seen tremendous progress in computer vision techniques that use neural radiance fields (NeRFs) to reconstruct and render static (non-moving) 3D scenes. However, most of the videos people capture with their mobile devices depict moving objects, such as people, pets, and cars. These moving scenes lead to a much more challenging 4D (3D + time) scene reconstruction problem that cannot be solved using standard view synthesis methods.


    Standard view synthesis methods output blurry, inaccurate renderings when applied to videos of dynamic scenes.

    Other recent methods tackle view synthesis for dynamic scenes using space-time neural radiance fields (i.e., Dynamic NeRFs), but such approaches still exhibit inherent limitations that prevent their application to casually captured, in-the-wild videos. In particular, they struggle to render high-quality novel views from videos featuring long time duration, uncontrolled camera paths, and complex object motion.

    The key pitfall is that these methods store a complicated, moving scene in a single data structure. In particular, they encode scenes in the weights of a multilayer perceptron (MLP) neural network. MLPs can approximate any function, in this case a function that maps a 4D space-time point (x, y, z, t) to an RGB color and density that we can use in rendering images of a scene. However, the capacity of this MLP (defined by the number of parameters in its neural network) must grow with the video length and scene complexity, and thus training such models on in-the-wild videos can be computationally intractable. As a result, we get blurry, inaccurate renderings like those produced by DVS and NSFF (shown below). DynIBaR avoids creating such large scene models by adopting a different rendering paradigm.
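    The space-time function such models learn can be sketched as a toy network. This is a minimal illustration of the mapping (x, y, z, t) → (RGB, density), not the actual DVS or NSFF architecture; layer sizes and weights are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dynamic-NeRF-style MLP: one hidden layer mapping a 4D space-time
# point (x, y, z, t) to an RGB color and a volume density. A real model
# is vastly larger, which is exactly the capacity problem described above.
W1 = rng.normal(0.0, 0.1, (4, 64))
b1 = np.zeros(64)
W2 = rng.normal(0.0, 0.1, (64, 4))  # 3 color channels + 1 density
b2 = np.zeros(4)

def query_radiance(p):
    """p: array-like (x, y, z, t) -> (rgb in (0,1), density >= 0)."""
    h = np.maximum(W1.T @ np.asarray(p, float) + b1, 0.0)  # ReLU layer
    out = W2.T @ h + b2
    rgb = 1.0 / (1.0 + np.exp(-out[:3]))  # sigmoid keeps colors valid
    density = np.log1p(np.exp(out[3]))    # softplus keeps density non-negative
    return rgb, density

rgb, density = query_radiance([0.1, -0.2, 0.5, 0.0])
```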


    DynIBaR (bottom row) significantly improves rendering quality compared to prior dynamic view synthesis methods (top row) for videos of complex dynamic scenes. Prior methods produce blurry renderings because they must store the entire moving scene in an MLP data structure.

    Image-based rendering (IBR)

    A key insight behind DynIBaR is that we don’t actually need to store all of the scene contents in a video in a giant MLP. Instead, we directly use pixel data from nearby input video frames to render new views. DynIBaR builds on an image-based rendering (IBR) method called IBRNet that was designed for view synthesis for static scenes. IBR methods recognize that a new target view of a scene should be very similar to nearby source images, and therefore synthesize the target by dynamically selecting and warping pixels from the nearby source frames, rather than reconstructing the whole scene in advance. IBRNet, in particular, learns to blend nearby images together to recreate new views of a scene within a volumetric rendering framework.
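    The core select-and-warp step can be illustrated with a minimal pinhole projection and a bilinear color lookup. This is a sketch under simplified assumptions (a single identity-pose source camera, a uniform dummy frame); the helper names `project` and `sample_bilinear` are ours, not IBRNet’s API.

```python
import numpy as np

def project(point_world, K, R, t):
    """Project a 3D world point into a pinhole camera.
    K: 3x3 intrinsics; R, t: world-to-camera rotation and translation."""
    p_cam = R @ point_world + t
    u, v, w = K @ p_cam
    return np.array([u / w, v / w])  # continuous pixel coordinates

def sample_bilinear(image, uv):
    """Bilinearly sample an HxWx3 image at continuous pixel coords (u, v)."""
    u, v = uv
    x0, y0 = int(np.floor(u)), int(np.floor(v))
    fx, fy = u - x0, v - y0
    h, w, _ = image.shape
    x0, x1 = np.clip([x0, x0 + 1], 0, w - 1)
    y0, y1 = np.clip([y0, y0 + 1], 0, h - 1)
    top = (1 - fx) * image[y0, x0] + fx * image[y0, x1]
    bot = (1 - fx) * image[y1, x0] + fx * image[y1, x1]
    return (1 - fy) * top + fy * bot

# Warp one sample point on a target ray into a nearby source frame.
K = np.array([[100.0, 0.0, 32.0], [0.0, 100.0, 32.0], [0.0, 0.0, 1.0]])
R, t = np.eye(3), np.zeros(3)        # source camera at the world origin
source = np.full((64, 64, 3), 0.5)   # dummy 64x64 source frame
point = np.array([0.0, 0.0, 2.0])    # sample point along the target ray
color = sample_bilinear(source, project(point, K, R, t))
```

A real IBR system repeats this for many sample points along each target ray, across several source frames, and learns how to blend the sampled colors.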

    DynIBaR: Extending IBR to complex, dynamic videos

    To extend IBR to dynamic scenes, we need to take scene motion into account during rendering. Therefore, as part of reconstructing an input video, we solve for the motion of every 3D point, where we represent scene motion using a motion trajectory field encoded by an MLP. Unlike prior dynamic NeRF methods that store the entire scene appearance and geometry in an MLP, we only store motion, a simpler and sparser signal, and use the input video frames to determine everything else needed to render new views.

    We optimize DynIBaR for a given video by taking each input video frame, rendering rays to form a 2D image using volume rendering (as in NeRF), and comparing that rendered image to the input frame. That is, our optimized representation should be able to perfectly reconstruct the input video.
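    The volume-rendering step is the standard NeRF quadrature: per-sample opacities are composited front-to-back along each ray. A minimal NumPy sketch (variable names are ours):

```python
import numpy as np

def render_ray(rgbs, densities, deltas):
    """NeRF-style volume rendering along one ray.
    rgbs: (N, 3) sample colors; densities: (N,) >= 0;
    deltas: (N,) distances between consecutive samples."""
    alphas = 1.0 - np.exp(-densities * deltas)  # per-sample opacity
    # Transmittance: probability the ray reaches each sample unoccluded.
    trans = np.cumprod(np.concatenate(([1.0], 1.0 - alphas[:-1])))
    weights = alphas * trans                    # compositing weights
    return weights @ rgbs                       # final pixel color

# A transparent sample in front of a nearly opaque red one: the ray
# passes through the first and the rendered pixel comes out red.
rgbs = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
densities = np.array([0.0, 50.0])
deltas = np.array([0.1, 0.1])
pixel = render_ray(rgbs, densities, deltas)
```

During optimization, a photometric loss (e.g., the squared difference between `pixel` and the corresponding input-frame color) is minimized over all rays of all frames.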

    We illustrate how DynIBaR renders images of dynamic scenes. For simplicity, we show a 2D world, as seen from above. (a) A set of input source views (triangular camera frusta) observe a cube moving through the scene (animated square). Each camera is labeled with its timestamp (t-2, t-1, etc.). (b) To render a view from the camera at time t, DynIBaR shoots a virtual ray through each pixel (blue line), and computes colors and opacities for sample points along that ray. To compute these properties, DynIBaR projects these samples into other views via multi-view geometry, but first, we must compensate for the estimated motion of each point (dashed red line). (c) Using this estimated motion, DynIBaR moves each point in 3D to the relevant time before projecting it into the corresponding source camera, to sample colors for use in rendering. DynIBaR optimizes the motion of each scene point as part of learning how to synthesize new views of the scene.
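    Steps (b) and (c) above can be sketched as follows. The constant-velocity `trajectory` function is a hypothetical stand-in for DynIBaR’s learned motion-trajectory MLP, and the identity-pose camera is an assumption made for brevity.

```python
import numpy as np

def trajectory(point, t_target, t_source, velocity):
    """Toy motion-trajectory field: displaces a point from the target
    time to a source time. In DynIBaR this displacement comes from a
    learned MLP; a constant velocity is used here as a stand-in."""
    return point + velocity * (t_source - t_target)

def project(point_world, K):
    """Pinhole projection for a source camera at the origin (identity pose)."""
    u, v, w = K @ point_world
    return np.array([u / w, v / w])

K = np.array([[100.0, 0.0, 32.0], [0.0, 100.0, 32.0], [0.0, 0.0, 1.0]])
velocity = np.array([1.0, 0.0, 0.0])  # this scene point moves along +x

# A sample point on the target ray at time t. To look up its color in the
# source frame captured at time t-1, first move it back along its trajectory,
# then project the displaced point into that source camera.
p_t = np.array([0.0, 0.0, 2.0])
p_src = trajectory(p_t, t_target=0.0, t_source=-1.0, velocity=velocity)
uv = project(p_src, K)  # pixel at which to sample colors in the t-1 frame
```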

    However, reconstructing and synthesizing new views for a complex, moving scene is a highly ill-posed problem, since there are many solutions that can explain the input video; for instance, the model might create disconnected 3D representations for each time step. Therefore, optimizing DynIBaR to reconstruct the input video alone is insufficient. To obtain high-quality results, we also introduce several other techniques, including a method called cross-time rendering. Cross-time rendering refers to the use of the state of our 4D representation at one time instant to render images from a different time instant, which encourages the 4D representation to be coherent over time. To further improve rendering fidelity, we automatically factorize the scene into two components, a static one and a dynamic one, modeled by time-invariant and time-varying scene representations, respectively.
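    One common way to composite such a two-part representation at a sample point, used by earlier dynamic-scene methods such as NSFF (DynIBaR’s exact formulation may differ), is to add the static and dynamic densities and blend the two colors in proportion to density:

```python
import numpy as np

def blend(sigma_s, rgb_s, sigma_d, rgb_d, eps=1e-8):
    """Composite static (sigma_s, rgb_s) and dynamic (sigma_d, rgb_d)
    contributions at one sample point: densities add, and colors are
    weighted by their respective densities."""
    sigma = sigma_s + sigma_d
    rgb = (sigma_s * rgb_s + sigma_d * rgb_d) / max(sigma, eps)
    return sigma, rgb

# Where only the dynamic component has density (a moving object in front
# of empty static space), its color dominates the blended sample.
sigma, rgb = blend(0.0, np.array([1.0, 1.0, 1.0]),
                   5.0, np.array([1.0, 0.0, 0.0]))
```

The blended samples are then fed into the same volume-rendering quadrature used for a single representation.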

    Creating video effects

    DynIBaR enables a variety of video effects. We show several examples below.

    Video stabilization

    We use a shaky, handheld input video to compare DynIBaR’s video stabilization performance to existing 2D video stabilization and dynamic NeRF methods, including FuSta, DIFRINT, HyperNeRF, and NSFF. We demonstrate that DynIBaR produces smoother outputs with higher rendering fidelity and fewer artifacts (e.g., flickering or blurry results). In particular, FuSta yields residual camera shake, DIFRINT produces flicker around object boundaries, and HyperNeRF and NSFF produce blurry results.


    Simultaneous view synthesis and slow motion

    DynIBaR can perform view synthesis in both space and time simultaneously, producing smooth 3D cinematic effects. Below, we demonstrate that DynIBaR can take video inputs and produce smooth 5X slow-motion videos rendered using novel camera paths.


    Video bokeh

    DynIBaR can also generate high-quality video bokeh by synthesizing videos with dynamically changing depth of field. Given an all-in-focus input video, DynIBaR can generate high-quality output videos with varying out-of-focus regions that call attention to moving content (e.g., the running person and dog) and static content (e.g., trees and buildings) in the scene.


    Conclusion

    DynIBaR is a leap forward in our ability to render complex moving scenes from new camera paths. While it currently involves per-video optimization, we envision faster versions that could be deployed on in-the-wild videos to enable new kinds of effects for consumer video editing using mobile devices.

    Acknowledgements

    DynIBaR is the result of a collaboration between researchers at Google Research and Cornell University. The key contributors to the work presented in this post include Zhengqi Li, Qianqian Wang, Forrester Cole, Richard Tucker, and Noah Snavely.
