BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Asia/Tokyo
X-LIC-LOCATION:Asia/Tokyo
BEGIN:STANDARD
TZOFFSETFROM:+0900
TZOFFSETTO:+0900
TZNAME:JST
DTSTART:18871231T000000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20250110T023309Z
LOCATION:Hall B5 (2)\, B Block\, Level 5
DTSTART;TZID=Asia/Tokyo:20241203T132800
DTEND;TZID=Asia/Tokyo:20241203T134200
UID:siggraphasia_SIGGRAPH Asia 2024_sess104_papers_415@linklings.com
SUMMARY:Representing Long Volumetric Video with Temporal Gaussian Hierarchy
DESCRIPTION:Technical Papers\n\nZhen Xu (State Key Laboratory of CAD&CG, Zhejiang University; Zhejiang University); Yinghao Xu (Stanford University); Zhiyuan Yu (Department of Mathematics, Hong Kong University of Science and Technology); Sida Peng and Jiaming Sun (Zhejiang University); and Hujun Bao and Xiaowei Zhou (State Key Laboratory of CAD&CG, Zhejiang University)\n\nThis paper aims to address the challenge of reconstructing long volumetric videos from multi-view RGB videos.\nRecent dynamic view synthesis methods leverage powerful 4D representations, like feature grids or point cloud sequences, to achieve high-quality rendering results. However, they are typically limited to short (1-2 s) video clips and often suffer from large memory footprints when dealing with longer videos.\nTo solve this issue, we propose a novel 4D representation, named temporal Gaussian hierarchy, to compactly model long volumetric videos.\nOur key observation is that there are generally various degrees of temporal redundancy in dynamic scenes, which consist of areas changing at different speeds.\nMotivated by this, our approach builds a multi-level hierarchy of Gaussian primitives, where each level separately describes scene regions with different degrees of content change, and adaptively shares Gaussian primitives to represent unchanged scene content over different temporal segments, thus effectively reducing the number of Gaussian primitives.\nIn addition, the tree-like structure of the Gaussian hierarchy allows us to efficiently represent the scene at a particular moment with a subset of Gaussian primitives, leading to nearly constant GPU memory usage during the training or rendering regardless of the video length.\nMoreover, we design a compact appearance model that mixes diffuse and view-dependent Gaussians to further minimize the model size while maintaining the rendering quality.\nWe also develop a rasterization pipeline of Gaussian primitives based on the hardware-accelerated technique to improve rendering speed.\nExtensive experimental results demonstrate the superiority of our method over alternative methods in terms of training cost, rendering speed, and storage usage.\nTo our knowledge, this work is the first approach capable of efficiently handling hours of volumetric video data while maintaining state-of-the-art rendering quality.\n\nRegistration Category: Full Access, Full Access Supporter\n\nLanguage Format: English Language\n\nSession Chair: Bernhard Kerbl (Technical University of Vienna)
URL:https://asia.siggraph.org/2024/program/?id=papers_415&sess=sess104
END:VEVENT
END:VCALENDAR