BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Australia/Melbourne
X-LIC-LOCATION:Australia/Melbourne
BEGIN:DAYLIGHT
TZOFFSETFROM:+1000
TZOFFSETTO:+1100
TZNAME:AEDT
DTSTART:19721003T020000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=1SU
END:DAYLIGHT
BEGIN:STANDARD
DTSTART:19721003T020000
TZOFFSETFROM:+1100
TZOFFSETTO:+1000
TZNAME:AEST
RRULE:FREQ=YEARLY;BYMONTH=4;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20240214T070241Z
LOCATION:Darling Harbour Theatre\, Level 2 (Convention Centre)
DTSTART;TZID=Australia/Melbourne:20231212T093000
DTEND;TZID=Australia/Melbourne:20231212T124500
UID:siggraphasia_SIGGRAPH Asia 2023_sess209_papers_490@linklings.com
SUMMARY:SAILOR: Synergizing Radiance and Occupancy Fields for Live Human Performance Capture
DESCRIPTION:Technical Papers\n\nZheng Dong (State Key Laboratory of CAD & CG, Zhejiang University); Ke Xu (City University of Hong Kong); Yaoan Gao (State Key Laboratory of CAD & CG, Zhejiang University); Qilin Sun (The Chinese University of Hong Kong, Shenzhen); Hujun Bao and Weiwei Xu (State Key Laboratory of CAD & CG, Zhejiang University); and Rynson W.H. Lau (City University of Hong Kong)\n\nImmersive user experiences in live VR/AR performances require fast and accurate free-view rendering of the performers. Existing methods are mainly based on Pixel-aligned Implicit Functions (PIFu) or Neural Radiance Fields (NeRF). However, while PIFu-based methods usually fail to produce photorealistic view-dependent textures, NeRF-based methods typically lack local geometry accuracy and are computationally heavy (e.g., dense sampling of 3D points, additional fine-tuning, or pose estimation). In this work, we propose a novel generalizable method, named SAILOR, to create photorealistic human free-view videos from very sparse RGBD streams with low latency. To produce photorealistic view-dependent textures while preserving locally accurate geometry, we integrate PIFu and NeRF such that they work synergistically: we condition the PIFu on depth and then render view-dependent textures through NeRF. Specifically, we propose a novel network, named SRONet, for this hybrid representation to reconstruct and render live free-view videos. SRONet can handle unseen performers without fine-tuning. Both geometric and colorimetric supervision signals are exploited to enhance SRONet's capability of capturing high-quality details. Besides, a neural blending-based ray interpolation scheme, a tree-based data structure, and a parallel computing pipeline are incorporated for fast upsampling, efficient point sampling, and acceleration. To evaluate the rendering performance, we construct a real-captured RGBD benchmark from 40 performers. Experimental results show that SAILOR outperforms existing human reconstruction and performance capture methods.\n\nRegistration Category: Full Access, Enhanced Access, Trade Exhibitor, Experience Hall Exhibitor
URL:https://asia.siggraph.org/2023/full-program?id=papers_490&sess=sess209
END:VEVENT
END:VCALENDAR