BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Australia/Melbourne
X-LIC-LOCATION:Australia/Melbourne
BEGIN:DAYLIGHT
TZOFFSETFROM:+1000
TZOFFSETTO:+1100
TZNAME:AEDT
DTSTART:19721003T020000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=1SU
END:DAYLIGHT
BEGIN:STANDARD
DTSTART:19721003T020000
TZOFFSETFROM:+1100
TZOFFSETTO:+1000
TZNAME:AEST
RRULE:FREQ=YEARLY;BYMONTH=4;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20240214T070244Z
LOCATION:Meeting Room C4.9+C4.10\, Level 4 (Convention Centre)
DTSTART;TZID=Australia/Melbourne:20231213T142500
DTEND;TZID=Australia/Melbourne:20231213T144000
UID:siggraphasia_SIGGRAPH Asia 2023_sess164_papers_490@linklings.com
SUMMARY:SAILOR: Synergizing Radiance and Occupancy Fields for Live Human Performance Capture
DESCRIPTION:Technical Papers, TOG\n\nZheng Dong (State Key Laboratory of CAD & CG, Zhejiang University); Ke Xu (City University of Hong Kong); Yaoan Gao (State Key Laboratory of CAD & CG, Zhejiang University); Qilin Sun (The Chinese University of Hong Kong, Shenzhen); Hujun Bao and Weiwei Xu (State Key Laboratory of CAD & CG, Zhejiang University); and Rynson W.H. Lau (City University of Hong Kong)\n\nImmersive user experiences in live VR/AR performances require fast and accurate free-view rendering of the performers. Existing methods are mainly based on Pixel-aligned Implicit Functions (PIFu) or Neural Radiance Fields (NeRF). However, while PIFu-based methods usually fail to produce photorealistic view-dependent textures, NeRF-based methods typically lack local geometry accuracy and are computationally heavy (e.g., dense sampling of 3D points, additional fine-tuning, or pose estimation). In this work, we propose a novel generalizable method, named SAILOR, to create photorealistic human free-view videos from very sparse RGBD streams with low latency. To produce photorealistic view-dependent textures while preserving locally accurate geometry, we integrate PIFu and NeRF such that they work synergistically by conditioning the PIFu on depth and then rendering view-dependent textures through NeRF. Specifically, we propose a novel network, named SRONet, for this hybrid representation to reconstruct and render live free-view videos. SRONet can handle unseen performers without fine-tuning. Both geometric and colorimetric supervision signals are exploited to enhance SRONet's capability of capturing high-quality details. In addition, a neural blending-based ray interpolation scheme, a tree-based data structure, and a parallel computing pipeline are incorporated for fast upsampling, efficient point sampling, and acceleration. To evaluate the rendering performance, we construct a real-captured RGBD benchmark from 40 performers. Experimental results show that SAILOR outperforms existing human reconstruction and performance capture methods.\n\nRegistration Category: Full Access\n\nSession Chair: Parag Chaudhuri (Indian Institute of Technology Bombay)
URL:https://asia.siggraph.org/2023/full-program?id=papers_490&sess=sess164
END:VEVENT
END:VCALENDAR
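
The entry above follows RFC 5545 (iCalendar): one property per content line, with lines longer than 75 octets folded onto a continuation line that begins with a single space or tab. The sketch below is illustrative only and is not tied to this file's producer or to any particular iCalendar library; it uses the Python standard library to unfold such a stream and read the VEVENT's SUMMARY, DTSTART, and LOCATION. The file name is a placeholder, and timezone handling (the TZID parameter and the VTIMEZONE rules) is deliberately omitted.

from datetime import datetime

# Hypothetical path; substitute wherever this entry was saved.
ICS_TEXT = open("siggraph_asia_2023_papers_490.ics", encoding="utf-8").read()

def unfold(text):
    """Undo RFC 5545 line folding: a continuation line begins with one space or tab."""
    lines = []
    for raw in text.splitlines():
        if raw[:1] in (" ", "\t") and lines:
            lines[-1] += raw[1:]          # drop the single leading whitespace of the fold
        else:
            lines.append(raw)
    return lines

def vevent_properties(lines):
    """Yield (name, value) pairs for the properties of the first VEVENT component."""
    inside = False
    for line in lines:
        if line.startswith("BEGIN:VEVENT"):
            inside = True
        elif line.startswith("END:VEVENT"):
            break
        elif inside and ":" in line:
            head, value = line.split(":", 1)        # first ':' separates name(;params) from value
            yield head.split(";", 1)[0].upper(), value

event = dict(vevent_properties(unfold(ICS_TEXT)))
start = datetime.strptime(event["DTSTART"], "%Y%m%dT%H%M%S")   # wall-clock time in the event's TZID
print(event["SUMMARY"])
print(start, "-", event["LOCATION"].replace("\\,", ","))       # unescape the ICS "\," sequence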