BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Australia/Melbourne
X-LIC-LOCATION:Australia/Melbourne
BEGIN:DAYLIGHT
TZOFFSETFROM:+1000
TZOFFSETTO:+1100
TZNAME:AEDT
DTSTART:19721003T020000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=1SU
END:DAYLIGHT
BEGIN:STANDARD
DTSTART:19721003T020000
TZOFFSETFROM:+1100
TZOFFSETTO:+1000
TZNAME:AEST
RRULE:FREQ=YEARLY;BYMONTH=4;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20240214T070244Z
LOCATION:Meeting Room C4.9+C4.10\, Level 4 (Convention Centre)
DTSTART;TZID=Australia/Melbourne:20231213T142500
DTEND;TZID=Australia/Melbourne:20231213T144000
UID:siggraphasia_SIGGRAPH Asia 2023_sess164_papers_490@linklings.com
SUMMARY:SAILOR: Synergizing Radiance and Occupancy Fields for Live Human Performance Capture
DESCRIPTION:Technical Papers, TOG\n\nZheng Dong (State Key Laboratory of CAD & CG, Zhejiang University); Ke Xu (City University of Hong Kong); Yaoan Gao (State Key Laboratory of CAD & CG, Zhejiang University); Qilin Sun (The Chinese University of Hong Kong, Shenzhen); Hujun Bao and Weiwei Xu (State Key Laboratory of CAD & CG, Zhejiang University); and Rynson W.H. Lau (City University of Hong Kong)\n\nImmersive user experiences in live VR/AR performances require fast and accurate free-view rendering of the performers. Existing methods are mainly based on Pixel-aligned Implicit Functions (PIFu) or Neural Radiance Fields (NeRF). However, while PIFu-based methods usually fail to produce photorealistic view-dependent textures, NeRF-based methods typically lack local geometry accuracy and are computationally heavy (e.g., dense sampling of 3D points, additional fine-tuning, or pose estimation). In this work, we propose a novel generalizable method, named SAILOR, to create photorealistic human free-view videos from very sparse RGBD streams with low latency. To produce photorealistic view-dependent textures while preserving locally accurate geometry, we integrate PIFu and NeRF such that they work synergistically by conditioning the PIFu on depth and then rendering view-dependent textures through NeRF. Specifically, we propose a novel network, named SRONet, for this hybrid representation to reconstruct and render live free-view videos. SRONet can handle unseen performers without fine-tuning. Both geometric and colorimetric supervision signals are exploited to enhance SRONet's capability of capturing high-quality details. In addition, a neural blending-based ray interpolation scheme, a tree-based data structure, and a parallel computing pipeline are incorporated for fast upsampling, efficient point sampling, and acceleration. To evaluate the rendering performance, we construct a real-captured RGBD benchmark from 40 performers. Experimental results show that SAILOR outperforms existing human reconstruction and performance capture methods.\n\nRegistration Category: Full Access\n\nSession Chair: Parag Chaudhuri (Indian Institute of Technology Bombay)
URL:https://asia.siggraph.org/2023/full-program?id=papers_490&sess=sess164
END:VEVENT
END:VCALENDAR
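
The entry above follows RFC 5545 (iCalendar): one property per content line, with lines longer than 75 octets folded onto a continuation line that begins with a single space or tab. The sketch below is illustrative only and is not tied to this file's producer or to any particular iCalendar library; it uses the Python standard library to unfold such a stream and read the VEVENT's SUMMARY, DTSTART, and LOCATION. The file name is a placeholder, and timezone handling (the TZID parameter and the VTIMEZONE rules) is deliberately omitted.

from datetime import datetime

# Hypothetical path; substitute wherever this entry was saved.
ICS_TEXT = open("siggraph_asia_2023_papers_490.ics", encoding="utf-8").read()

def unfold(text):
    """Undo RFC 5545 line folding: a continuation line begins with one space or tab."""
    lines = []
    for raw in text.splitlines():
        if raw[:1] in (" ", "\t") and lines:
            lines[-1] += raw[1:]          # drop the single leading whitespace of the fold
        else:
            lines.append(raw)
    return lines

def vevent_properties(lines):
    """Yield (name, value) pairs for the properties of the first VEVENT component."""
    inside = False
    for line in lines:
        if line.startswith("BEGIN:VEVENT"):
            inside = True
        elif line.startswith("END:VEVENT"):
            break
        elif inside and ":" in line:
            head, value = line.split(":", 1)        # first ':' separates name(;params) from value
            yield head.split(";", 1)[0].upper(), value

event = dict(vevent_properties(unfold(ICS_TEXT)))
start = datetime.strptime(event["DTSTART"], "%Y%m%dT%H%M%S")   # wall-clock time in the event's TZID
print(event["SUMMARY"])
print(start, "-", event["LOCATION"].replace("\\,", ","))       # unescape the ICS "\," sequence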