BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Asia/Tokyo
X-LIC-LOCATION:Asia/Tokyo
BEGIN:STANDARD
TZOFFSETFROM:+0900
TZOFFSETTO:+0900
TZNAME:JST
DTSTART:18871231T000000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20250110T023312Z
LOCATION:Hall B7 (1)\, B Block\, Level 7
DTSTART;TZID=Asia/Tokyo:20241205T163000
DTEND;TZID=Asia/Tokyo:20241205T164100
UID:siggraphasia_SIGGRAPH Asia 2024_sess138_papers_912@linklings.com
SUMMARY:VOODOO XP: Expressive One-Shot Head Reenactment for VR Telepresenc
 e
DESCRIPTION:Technical Papers\n\nPhong Tran (MBZUAI); Egor Zakharov (ETH Zu
 rich); Long-Nhat Ho, Adilbek Karmanov, and Ariana Bermudez Venegas (MBZUAI
 ); McLean Goldwhite, Aviral Agarwal, and Liwen Hu (Pinscreen); Anh Tran (V
 inAI Research); and Hao Li (MBZUAI, Pinscreen)\n\nWe introduce VOODOO XP: 
 a 3D-aware one-shot head reenactment method that can generate highly expre
 ssive facial expressions from any input driver video and a single 2D portr
 ait. Our solution is real-time, view-consistent, and can be instantly used
  without calibration or fine-tuning. We demonstrate our solution on a mono
 cular video setting and an end-to-end VR telepresence system for two-way c
 ommunication. Compared to 2D head reenactment methods, 3D-aware approaches
  aim to preserve the identity of the subject and ensure view-consistent fa
 cial geometry for novel camera poses, which makes them suitable for immers
 ive applications. While various facial disentanglement techniques have bee
 n introduced, cutting-edge 3D-aware neural reenactment techniques still la
 ck expressiveness and fail to reproduce complex and fine-scale facial expr
 essions. We present a novel cross-reenactment architecture that directly t
 ransfers the driver's facial expressions to transformer blocks of the inpu
 t source's 3D lifting module. We show that highly effective disentanglemen
 t is possible using an innovative multi-stage self-supervision approach, w
 hich is based on a coarse-to-fine strategy, combined with an explicit face
  neutralization and 3D lifted frontalization during its initial training s
 tage. We further integrate our novel head reenactment solution into an acc
 essible high-fidelity VR telepresence system, where any person can instant
 ly build a personalized neural head avatar from any photo and bring it to 
 life using the headset. We demonstrate state-of-the-art performance in ter
 ms of expressiveness and likeness preservation on a large set of diverse s
 ubjects and capture conditions.\n\nRegistration Category: Full Access, Ful
 l Access Supporter\n\nLanguage Format: English Language\n\nSession Chair: 
 Hongbo Fu (Hong Kong University of Science and Technology)
URL:https://asia.siggraph.org/2024/program/?id=papers_912&sess=sess138
END:VEVENT
END:VCALENDAR
