BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Asia/Tokyo
X-LIC-LOCATION:Asia/Tokyo
BEGIN:STANDARD
TZOFFSETFROM:+0900
TZOFFSETTO:+0900
TZNAME:JST
DTSTART:18871231T000000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20250110T023312Z
LOCATION:Hall B7 (1)\, B Block\, Level 7
DTSTART;TZID=Asia/Tokyo:20241205T163000
DTEND;TZID=Asia/Tokyo:20241205T164100
UID:siggraphasia_SIGGRAPH Asia 2024_sess138_papers_912@linklings.com
SUMMARY:VOODOO XP: Expressive One-Shot Head Reenactment for VR Telepresence
DESCRIPTION:Technical Papers\n\nPhong Tran (MBZUAI); Egor Zakharov (ETH Zurich); Long-Nhat Ho, Adilbek Karmanov, and Ariana Bermudez Venegas (MBZUAI); McLean Goldwhite, Aviral Agarwal, and Liwen Hu (Pinscreen); Anh Tran (VinAI Research); and Hao Li (MBZUAI, Pinscreen)\n\nWe introduce VOODOO XP: a 3D-aware one-shot head reenactment method that can generate highly expressive facial expressions from any input driver video and a single 2D portrait. Our solution is real-time, view-consistent, and can be instantly used without calibration or fine-tuning. We demonstrate our solution on a monocular video setting and an end-to-end VR telepresence system for two-way communication. Compared to 2D head reenactment methods, 3D-aware approaches aim to preserve the identity of the subject and ensure view-consistent facial geometry for novel camera poses, which makes them suitable for immersive applications. While various facial disentanglement techniques have been introduced, cutting-edge 3D-aware neural reenactment techniques still lack expressiveness and fail to reproduce complex and fine-scale facial expressions. We present a novel cross-reenactment architecture that directly transfers the driver's facial expressions to transformer blocks of the input source's 3D lifting module. We show that highly effective disentanglement is possible using an innovative multi-stage self-supervision approach, which is based on a coarse-to-fine strategy, combined with an explicit face neutralization and 3D lifted frontalization during its initial training stage. We further integrate our novel head reenactment solution into an accessible high-fidelity VR telepresence system, where any person can instantly build a personalized neural head avatar from any photo and bring it to life using the headset. We demonstrate state-of-the-art performance in terms of expressiveness and likeness preservation on a large set of diverse subjects and capture conditions.\n\nRegistration Category: Full Access, Full Access Supporter\n\nLanguage Format: English Language\n\nSession Chair: Hongbo Fu (Hong Kong University of Science and Technology)
URL:https://asia.siggraph.org/2024/program/?id=papers_912&sess=sess138
END:VEVENT
END:VCALENDAR