BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Australia/Melbourne
X-LIC-LOCATION:Australia/Melbourne
BEGIN:DAYLIGHT
TZOFFSETFROM:+1000
TZOFFSETTO:+1100
TZNAME:AEDT
DTSTART:19721003T020000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=1SU
END:DAYLIGHT
BEGIN:STANDARD
DTSTART:19721003T020000
TZOFFSETFROM:+1100
TZOFFSETTO:+1000
TZNAME:AEST
RRULE:FREQ=YEARLY;BYMONTH=4;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20240214T070246Z
LOCATION:Meeting Room C4.11\, Level 4 (Convention Centre)
DTSTART;TZID=Australia/Melbourne:20231214T092500
DTEND;TZID=Australia/Melbourne:20231214T094000
UID:siggraphasia_SIGGRAPH Asia 2023_sess124_papers_482@linklings.com
SUMMARY:AvatarStudio: Text-driven Editing of 3D Dynamic Human Head Avatars
DESCRIPTION:Technical Papers, TOG\n\nMohit Mendiratta, Xingang Pan, Mohamed Elgharib, Kartik Teotia, and Mallikarjun B R (Max Planck Institute for Informatics); Ayush Tewari (MIT CSAIL); Vladislav Golyanik (Max Planck Institute for Informatics); Adam Kortylewski (Max Planck Institute for Informatics, University of Freiburg); and Christian Theobalt (Max Planck Institute for Informatics)\n\nCapturing and editing full head performances enables the creation of virtual characters with various applications such as extended reality and media production. The past few years witnessed a steep rise in the photorealism of human head avatars. Such avatars can be controlled through different input data modalities, including RGB, audio, depth, IMUs and others. While these data modalities provide effective means of control, they mostly focus on editing the head movements such as the facial expressions, head pose and/or camera viewpoint. In this paper, we propose AvatarStudio, a text-based method for editing the appearance of a dynamic full head avatar. Our approach builds on existing work to capture dynamic performances of human heads using neural radiance field (NeRF) and edits this representation with a text-to-image diffusion model. Specifically, we introduce an optimization strategy for incorporating multiple keyframes representing different camera viewpoints and time stamps of a video performance into a single diffusion model. Using this personalized diffusion model, we edit the dynamic NeRF by introducing view-and-time-aware Score Distillation Sampling (VT-SDS) following a model-based guidance approach. Our method edits the full head in a canonical space, and then propagates these edits to remaining time steps via a pretrained deformation network. We evaluate our method visually and numerically via a user study, and results show that our method outperforms existing approaches. Our experiments validate the design choices of our method and highlight that our edits are genuine, personalized, as well as 3D- and time-consistent.\n\nRegistration Category: Full Access\n\nSession Chair: Lin Gao (University of Chinese Academy of Sciences)
URL:https://asia.siggraph.org/2023/full-program?id=papers_482&sess=sess124
END:VEVENT
END:VCALENDAR
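
For reference, below is a minimal Python sketch of how the VEVENT above could be read and its TZID-qualified start time converted to UTC using only the standard library. The filename siggraph_session.ics is an assumption, and the simple property split is not a full RFC 5545 parser; a production consumer should use a proper iCalendar library instead.

from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def unfold(text):
    """Undo RFC 5545 line folding: a line beginning with a space or tab
    continues the previous line."""
    lines = []
    for raw in text.splitlines():
        if raw[:1] in (" ", "\t") and lines:
            lines[-1] += raw[1:]
        else:
            lines.append(raw)
    return lines


# "siggraph_session.ics" is a hypothetical filename for the calendar above.
with open("siggraph_session.ics", encoding="utf-8") as fh:
    ics_lines = unfold(fh.read())

# Collect only the properties inside the VEVENT component.
props, in_event = {}, False
for line in ics_lines:
    if line == "BEGIN:VEVENT":
        in_event = True
    elif line == "END:VEVENT":
        in_event = False
    elif in_event and ":" in line:
        name, value = line.split(":", 1)
        props[name] = value

# DTSTART;TZID=Australia/Melbourne:20231214T092500 is local wall-clock time;
# attach the named IANA zone, then convert to UTC.
start_key = next(k for k in props if k.startswith("DTSTART"))
tzid = start_key.split("TZID=", 1)[1] if "TZID=" in start_key else "UTC"
local = datetime.strptime(props[start_key], "%Y%m%dT%H%M%S").replace(
    tzinfo=ZoneInfo(tzid))

print(props["SUMMARY"])
print("local start:", local)
print("UTC start:  ", local.astimezone(timezone.utc))

Run against this file, the sketch would print the session title and show the 09:25 Melbourne start as 22:25 UTC on the previous day, which is the main subtlety a consumer of this feed has to handle.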