BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Australia/Melbourne
X-LIC-LOCATION:Australia/Melbourne
BEGIN:DAYLIGHT
TZOFFSETFROM:+1000
TZOFFSETTO:+1100
TZNAME:AEDT
DTSTART:19721003T020000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=1SU
END:DAYLIGHT
BEGIN:STANDARD
DTSTART:19721003T020000
TZOFFSETFROM:+1100
TZOFFSETTO:+1000
TZNAME:AEST
RRULE:FREQ=YEARLY;BYMONTH=4;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20240214T070240Z
LOCATION:Darling Harbour Theatre\, Level 2 (Convention Centre)
DTSTART;TZID=Australia/Melbourne:20231212T093000
DTEND;TZID=Australia/Melbourne:20231212T124500
UID:siggraphasia_SIGGRAPH Asia 2023_sess209_papers_482@linklings.com
SUMMARY:AvatarStudio: Text-driven Editing of 3D Dynamic Human Head Avatars
DESCRIPTION:Technical Papers\n\nMohit Mendiratta, Xingang Pan, Mohamed Elgharib, Kartik Teotia, and Mallikarjun B R (Max Planck Institute for Informatics); Ayush Tewari (MIT CSAIL); Vladislav Golyanik (Max Planck Institute for Informatics); Adam Kortylewski (Max Planck Institute for Informatics, University of Freiburg); and Christian Theobalt (Max Planck Institute for Informatics)\n\nCapturing and editing full head performances enables the creation of virtual characters with various applications such as extended reality and media production. The past few years have witnessed a steep rise in the photorealism of human head avatars. Such avatars can be controlled through different input data modalities, including RGB, audio, depth, IMUs and others. While these data modalities provide effective means of control, they mostly focus on editing the head movements such as the facial expressions, head pose and/or camera viewpoint. In this paper, we propose AvatarStudio, a text-based method for editing the appearance of a dynamic full head avatar. Our approach builds on existing work to capture dynamic performances of human heads using a neural radiance field (NeRF) and edits this representation with a text-to-image diffusion model. Specifically, we introduce an optimization strategy for incorporating multiple keyframes representing different camera viewpoints and time stamps of a video performance into a single diffusion model. Using this personalized diffusion model, we edit the dynamic NeRF by introducing view-and-time-aware Score Distillation Sampling (VT-SDS) following a model-based guidance approach. Our method edits the full head in a canonical space, and then propagates these edits to the remaining time steps via a pretrained deformation network. We evaluate our method visually and numerically via a user study, and the results show that our method outperforms existing approaches. Our experiments validate the design choices of our method and highlight that our edits are genuine, personalized, as well as 3D- and time-consistent.\n\nRegistration Category: Full Access, Enhanced Access, Trade Exhibitor, Experience Hall Exhibitor
URL:https://asia.siggraph.org/2023/full-program?id=papers_482&sess=sess209
END:VEVENT
END:VCALENDAR