BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Australia/Melbourne
X-LIC-LOCATION:Australia/Melbourne
BEGIN:DAYLIGHT
TZOFFSETFROM:+1000
TZOFFSETTO:+1100
TZNAME:AEDT
DTSTART:19721003T020000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=1SU
END:DAYLIGHT
BEGIN:STANDARD
DTSTART:19721003T020000
TZOFFSETFROM:+1100
TZOFFSETTO:+1000
TZNAME:AEST
RRULE:FREQ=YEARLY;BYMONTH=4;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20260114T163645Z
LOCATION:Meeting Room C4.8\, Level 4 (Convention Centre)
DTSTART;TZID=Australia/Melbourne:20231215T112500
DTEND;TZID=Australia/Melbourne:20231215T114000
UID:siggraphasia_SIGGRAPH Asia 2023_sess156_papers_265@linklings.com
SUMMARY:Decaf: Monocular Deformation Capture for Face and Hand Interaction
 s
DESCRIPTION:Soshi Shimada (Max-Planck-Institut für Informatik; Saarbrück
 en Research Center for Visual Computing, Interaction and Artificial Int
 elligence); Vladislav Golyanik (Max-Planck-Institut für Informatik); Pa
 trick Pérez (Valeo); and Christian Theobalt (Max-Planck-Institut für In
 formatik; Saarbrücken Research Center for Visual Computing, Interaction
  and Artificial Intelligence)\n\nExisting methods for 3D tracking from
  monocular RGB videos predominantly consider articulated and rigid obje
 cts (e.g., two hands or humans interacting with rigid environments). Mo
 delling dense non-rigid object deformations in this setting (e.g., when
  hands interact with a face) has remained largely unaddressed, although
  such effects can improve the realism of downstream applications such a
 s AR/VR, 3D virtual avatar communication, and character animation. This
  is due to the severe ill-posedness of the monocular setting and its as
 sociated challenges (e.g., acquiring a dataset for training and evaluat
 ion, or obtaining a reasonable non-uniform stiffness for the deformable
  object). While it is possible to naïvely track multiple non-rigid obje
 cts independently using 3D templates or parametric 3D models, such an a
 pproach would suffer from multiple artefacts in the resulting 3D estima
 tes, such as depth ambiguity, unnatural intra-object collisions, and mi
 ssing or implausible deformations.\n\nHence, this paper introduces the
  first method that addresses
  the fundamental challenges outlined above and allows tracking human ha
 nds interacting with human faces in 3D from monocular RGB videos. We mo
 del hands as articulated objects inducing non-rigid face deformations d
 uring an active interaction. Our method relies on a new hand-face motio
 n and interaction capture dataset with realistic face deformations acqu
 ired with a markerless multi-view camera system. As a pivotal step in i
 ts creation, we process the reconstructed raw 3D shapes with position-b
 ased dynamics and an approach for non-uniform stiffness estimation of t
 he head tissues, which results in plausible annotations of the surface
  deformations, hand-face contact regions and head-hand positions. At th
 e core of our neural approach are a variational auto-encoder supplying
  the hand-face depth prior and modules that guide the 3D tracking by es
 timating the contacts and the deformations.\n\nRegistration Category: F
 ull Access\n\nSession Chair: Sergi Pujades (National Institute for Rese
 arch in Computer Science and Automation (INRIA), Université Grenoble Al
 pes)\n\n
URL:https://asia.siggraph.org/2023/full-program?id=papers_265&sess=sess156
END:VEVENT
END:VCALENDAR
