BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Australia/Melbourne
X-LIC-LOCATION:Australia/Melbourne
BEGIN:DAYLIGHT
TZOFFSETFROM:+1000
TZOFFSETTO:+1100
TZNAME:AEDT
DTSTART:19721003T020000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=1SU
END:DAYLIGHT
BEGIN:STANDARD
DTSTART:19721003T020000
TZOFFSETFROM:+1100
TZOFFSETTO:+1000
TZNAME:AEST
RRULE:FREQ=YEARLY;BYMONTH=4;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20240214T070250Z
LOCATION:Meeting Room C4.8\, Level 4 (Convention Centre)
DTSTART;TZID=Australia/Melbourne:20231215T112500
DTEND;TZID=Australia/Melbourne:20231215T114000
UID:siggraphasia_SIGGRAPH Asia 2023_sess156_papers_265@linklings.com
SUMMARY:Decaf: Monocular Deformation Capture for Face and Hand Interactions
DESCRIPTION:Technical Communications, Technical Papers\n\nSoshi Shimada (Max-Planck-Institut für Informatik; Saarbrücken Research Center for Visual Computing, Interaction and Artificial Intelligence); Vladislav Golyanik (Max-Planck-Institut für Informatik); Patrick Pérez (Valeo); and Christian Theobalt (Max-Planck-Institut für Informatik; Saarbrücken Research Center for Visual Computing, Interaction and Artificial Intelligence)\n\nExisting methods for 3D tracking from monocular RGB videos predominantly consider articulated and rigid objects (e.g., two hands, or humans interacting with rigid environments). Modelling dense non-rigid object deformations in this setting (e.g., when hands are interacting with a face) has remained largely unaddressed, although such effects can improve the realism of downstream applications such as AR/VR, 3D virtual avatar communication and character animation. This is due to the severe ill-posedness of the monocular setting and its associated challenges (e.g., acquiring a dataset for training and evaluation, or obtaining reasonable non-uniform stiffness values for the deformable object). While it is possible to naïvely track multiple non-rigid objects independently using 3D templates or parametric 3D models, such an approach suffers from multiple artefacts in the resulting 3D estimates, such as depth ambiguity, unnatural intra-object collisions, and missing or implausible deformations.\n\nHence, this paper introduces the first method that addresses the fundamental challenges described above and allows tracking human hands interacting with human faces in 3D from single monocular RGB videos. We model hands as articulated objects inducing non-rigid face deformations during an active interaction. Our method relies on a new hand-face motion and interaction capture dataset with realistic face deformations acquired with a markerless multi-view camera system. As a pivotal step in its creation, we process the reconstructed raw 3D shapes with position-based dynamics and an approach for non-uniform stiffness estimation of the head tissues, which results in plausible annotations of the surface deformations, hand-face contact regions and head-hand positions. At the core of our neural approach are a variational auto-encoder supplying the hand-face depth prior, and modules that guide the 3D tracking by estimating the contacts and the deformations.\n\nRegistration Category: Full Access\n\nSession Chair: Sergi Pujades (National Institute for Research in Computer Science and Automation (INRIA), Université Grenoble Alpes)
URL:https://asia.siggraph.org/2023/full-program?id=papers_265&sess=sess156
END:VEVENT
END:VCALENDAR
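
For reference, a minimal sketch of how an entry like the one above can be consumed programmatically, assuming it is saved as decaf_session.ics and the third-party icalendar package is installed (the filename is illustrative, not part of the feed):

# Minimal parsing sketch, not an official Linklings tool.
# Assumes: pip install icalendar, and the entry above saved as decaf_session.ics.
from icalendar import Calendar

with open("decaf_session.ics", "rb") as f:
    cal = Calendar.from_ical(f.read())

# Walk every VEVENT and print the properties used in the entry above.
for event in cal.walk("VEVENT"):
    print("Summary: ", event.get("SUMMARY"))
    print("Location:", event.get("LOCATION"))
    # decoded() returns timezone-aware datetimes; the TZID here
    # (Australia/Melbourne) is an IANA name, so it resolves cleanly.
    print("Starts:  ", event.decoded("DTSTART"))
    print("Ends:    ", event.decoded("DTEND"))
    print("URL:     ", event.get("URL"))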