BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Australia/Melbourne
X-LIC-LOCATION:Australia/Melbourne
BEGIN:DAYLIGHT
TZOFFSETFROM:+1000
TZOFFSETTO:+1100
TZNAME:AEDT
DTSTART:19721003T020000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=1SU
END:DAYLIGHT
BEGIN:STANDARD
DTSTART:19721003T020000
TZOFFSETFROM:+1100
TZOFFSETTO:+1000
TZNAME:AEST
RRULE:FREQ=YEARLY;BYMONTH=4;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20240214T070240Z
LOCATION:Darling Harbour Theatre\, Level 2 (Convention Centre)
DTSTART;TZID=Australia/Melbourne:20231212T093000
DTEND;TZID=Australia/Melbourne:20231212T124500
UID:siggraphasia_SIGGRAPH Asia 2023_sess209_papers_265@linklings.com
SUMMARY:Decaf: Monocular Deformation Capture for Face and Hand Interactions
DESCRIPTION:Technical Papers\n\nSoshi Shimada (Max-Planck-Institut für Informatik; Saarbrücken Research Center for Visual Computing, Interaction and Artificial Intelligence); Vladislav Golyanik (Max-Planck-Institut für Informatik); Patrick Pérez (Valeo); and Christian Theobalt (Max-Planck-Institut für Informatik; Saarbrücken Research Center for Visual Computing, Interaction and Artificial Intelligence)\n\nExisting methods for 3D tracking from monocular RGB videos predominantly consider articulated and rigid objects (e.g., two hands or humans interacting with rigid environments). Modelling dense non-rigid object deformations in this setting (e.g., when hands are interacting with a face) has remained largely unaddressed so far, although such effects can improve the realism of downstream applications such as AR/VR, 3D virtual avatar communications, and character animations. This is due to the severe ill-posedness of the monocular view setting and the associated challenges (e.g., in acquiring a dataset for training and evaluation, or obtaining reasonable non-uniform stiffness values for the deformable object). While it is possible to naïvely track multiple non-rigid objects independently using 3D templates or parametric 3D models, such an approach would suffer from multiple artefacts in the resulting 3D estimates, such as depth ambiguity, unnatural intra-object collisions, and missing or implausible deformations.\n\nHence, this paper introduces the first method that addresses the fundamental challenges described above and that allows tracking human hands interacting with human faces in 3D from single monocular RGB videos. We model hands as articulated objects inducing non-rigid face deformations during an active interaction. Our method relies on a new hand-face motion and interaction capture dataset with realistic face deformations acquired with a markerless multi-view camera system. As a pivotal step in its creation, we process the reconstructed raw 3D shapes with position-based dynamics and an approach for non-uniform stiffness estimation of the head tissues, which results in plausible annotations of the surface deformations, hand-face contact regions, and head-hand positions. At the core of our neural approach are a variational auto-encoder supplying the hand-face depth prior and modules that guide the 3D tracking by estimating the contacts and the deformations.\n\nRegistration Category: Full Access, Enhanced Access, Trade Exhibitor, Experience Hall Exhibitor
URL:https://asia.siggraph.org/2023/full-program?id=papers_265&sess=sess209
END:VEVENT
END:VCALENDAR