BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Asia/Tokyo
X-LIC-LOCATION:Asia/Tokyo
BEGIN:STANDARD
TZOFFSETFROM:+0900
TZOFFSETTO:+0900
TZNAME:JST
DTSTART:18871231T000000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20250110T023312Z
LOCATION:Hall B7 (1)\, B Block\, Level 7
DTSTART;TZID=Asia/Tokyo:20241205T110800
DTEND;TZID=Asia/Tokyo:20241205T111900
UID:siggraphasia_SIGGRAPH Asia 2024_sess129_papers_1250@linklings.com
SUMMARY:FürElise: Capturing and Physically Synthesizing Hand Motion of Piano Performance
DESCRIPTION:Technical Papers\n\nRuocheng Wang, Pei Xu, Haochen Shi, Elizabeth Schumann, and C. Karen Liu (Stanford University)\n\nPiano playing requires agile, precise, and coordinated hand control that stretches the limits of dexterity. Hand motion models with the sophistication to accurately recreate piano playing have a wide range of applications in character animation, embodied AI, biomechanics, and VR/AR. In this paper, we construct a first-of-its-kind large-scale dataset that contains approximately 10 hours of 3D hand motion and audio from 15 elite-level pianists playing 153 pieces of classical music. To capture natural performances, we designed a markerless setup in which motions are reconstructed from multi-view videos using state-of-the-art pose estimation models. The motion data is further refined via inverse kinematics using the high-resolution MIDI key-pressing data obtained from sensors in a specialized Yamaha Disklavier piano. Leveraging the collected dataset, we developed a pipeline that can synthesize physically plausible hand motions for musical scores outside of the dataset. Our approach employs a combination of imitation learning and reinforcement learning to obtain policies for physics-based bimanual control involving the interaction between hands and piano keys. To solve the sampling efficiency problem with the large motion dataset, we use a diffusion model to generate natural reference motions, which provide high-level trajectory and fingering (finger order and placement) information. However, the generated reference motion alone does not provide sufficient accuracy for piano performance modeling. We therefore further augment the data by using musical similarity to retrieve similar motions from the captured dataset, boosting the precision of the RL policy. With the proposed method, our model generates natural, dexterous motions that generalize to music outside the training dataset.\n\nRegistration Category: Full Access, Full Access Supporter\n\nLanguage Format: English Language\n\nSession Chair: Yuting Ye (Reality Labs Research, Meta)
URL:https://asia.siggraph.org/2024/program/?id=papers_1250&sess=sess129
END:VEVENT
END:VCALENDAR