BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Asia/Tokyo
X-LIC-LOCATION:Asia/Tokyo
BEGIN:STANDARD
TZOFFSETFROM:+0900
TZOFFSETTO:+0900
TZNAME:JST
DTSTART:18871231T000000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20250110T023313Z
LOCATION:Hall B7 (1)\, B Block\, Level 7
DTSTART;TZID=Asia/Tokyo:20241206T131400
DTEND;TZID=Asia/Tokyo:20241206T132800
UID:siggraphasia_SIGGRAPH Asia 2024_sess147_papers_977@linklings.com
SUMMARY:SIGGesture: Generalized Co-Speech Gesture Synthesis via Semantic I
 njection with Large-Scale Pre-Training Diffusion Models
DESCRIPTION:Technical Papers\n\nQingrong Cheng (Tencent AI Lab, Tencent TI
 MI L1 Studio) and Xu Li and Xinghui Fu (Tencent AI Lab)\n\nThe automated s
 ynthesis of high-quality 3D gestures from speech holds significant value f
 or virtual humans and gaming. Previous methods primarily focus on synchron
 izing gestures with speech rhythm, often neglecting semantic gestures. The
 se semantic gestures are sparse and follow a long-tailed distribution acro
 ss the gesture sequence, making them challenging to learn in an end-to-end
  manner. Additionally, generating rhythmically aligned gestures that gener
 alize well to in-the-wild speech remains a significant challenge. To addre
 ss these issues, we introduce SIGGesture, a novel diffusion-based approach
  for synthesizing realistic gestures that are both high-quality and semant
 ically pertinent. Specifically, we first build a robust diffusion-based
  foundation model for rhythmical gesture synthesis by pre-training it on a
  collected large-scale dataset with pseudo labels. Second, we leverage t
 he powerful generalization capabilities of Large Language Models (LLMs) to
  generate appropriate semantic gestures for various speech transcripts. Fi
 nally, we propose a semantic injection module to infuse semantic informati
 on into the synthesized results during the diffusion reverse process. Exte
 nsive experiments demonstrate that SIGGesture significantly outperforms ex
 isting baselines, exhibiting excellent generalization and controllability.
 \n\nRegistration Category: Full Access, Full Access Supporter\n\nLanguage 
 Format: English Language\n\nSession Chair: Yi Zhou (Adobe)
URL:https://asia.siggraph.org/2024/program/?id=papers_977&sess=sess147
END:VEVENT
END:VCALENDAR