BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Asia/Tokyo
X-LIC-LOCATION:Asia/Tokyo
BEGIN:STANDARD
TZOFFSETFROM:+0900
TZOFFSETTO:+0900
TZNAME:JST
DTSTART:18871231T000000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20250110T023313Z
LOCATION:Hall B5 (2)\, B Block\, Level 5
DTSTART;TZID=Asia/Tokyo:20241206T105600
DTEND;TZID=Asia/Tokyo:20241206T110800
UID:siggraphasia_SIGGRAPH Asia 2024_sess143_papers_468@linklings.com
SUMMARY:GaussianHeads: End-to-End Learning of Drivable Gaussian Head Avatars from Coarse-to-fine Representations
DESCRIPTION:Technical Papers\n\nKartik Teotia (Max Planck Institute for Informatics, Saarland Informatics Campus); Hyeongwoo Kim (Imperial College London); Pablo Garrido (Flawless AI); Marc Habermann (Max Planck Institute for Informatics, Saarland Informatics Campus); Mohamed Elgharib (Max Planck Institute for Informatics); and Christian Theobalt (Max Planck Institute for Informatics, Saarland Informatics Campus)\n\nReal-time rendering of human head avatars is a cornerstone of many computer graphics applications, such as augmented reality, video games, and films, to name a few. Recent approaches address this challenge with computationally efficient geometry primitives in a carefully calibrated multi-view setup. Although they produce photorealistic head renderings, they often fail to represent complex motion changes such as the mouth interior and strongly varying head poses. We propose a new method to generate highly dynamic and deformable human head avatars from multi-view imagery in real time. At the core of our method is a hierarchical representation of head models that allows us to capture the complex dynamics of facial expressions and head movements. First, with rich facial features extracted from raw input frames, we learn to deform the coarse facial geometry of the template mesh. We then initialize 3D Gaussians on the deformed surface and refine their positions in a fine step. We train this coarse-to-fine facial avatar model, along with the head pose as a learnable parameter, in an end-to-end framework. This enables not only controllable facial animation via video inputs, but also high-fidelity novel view synthesis of challenging facial expressions, such as tongue deformations and fine-grained teeth structure under large motion changes. Moreover, it encourages the learned head avatar to generalize to new facial expressions and head poses at inference time. We demonstrate the performance of our method in comparisons against related methods on different datasets, spanning challenging facial expression sequences across multiple identities. We also show the potential of our approach by demonstrating a cross-identity facial performance transfer application.\n\nRegistration Category: Full Access, Full Access Supporter\n\nLanguage Format: English Language\n\nSession Chair: Iain Matthews (Epic Games, Carnegie Mellon University)
URL:https://asia.siggraph.org/2024/program/?id=papers_468&sess=sess143
END:VEVENT
END:VCALENDAR