BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Asia/Tokyo
X-LIC-LOCATION:Asia/Tokyo
BEGIN:STANDARD
TZOFFSETFROM:+0900
TZOFFSETTO:+0900
TZNAME:JST
DTSTART:18871231T000000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20250110T023312Z
LOCATION:Hall B7 (1)\, B Block\, Level 7
DTSTART;TZID=Asia/Tokyo:20241205T171600
DTEND;TZID=Asia/Tokyo:20241205T172800
UID:siggraphasia_SIGGRAPH Asia 2024_sess138_papers_298@linklings.com
SUMMARY:Follow-Your-Emoji: Fine-Controllable and Expressive Freestyle Portrait Animation
DESCRIPTION:Technical Papers\n\nYue Ma and Hongyu Liu (Hong Kong University of Science and Technology); Hongfa Wang and Heng Pan (Tencent); Yingqing He (Hong Kong University of Science and Technology); Junkun Yuan, Ailing Zeng, and Chengfei Cai (Tencent); Heung-Yeung Shum (Tsinghua University); Wei Liu (Tencent); and Qifeng Chen (Hong Kong University of Science and Technology)\n\nWe present Follow-Your-Emoji, a diffusion-based framework for portrait animation that animates a reference portrait with target landmark sequences. The main challenge of portrait animation is to preserve the identity of the reference portrait and transfer the target expression to this portrait while maintaining temporal consistency and fidelity. To address these challenges, Follow-Your-Emoji equips the powerful Stable Diffusion model with two well-designed techniques. Specifically, we first adopt a new explicit motion signal, namely the expression-aware landmark, to guide the animation process. We find that this landmark not only ensures accurate motion alignment between the reference portrait and the target motion during inference but also increases the ability to portray exaggerated expressions (e.g., large pupil movements) and avoids identity leakage. Then, we propose a facial fine-grained loss to improve the model's ability to perceive subtle expressions and to reconstruct the reference portrait's appearance by using both expression and facial masks. Accordingly, our method demonstrates strong performance in controlling the expressions of freestyle portraits, including real humans, cartoons, sculptures, and even animals. By leveraging a simple and effective progressive generation strategy, we extend our model to stable long-term animation, thus increasing its potential application value. To address the lack of a benchmark in this field, we introduce EmojiBench, a comprehensive benchmark comprising diverse portrait images, driving videos, and landmarks. Extensive evaluations on EmojiBench verify the superiority of Follow-Your-Emoji.\n\nRegistration Category: Full Access, Full Access Supporter\n\nLanguage Format: English Language\n\nSession Chair: Hongbo Fu (Hong Kong University of Science and Technology)
URL:https://asia.siggraph.org/2024/program/?id=papers_298&sess=sess138
END:VEVENT
END:VCALENDAR