BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Asia/Tokyo
X-LIC-LOCATION:Asia/Tokyo
BEGIN:STANDARD
TZOFFSETFROM:+0900
TZOFFSETTO:+0900
TZNAME:JST
DTSTART:18871231T000000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20250110T023312Z
LOCATION:Hall B7 (1)\, B Block\, Level 7
DTSTART;TZID=Asia/Tokyo:20241205T131400
DTEND;TZID=Asia/Tokyo:20241205T132800
UID:siggraphasia_SIGGRAPH Asia 2024_sess132_papers_249@linklings.com
SUMMARY:Monkey See, Monkey Do: Harnessing Self-attention in Motion Diffusion for Zero-shot Motion Transfer
DESCRIPTION:Technical Papers\n\nSigal Raab, Inbar Gat, Nathan Sala, Guy Tevet, and Rotem Shalev-Arkushin (Tel Aviv University); Ohad Fried (Reichman University); and Amit Haim Bermano and Daniel Cohen-Or (Tel Aviv University)\n\nGiven the remarkable results of motion synthesis with diffusion models, a natural question arises: how can we effectively leverage these models for motion editing? Existing diffusion-based motion editing methods overlook the profound potential of the prior embedded within the weights of pre-trained models, which enables manipulating the latent feature space; hence, they primarily center on handling the motion space. In this work, we explore the attention mechanism of pre-trained motion diffusion models. We uncover the roles and interactions of attention elements in capturing and representing intricate human motion patterns, and carefully integrate these elements to transfer a leader motion to a follower one while maintaining the nuanced characteristics of the follower, resulting in zero-shot motion transfer. Manipulating features associated with selected motions allows us to confront a challenge observed in prior motion diffusion approaches, which use general directives (e.g., text, music) for editing, ultimately failing to convey subtle nuances effectively. Our work is inspired by how a monkey closely imitates what it sees while maintaining its unique motion patterns; hence we call it Monkey See, Monkey Do, and dub it MoMo. Employing our technique enables accomplishing tasks such as synthesizing out-of-distribution motions, style transfer, and spatial editing. Furthermore, diffusion inversion is seldom employed for motions; as a result, editing efforts focus on generated motions, limiting the editability of real ones. MoMo harnesses motion inversion, extending its application to both real and generated motions. Experimental results show the advantage of our approach over the current art. In particular, unlike methods tailored for specific applications through training, our approach is applied at inference time, requiring no training. Our webpage, https://monkeyseedocg.github.io, includes links to videos and code.\n\nRegistration Category: Full Access, Full Access Supporter\n\nLanguage Format: English Language\n\nSession Chair: Yi Zhou (Adobe)
URL:https://asia.siggraph.org/2024/program/?id=papers_249&sess=sess132
END:VEVENT
END:VCALENDAR
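
For reference, the entry above is plain iCalendar (RFC 5545) data. A minimal sketch of reading it in Python follows, using the third-party icalendar package (an assumed choice; the filename is hypothetical and any RFC 5545 parser would work):

# Minimal sketch: parse the VEVENT above with the `icalendar` package
# (pip install icalendar). Filename and parser choice are assumptions.
from icalendar import Calendar

with open("papers_249.ics", "rb") as f:
    cal = Calendar.from_ical(f.read())

for event in cal.walk("VEVENT"):
    print(event.get("SUMMARY"))      # session title
    print(event.decoded("DTSTART"))  # start time as a datetime in Asia/Tokyo
    print(event.get("URL"))          # program page link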
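
The abstract describes integrating attention elements so that a "leader" motion is transferred onto a "follower" motion during denoising, but it does not spell out the exact mechanism. As a rough illustration only, one common attention-injection pattern takes queries from the follower stream and keys/values from the leader stream inside a shared self-attention layer; the sketch below is hypothetical (toy weights and shapes, no actual diffusion model) and is not the paper's code:

# Hypothetical sketch of attention injection for motion transfer.
# Queries come from the follower (so it keeps its own characteristics);
# keys/values come from the leader (so its motion content is imitated).
# All tensors are toy stand-ins for per-frame features inside a motion
# diffusion model's self-attention layer.
import torch
import torch.nn.functional as F

def injected_attention(x_follower, x_leader, w_q, w_k, w_v):
    q = x_follower @ w_q                      # follower supplies queries
    k = x_leader @ w_k                        # leader supplies keys
    v = x_leader @ w_v                        # leader supplies values
    scores = q @ k.transpose(-2, -1) / k.shape[-1] ** 0.5
    return F.softmax(scores, dim=-1) @ v      # follower frames attend to leader

d = 64                                        # toy feature width
w_q, w_k, w_v = (torch.randn(d, d) / d ** 0.5 for _ in range(3))
follower = torch.randn(40, d)                 # e.g. features of a stylized walk
leader = torch.randn(40, d)                   # e.g. features of a motion to imitate
print(injected_attention(follower, leader, w_q, w_k, w_v).shape)  # (40, 64)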