BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Asia/Tokyo
X-LIC-LOCATION:Asia/Tokyo
BEGIN:STANDARD
TZOFFSETFROM:+0900
TZOFFSETTO:+0900
TZNAME:JST
DTSTART:18871231T000000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20250110T023312Z
LOCATION:Hall B5 (2)\, B Block\, Level 5
DTSTART;TZID=Asia/Tokyo:20241205T151900
DTEND;TZID=Asia/Tokyo:20241205T153100
UID:siggraphasia_SIGGRAPH Asia 2024_sess134_papers_816@linklings.com
SUMMARY:I2VEdit: First-Frame-Guided Video Editing via Image-to-Video Diffusion Models
DESCRIPTION:Technical Papers\n\nWenqi Ouyang (S-Lab for Advanced Intelligence, Nanyang Technological University Singapore); Yi Dong (Nanyang Technological University (NTU)); Lei Yang and Jianlou Si (SenseTime); and Xingang Pan (S-Lab for Advanced Intelligence, Nanyang Technological University Singapore)\n\nThe remarkable generative capabilities of diffusion models have motivated extensive research in both image and video editing. Compared to video editing, which faces additional challenges in the time dimension, image editing has witnessed the development of more diverse, high-quality approaches and more capable software like Photoshop. In light of this gap, we introduce a novel and generic solution that extends the applicability of image editing tools to videos by propagating edits from a single frame to the entire video using a pre-trained image-to-video model. Our method, dubbed I2VEdit, adaptively preserves the visual and motion integrity of the source video depending on the extent of the edits, effectively handling global edits, local edits, and moderate shape changes, which existing methods cannot fully achieve. At the core of our method are two main processes: Coarse Motion Extraction to align basic motion patterns with the original video, and Appearance Refinement for precise adjustments using fine-grained attention matching. We also incorporate a skip-interval strategy to mitigate quality degradation from auto-regressive generation across multiple video clips. Experimental results demonstrate our framework's superior performance in fine-grained video editing, proving its capability to produce high-quality, temporally consistent outputs.\n\nRegistration Category: Full Access, Full Access Supporter\n\nLanguage Format: English Language\n\nSession Chair: Nanxuan Zhao (Adobe Research)
URL:https://asia.siggraph.org/2024/program/?id=papers_816&sess=sess134
END:VEVENT
END:VCALENDAR