BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Asia/Tokyo
X-LIC-LOCATION:Asia/Tokyo
BEGIN:STANDARD
TZOFFSETFROM:+0900
TZOFFSETTO:+0900
TZNAME:JST
DTSTART:18871231T000000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20250110T023312Z
LOCATION:Hall B5 (2)\, B Block\, Level 5
DTSTART;TZID=Asia/Tokyo:20241204T131100
DTEND;TZID=Asia/Tokyo:20241204T132300
UID:siggraphasia_SIGGRAPH Asia 2024_sess116_papers_884@linklings.com
SUMMARY:InstantDrag: Improving Interactivity in Drag-based Image Editing
DESCRIPTION:Technical Papers\n\nJoonghyuk Shin (Seoul National University), Daehyeon Choi (POSTECH), and Jaesik Park (Seoul National University)\n\nDrag-based image editing has recently gained popularity for its interactivity and precision. However, despite the ability of text-to-image models to generate samples within a second, drag editing still lags behind due to the challenge of accurately reflecting user interaction while maintaining image content. Some existing approaches rely on computationally intensive per-image optimization or intricate guidance-based methods, requiring additional inputs such as masks for movable regions and text prompts, thereby compromising the interactivity of the editing process. We introduce InstantDrag, an optimization-free pipeline that enhances interactivity and speed, requiring only an image and a drag instruction as input. InstantDrag consists of two carefully designed networks: a drag-conditioned optical flow generator (FlowGen) and an optical flow-conditioned diffusion model (FlowDiffusion). InstantDrag learns motion dynamics for drag-based image editing in real-world video datasets by decomposing the task into motion generation and motion-conditioned image generation. We demonstrate InstantDrag's capability to perform fast, photo-realistic edits without masks or text prompts through experiments on facial video datasets and general scenes. These results highlight the efficiency of our approach in handling drag-based image editing, making it a promising solution for interactive, real-time applications.\n\nRegistration Category: Full Access, Full Access Supporter\n\nLanguage Format: English Language\n\nSession Chair: Dani Lischinski (Hebrew University of Jerusalem, Google)
URL:https://asia.siggraph.org/2024/program/?id=papers_884&sess=sess116
END:VEVENT
END:VCALENDAR