BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Australia/Melbourne
X-LIC-LOCATION:Australia/Melbourne
BEGIN:DAYLIGHT
TZOFFSETFROM:+1000
TZOFFSETTO:+1100
TZNAME:AEDT
DTSTART:19721003T020000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=1SU
END:DAYLIGHT
BEGIN:STANDARD
DTSTART:19721003T020000
TZOFFSETFROM:+1100
TZOFFSETTO:+1000
TZNAME:AEST
RRULE:FREQ=YEARLY;BYMONTH=4;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20240214T070241Z
LOCATION:Darling Harbour Theatre\, Level 2 (Convention Centre)
DTSTART;TZID=Australia/Melbourne:20231212T093000
DTEND;TZID=Australia/Melbourne:20231212T124500
UID:siggraphasia_SIGGRAPH Asia 2023_sess209_papers_345@linklings.com
SUMMARY:Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation
DESCRIPTION:Technical Papers\n\nShuai Yang, Yifan Zhou, Ziwei Liu, and Chen Change Loy (Nanyang Technological University, Singapore)\n\nLarge text-to-image diffusion models have exhibited impressive proficiency in generating high-quality images. However, when applying these models to video domain, ensuring temporal consistency across video frames remains a formidable challenge.\nThis paper proposes a novel zero-shot text-guided video-to-video translation framework to adapt image models to videos. The framework includes two parts: key frame translation and full video translation. The first part uses an adapted diffusion model to generate key frames, with hierarchical cross-frame constraints applied to enforce coherence in shapes, textures and colors. The second part propagates the key frames to other frames with temporal-aware patch matching and frame blending. Our framework achieves global style and local texture temporal consistency at a low cost (without re-training or optimization).
  The adaptation is compatible with existing image diffusion techniques, allowing our framework to take advantage of them, such as customizing a specific subject with LoRA, and introducing extra spatial guidance with ControlNet.\nExtensive experimental results demonstrate the effectiveness of our proposed framework over existing methods in rendering high-quality and temporally-coherent videos.\n\nRegistration Category: Full Access, Enhanced Access, Trade Exhibitor, Experience Hall Exhibitor
URL:https://asia.siggraph.org/2023/full-program?id=papers_345&sess=sess209
END:VEVENT
END:VCALENDAR