BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Australia/Melbourne
X-LIC-LOCATION:Australia/Melbourne
BEGIN:DAYLIGHT
TZOFFSETFROM:+1000
TZOFFSETTO:+1100
TZNAME:AEDT
DTSTART:19721003T020000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=1SU
END:DAYLIGHT
BEGIN:STANDARD
DTSTART:19721003T020000
TZOFFSETFROM:+1100
TZOFFSETTO:+1000
TZNAME:AEST
RRULE:FREQ=YEARLY;BYMONTH=4;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20240214T070249Z
LOCATION:Meeting Room C4.11\, Level 4 (Convention Centre)
DTSTART;TZID=Australia/Melbourne:20231215T104000
DTEND;TZID=Australia/Melbourne:20231215T105000
UID:siggraphasia_SIGGRAPH Asia 2023_sess135_papers_345@linklings.com
SUMMARY:Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation
DESCRIPTION:Technical Papers\, TOG\n\nShuai Yang\, Yifan Zhou\, Ziwei Liu\, and Chen Change Loy (Nanyang Technological University\, Singapore)\n\nLarge text-to-image diffusion models have exhibited impressive proficiency in generating high-quality images. However\, when applying these models to the video domain\, ensuring temporal consistency across video frames remains a formidable challenge.\nThis paper proposes a novel zero-shot text-guided video-to-video translation framework to adapt image models to videos. The framework includes two parts: key frame translation and full video translation. The first part uses an adapted diffusion model to generate key frames\, with hierarchical cross-frame constraints applied to enforce coherence in shapes\, textures and colors. The second part propagates the key frames to other frames with temporal-aware patch matching and frame blending. Our framework achieves global style and local texture temporal consistency at a low cost (without re-training or optimization). The adaptation is compatible with existing image diffusion techniques\, allowing our framework to take advantage of them\, such as customizing a specific subject with LoRA\, and introducing extra spatial guidance with ControlNet.\nExtensive experimental results demonstrate the effectiveness of our proposed framework over existing methods in rendering high-quality and temporally-coherent videos.\n\nRegistration Category: Full Access\n\nSession Chair: Chongyang Ma (ByteDance)
URL:https://asia.siggraph.org/2023/full-program?id=papers_345&sess=sess135
END:VEVENT
END:VCALENDAR