BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Linklings LLC//NONSGML Linklings//EN
BEGIN:VTIMEZONE
TZID:Australia/Melbourne
X-LIC-LOCATION:Australia/Melbourne
BEGIN:DAYLIGHT
TZOFFSETFROM:+1000
TZOFFSETTO:+1100
TZNAME:AEDT
DTSTART:19721001T020000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=1SU
END:DAYLIGHT
BEGIN:STANDARD
DTSTART:19730401T030000
TZOFFSETFROM:+1100
TZOFFSETTO:+1000
TZNAME:AEST
RRULE:FREQ=YEARLY;BYMONTH=4;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20260114T163652Z
LOCATION:Meeting Room C4.11\, Level 4 (Convention Centre)
DTSTART;TZID=Australia/Melbourne:20231215T104000
DTEND;TZID=Australia/Melbourne:20231215T105000
UID:siggraphasia_SIGGRAPH Asia 2023_sess135_papers_345@linklings.com
SUMMARY:Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation
DESCRIPTION:Shuai Yang, Yifan Zhou, Ziwei Liu, and Chen Change Loy (Nanyan
 g Technological University, Singapore)\n\nLarge text-to-image diffusion mo
 dels have exhibited impressive proficiency in generating high-quality imag
 es. However, when applying these models to the video domain, ensuring temp
 oral consistency across video frames remains a formidable challenge.\nThis
  paper proposes a novel zero-shot text-guided video-to-video translation f
 rame
 work to adapt image models to videos. The framework includes two parts: ke
 y frame translation and full video translation. The first part uses an ada
 pted diffusion model to generate key frames, with hierarchical cross-frame
  constraints applied to enforce coherence in shapes, textures and colors. 
 The second part propagates the key frames to other frames with temporal-aw
 are patch matching and frame blending. Our framework achieves global style
  and local texture temporal consistency at a low cost (without re-training
  or optimization). The adaptation is compatible with existing image diffus
 ion techniques, allowing our framework to take advantage of them, such as 
 customizing a specific subject with LoRA, and introducing extra spatial gu
 idance with ControlNet.\nExtensive experimental results demonstrate the ef
 fectiveness of our proposed framework over existing methods in rendering h
 igh-quality and temporally-coherent videos.\n\nRegistration Category: Full
  Access\n\nSession Chair: Chongyang Ma (ByteDance)\n\n
URL:https://asia.siggraph.org/2023/full-program?id=papers_345&sess=sess135
END:VEVENT
END:VCALENDAR
