BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Australia/Melbourne
X-LIC-LOCATION:Australia/Melbourne
BEGIN:DAYLIGHT
TZOFFSETFROM:+1000
TZOFFSETTO:+1100
TZNAME:AEDT
DTSTART:19721003T020000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=1SU
END:DAYLIGHT
BEGIN:STANDARD
DTSTART:19721003T020000
TZOFFSETFROM:+1100
TZOFFSETTO:+1000
TZNAME:AEST
RRULE:FREQ=YEARLY;BYMONTH=4;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20240214T070249Z
LOCATION:Meeting Room C4.11\, Level 4 (Convention Centre)
DTSTART;TZID=Australia/Melbourne:20231215T104000
DTEND;TZID=Australia/Melbourne:20231215T105000
UID:siggraphasia_SIGGRAPH Asia 2023_sess135_papers_345@linklings.com
SUMMARY:Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation
DESCRIPTION:Technical Papers, TOG\n\nShuai Yang, Yifan Zhou, Ziwei Liu, an
 d Chen Change Loy (Nanyang Technological University, Singapore)\n\nLarge t
 ext-to-image diffusion models have exhibited impressive proficiency in gen
 erating high-quality images. However, when applying these models to the 
 video domain, ensuring temporal consistency across video frames remains a
  formid
 able challenge.\nThis paper proposes a novel zero-shot text-guided video-t
 o-video translation framework to adapt image models to videos. The framewo
 rk includes two parts: key frame translation and full video translation. T
 he first part uses an adapted diffusion model to generate key frames, with
  hierarchical cross-frame constraints applied to enforce coherence in shap
 es, textures and colors. The second part propagates the key frames to othe
 r frames with temporal-aware patch matching and frame blending. Our framew
 ork achieves global style and local texture temporal consistency at a low 
 cost (without re-training or optimization). The adaptation is compatible w
 ith existing image diffusion techniques, allowing our framework to take ad
 vantage of them, such as customizing a specific subject with LoRA, and int
 roducing extra spatial guidance with ControlNet.\nExtensive experimental r
 esults demonstrate the effectiveness of our proposed framework over existi
 ng methods in rendering high-quality and temporally coherent videos.\n\nRe
 gistration Category: Full Access\n\nSession Chair: Chongyang Ma (ByteDance
 )
URL:https://asia.siggraph.org/2023/full-program?id=papers_345&sess=sess135
END:VEVENT
END:VCALENDAR