BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Asia/Tokyo
X-LIC-LOCATION:Asia/Tokyo
BEGIN:STANDARD
TZOFFSETFROM:+0900
TZOFFSETTO:+0900
TZNAME:JST
DTSTART:18871231T000000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20250110T023312Z
LOCATION:Hall B5 (2)\, B Block\, Level 5
DTSTART;TZID=Asia/Tokyo:20241205T150800
DTEND;TZID=Asia/Tokyo:20241205T151900
UID:siggraphasia_SIGGRAPH Asia 2024_sess134_papers_485@linklings.com
SUMMARY:Lumiere: A Space-Time Diffusion Model for Video Generation
DESCRIPTION:Technical Papers\n\nOmer Bar-Tal (Google Research, Weizmann In
 stitute of Science); Hila Chefer (Google Research, Tel Aviv University); O
 mer Tov, Charles Herrmann, Roni Paiss, Shiran Zada, Ariel Ephrat, Junhwa H
 ur, Guanghui Liu, Amit Raj, Yuanzhen Li, and Michael Rubinstein (Google Re
 search); Tomer Michaeli (Google Research, Technion – Israel Institute of T
 echnology); Oliver Wang and Deqing Sun (Google Research); Tali Dekel (Goog
 le Research, Weizmann Institute of Science); and Inbar Mosseri (Google Res
 earch)\n\nWe introduce Lumiere -- a text-to-video diffusion model designed
  for synthesizing videos that portray realistic, diverse and coherent moti
 on -- a pivotal challenge in video synthesis. To this end, we introduce a 
 Space-Time U-Net architecture that generates the entire temporal duration 
 of the video at once, through a single pass in the model. This is in contr
 ast to existing video models which synthesize distant keyframes followed b
 y temporal super-resolution -- an approach that inherently makes global te
 mporal consistency difficult to achieve. By deploying both spatial and (im
 portantly) temporal down- and up-sampling and leveraging a pre-trained tex
 t-to-image diffusion model, our model learns to directly generate a full-f
 rame-rate, low-resolution video by processing it in multiple space-time sc
 ales. We demonstrate state-of-the-art text-to-video generation results, an
 d show that our design easily facilitates a wide range of content creation
  tasks and video editing applications, including image-to-video, video inp
 ainting, and stylized generation.\n\nRegistration Category: Full Access, F
 ull Access Supporter\n\nLanguage Format: English Language\n\nSession Chair
 : Nanxuan Zhao (Adobe Research)
URL:https://asia.siggraph.org/2024/program/?id=papers_485&sess=sess134
END:VEVENT
END:VCALENDAR
