BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Asia/Tokyo
X-LIC-LOCATION:Asia/Tokyo
BEGIN:STANDARD
TZOFFSETFROM:+0900
TZOFFSETTO:+0900
TZNAME:JST
DTSTART:18871231T000000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20250110T023312Z
LOCATION:Hall B5 (2)\, B Block\, Level 5
DTSTART;TZID=Asia/Tokyo:20241205T171600
DTEND;TZID=Asia/Tokyo:20241205T172800
UID:siggraphasia_SIGGRAPH Asia 2024_sess137_papers_572@linklings.com
SUMMARY:Fast High-Resolution Image Synthesis with Latent Adversarial Diffusion Distillation
DESCRIPTION:Technical Papers\n\nAxel Sauer, Frederic Boesel, Tim Dockhorn, Andreas Blattmann, Patrick Esser, and Robin Rombach (Black Forest Labs)\n\nDiffusion models are the main driver of progress in image and video synthesis, but suffer from slow inference speed. Distillation methods, like the recently introduced adversarial diffusion distillation (ADD) aim to shift the model from many-shot to single-step inference, albeit at the cost of expensive and difficult optimization due to its reliance on a fixed pretrained DINOv2 discriminator. We introduce Latent Adversarial Diffusion Distillation (LADD), a novel distillation approach overcoming the limitations of ADD. In contrast to pixel-based ADD, LADD utilizes generative features from pretrained latent diffusion models. This approach simplifies training and enhances performance, enabling high-resolution multi-aspect ratio image synthesis.\nWe apply LADD to Stable Diffusion 3 (8B) to obtain SD3-Turbo, a fast model that matches the performance of state-of-the-art text-to-image generators using only four unguided sampling steps. Moreover, we systematically investigate its scaling behavior and demonstrate LADD's effectiveness in various applications such as image editing and inpainting.\n\nRegistration Category: Full Access, Full Access Supporter\n\nLanguage Format: English Language\n\nSession Chair: Michael Rubinstein (Google)
URL:https://asia.siggraph.org/2024/program/?id=papers_572&sess=sess137
END:VEVENT
END:VCALENDAR