BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Australia/Melbourne
X-LIC-LOCATION:Australia/Melbourne
BEGIN:DAYLIGHT
TZOFFSETFROM:+1000
TZOFFSETTO:+1100
TZNAME:AEDT
DTSTART:19721003T020000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=1SU
END:DAYLIGHT
BEGIN:STANDARD
DTSTART:19721003T020000
TZOFFSETFROM:+1100
TZOFFSETTO:+1000
TZNAME:AEST
RRULE:FREQ=YEARLY;BYMONTH=4;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20240214T070248Z
LOCATION:Meeting Room C4.9+C4.10\, Level 4 (Convention Centre)
DTSTART;TZID=Australia/Melbourne:20231214T141500
DTEND;TZID=Australia/Melbourne:20231214T143000
UID:siggraphasia_SIGGRAPH Asia 2023_sess132_papers_346@linklings.com
SUMMARY:ProSpect: Prompt Spectrum for Attribute-Aware Personalization of Diffusion Models
DESCRIPTION:Technical Papers\n\nYuxin Zhang (MAIS, Institute of Automation, Chinese Academy of Sciences; School of Artificial Intelligence, University of Chinese Academy of Sciences); Weiming Dong (MAIS, Institute of Automation, Chinese Academy of Sciences; School of Artificial Intelligence, University of Chinese Academy of Sciences); Fan Tang (Institute of Computing Technology, Chinese Academy of Sciences); Nisha Huang (School of Artificial Intelligence, University of Chinese Academy of Sciences; MAIS, Institute of Automation, Chinese Academy of Sciences); Haibin Huang and Chongyang Ma (Kuaishou Technology); Tong-Yee Lee (National Cheng-Kung University); Oliver Deussen (University of Konstanz); and Changsheng Xu (MAIS, Institute of Automation, Chinese Academy of Sciences; School of Artificial Intelligence, University of Chinese Academy of Sciences)\n\nPersonalizing generative models offers a way to guide image generation with user-provided references. Current personalization methods can invert an object or concept into the textual conditioning space and compose new natural sentences for text-to-image diffusion models. However, representing and editing specific visual attributes such as material, style, and layout remains a challenge, leading to a lack of disentanglement and editability. To address this problem, we propose a novel approach that leverages the step-by-step generation process of diffusion models, which generate images from low- to high-frequency information, providing a new perspective on representing, generating, and editing images. We develop Prompt Spectrum Space, an expanded textual conditioning space, and a new image representation method called ProSpect. ProSpect represents an image as a collection of inverted textual token embeddings encoded from per-stage prompts, where each prompt corresponds to a specific generation stage (i.e., a group of consecutive steps) of the diffusion model. Experimental results demonstrate that ProSpect offers better disentanglement and controllability compared to existing methods. We apply ProSpect in various personalized attribute-aware image generation applications, such as image-guided or text-driven manipulation of materials, style, and layout, achieving previously unattainable results from a single image input without fine-tuning the diffusion models. Code: github.com/zyxElsa/ProSpect\n\nRegistration Category: Full Access\n\nSession Chair: Jun-Yan Zhu (Carnegie Mellon University)
URL:https://asia.siggraph.org/2023/full-program?id=papers_346&sess=sess132
END:VEVENT
END:VCALENDAR