BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Australia/Melbourne
X-LIC-LOCATION:Australia/Melbourne
BEGIN:DAYLIGHT
TZOFFSETFROM:+1000
TZOFFSETTO:+1100
TZNAME:AEDT
DTSTART:19721003T020000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=1SU
END:DAYLIGHT
BEGIN:STANDARD
DTSTART:19721003T020000
TZOFFSETFROM:+1100
TZOFFSETTO:+1000
TZNAME:AEST
RRULE:FREQ=YEARLY;BYMONTH=4;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20240214T070248Z
LOCATION:Meeting Room C4.9+C4.10\, Level 4 (Convention Centre)
DTSTART;TZID=Australia/Melbourne:20231214T141500
DTEND;TZID=Australia/Melbourne:20231214T143000
UID:siggraphasia_SIGGRAPH Asia 2023_sess132_papers_346@linklings.com
SUMMARY:ProSpect: Prompt Spectrum for Attribute-Aware Personalization of Diffusion Models
DESCRIPTION:Technical Papers\n\nYuxin Zhang (MAIS, Institute of Automation, Chinese Academy of Sciences; School of Artificial Intelligence, University of Chinese Academy of Sciences); Weiming Dong (MAIS, Institute of Automation, Chinese Academy of Sciences; School of Artificial Intelligence, University of Chinese Academy of Sciences); Fan Tang (Institute of Computing Technology, Chinese Academy of Sciences); Nisha Huang (School of Artificial Intelligence, University of Chinese Academy of Sciences; MAIS, Institute of Automation, Chinese Academy of Sciences); Haibin Huang and Chongyang Ma (Kuaishou Technology); Tong-Yee Lee (National Cheng-Kung University); Oliver Deussen (University of Konstanz); and Changsheng Xu (MAIS, Institute of Automation, Chinese Academy of Sciences; School of Artificial Intelligence, University of Chinese Academy of Sciences)\n\nPersonalizing generative models offers a way to guide image generation with user-provided references. Current personalization methods can invert an object or concept into the textual conditioning space and compose new natural sentences for text-to-image diffusion models. However, representing and editing specific visual attributes such as material, style, and layout remains a challenge, leading to a lack of disentanglement and editability. To address this problem, we propose a novel approach that leverages the step-by-step generation process of diffusion models, which generate images from low- to high-frequency information, providing a new perspective on representing, generating, and editing images. We develop Prompt Spectrum Space, an expanded textual conditioning space, and a new image representation method called ProSpect. ProSpect represents an image as a collection of inverted textual token embeddings encoded from per-stage prompts, where each prompt corresponds to a specific generation stage (i.e., a group of consecutive steps) of the diffusion model. Experimental results demonstrate that ProSpect offers better disentanglement and controllability compared to existing methods. We apply ProSpect in various personalized attribute-aware image generation applications, such as image-guided or text-driven manipulation of materials, style, and layout, achieving previously unattainable results from a single image input without fine-tuning the diffusion models. Code: github.com/zyxElsa/ProSpect\n\nRegistration Category: Full Access\n\nSession Chair: Jun-Yan Zhu (Carnegie Mellon University)
URL:https://asia.siggraph.org/2023/full-program?id=papers_346&sess=sess132
END:VEVENT
END:VCALENDAR