BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Australia/Melbourne
X-LIC-LOCATION:Australia/Melbourne
BEGIN:DAYLIGHT
TZOFFSETFROM:+1000
TZOFFSETTO:+1100
TZNAME:AEDT
DTSTART:19721003T020000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=1SU
END:DAYLIGHT
BEGIN:STANDARD
DTSTART:19721003T020000
TZOFFSETFROM:+1100
TZOFFSETTO:+1000
TZNAME:AEST
RRULE:FREQ=YEARLY;BYMONTH=4;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20240214T070248Z
LOCATION:Meeting Room C4.9+C4.10\, Level 4 (Convention Centre)
DTSTART;TZID=Australia/Melbourne:20231214T145000
DTEND;TZID=Australia/Melbourne:20231214T150000
UID:siggraphasia_SIGGRAPH Asia 2023_sess132_papers_430@linklings.com
SUMMARY:Domain-Agnostic Tuning-Encoder for Fast Personalization of Text-To-Image Models
DESCRIPTION:Technical Papers\n\nMoab Arar (Tel-Aviv University); Rinon Gal (Tel Aviv University, NVIDIA Research); Yuval Atzmon (NVIDIA Research); Gal Chechik (NVIDIA Research, Bar-Ilan University); Daniel Cohen-Or (Tel Aviv University); Ariel Shamir (Reichman University (IDC)); and Amit H. Bermano (Tel Aviv University)\n\nText-to-image (T2I) personalization allows users to guide the creative image generation process by combining their own visual concepts in natural language prompts.\nRecently, encoder-based techniques have emerged as an effective approach for T2I personalization, reducing the need for multiple images and long training times.\nHowever, most existing encoders are limited to a single-class domain, which hinders their ability to handle diverse concepts. In this work, we propose a domain-agnostic method that does not require any specialized dataset or prior information about the personalized concepts. We introduce a novel contrastive-based regularization technique to maintain high fidelity to the target concept characteristics while keeping the predicted embeddings close to editable regions of the latent space, by pushing the predicted tokens toward their nearest existing CLIP tokens. Our experimental results demonstrate the effectiveness of our approach and show how the learned tokens are more semantic than tokens predicted by unregularized models. This leads to a better representation that achieves state-of-the-art performance while being more flexible than previous methods.\n\nRegistration Category: Full Access\n\nSession Chair: Jun-Yan Zhu (Carnegie Mellon University)
URL:https://asia.siggraph.org/2023/full-program?id=papers_430&sess=sess132
END:VEVENT
END:VCALENDAR