BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Asia/Tokyo
X-LIC-LOCATION:Asia/Tokyo
BEGIN:STANDARD
TZOFFSETFROM:+0900
TZOFFSETTO:+0900
TZNAME:JST
DTSTART:18871231T000000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20250110T023309Z
LOCATION:Hall B7 (1)\, B Block\, Level 7
DTSTART;TZID=Asia/Tokyo:20241203T131100
DTEND;TZID=Asia/Tokyo:20241203T132300
UID:siggraphasia_SIGGRAPH Asia 2024_sess105_papers_824@linklings.com
SUMMARY:ReVersion: Diffusion-Based Relation Inversion from Images
DESCRIPTION:Technical Papers\n\nZiqi Huang, Tianxing Wu, Yuming Jiang, Kelvin C.K. Chan, and Ziwei Liu (S-Lab for Advanced Intelligence, Nanyang Technological University Singapore)\n\nDiffusion models have gained increasing popularity for their generative capabilities. Recently, there has been a surging need to generate customized images by inverting diffusion models from exemplar images, and existing inversion methods mainly focus on capturing object appearances (i.e., the "look"). However, how to invert object relations, another important pillar in the visual world, remains unexplored.\nIn this work, we propose the Relation Inversion task, which aims to learn a specific relation (represented as a "relation prompt") from exemplar images. Specifically, we learn a relation prompt with a frozen pre-trained text-to-image diffusion model. The learned relation prompt can then be applied to generate relation-specific images with new objects, backgrounds, and styles.\n\nTo tackle the Relation Inversion task, we propose the ReVersion Framework.\nSpecifically, we propose a novel "relation-steering contrastive learning" scheme to steer the relation prompt towards relation-dense regions and disentangle it from object appearances.\nWe further devise "relation-focal importance sampling" to emphasize high-level interactions over low-level appearances (e.g., texture, color).\nTo comprehensively evaluate this new task, we contribute the ReVersion Benchmark, which provides various exemplar images with diverse relations. Extensive experiments validate the superiority of our approach over existing methods across a wide range of visual relations. Our proposed task and method could serve as inspiration for future research in domains such as generative inversion, few-shot learning, and visual relation detection.\n\nRegistration Category: Full Access, Full Access Supporter\n\nLanguage Format: English Language\n\nSession Chair: Kfir Aberman (Snap)
URL:https://asia.siggraph.org/2024/program/?id=papers_824&sess=sess105
END:VEVENT
END:VCALENDAR