BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Asia/Tokyo
X-LIC-LOCATION:Asia/Tokyo
BEGIN:STANDARD
TZOFFSETFROM:+0900
TZOFFSETTO:+0900
TZNAME:JST
DTSTART:18871231T000000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20250110T023312Z
LOCATION:Hall B7 (1)\, B Block\, Level 7
DTSTART;TZID=Asia/Tokyo:20241204T105600
DTEND;TZID=Asia/Tokyo:20241204T110800
UID:siggraphasia_SIGGRAPH Asia 2024_sess114_papers_716@linklings.com
SUMMARY:SGEdit: Bridging LLM with Text2Image Generative Model for Scene Gr
 aph-based Image Editing
DESCRIPTION:Technical Papers\n\nZhiyuan Zhang (City University of Hong Kon
 g), DongDong Chen (Microsoft GenAI), and Jing Liao (City University of Hon
 g Kong)\n\nScene graphs offer a structured, hierarchical representation of
  images, with nodes and edges symbolizing objects and the relationships am
 ong them. They can serve as a natural interface for image editing,
  dramatica
 lly improving precision and flexibility. Leveraging this benefit, we intro
 duce a new framework that integrates a large language model (LLM) with a
  Text2Image generative model for scene graph-based image editing. This
  integrati
 on enables precise modifications at the object level and creative recompos
 ition of scenes without compromising overall image integrity. Our approach
  involves two primary stages: 1) Using an LLM-driven scene parser, we c
 onstruct an image's scene graph, capturing key objects and their interrela
 tionships, as well as parsing fine-grained attributes such as object masks
  and descriptions. These annotations facilitate concept learning with a fi
 ne-tuned diffusion model, representing each object with an optimized token
  and detailed description prompt. 2) During the image editing phase, an
  LLM
  editing controller guides the edits towards specific areas. These edits a
 re then implemented by an attention-modulated diffusion editor, utilizing 
 the fine-tuned model to perform object additions, deletions, replacements,
  and adjustments. Through extensive experiments, we demonstrate that our f
 ramework significantly outperforms existing image editing methods in terms
  of editing precision and scene aesthetics. Our code will be made publicly
  available.\n\nRegistration Category: Full Access, Full Access Supporter\n
 \nLanguage Format: English Language\n\nSession Chair: Kai Wang (Amazon)
URL:https://asia.siggraph.org/2024/program/?id=papers_716&sess=sess114
END:VEVENT
END:VCALENDAR