BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Australia/Melbourne
X-LIC-LOCATION:Australia/Melbourne
BEGIN:DAYLIGHT
TZOFFSETFROM:+1000
TZOFFSETTO:+1100
TZNAME:AEDT
DTSTART:19721003T020000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=1SU
END:DAYLIGHT
BEGIN:STANDARD
DTSTART:19721003T020000
TZOFFSETFROM:+1100
TZOFFSETTO:+1000
TZNAME:AEST
RRULE:FREQ=YEARLY;BYMONTH=4;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20260114T163632Z
LOCATION:Darling Harbour Theatre\, Level 2 (Convention Centre)
DTSTART;TZID=Australia/Melbourne:20231212T093000
DTEND;TZID=Australia/Melbourne:20231212T124500
UID:siggraphasia_SIGGRAPH Asia 2023_sess209_papers_316@linklings.com
SUMMARY:Scene-aware Activity Program Generation with Language Guidance
DESCRIPTION:Zejia Su (Shenzhen University), Qingnan Fan (Vivo), Xuelin Chen (Tencent AI Lab), Oliver van Kaick (Carleton University), and Hui Huang and Ruizhen Hu (Shenzhen University)\n\nWe address the problem of scene-aware activity program generation, which requires decomposing a given activity task into instructions that can be sequentially performed within a target scene to complete the activity. While existing methods have shown the ability to generate rational or executable programs, generating programs with both high rationality and executability remains a challenge. Hence, we propose a novel method whose key idea is to explicitly combine the language rationality of a powerful language model with dynamic perception of the target scene where instructions are executed, to generate programs with high rationality and executability. Our method iteratively generates instructions for the activity program. Specifically, a two-branch feature encoder operates on a language-based and a graph-based representation of the current generation progress to extract category-aware language features and instance-aware scene graph features, respectively. These features are then used by a predictor to generate the next instruction in the program. Subsequently, another module performs the predicted action and updates the scene for perception in the next iteration. Extensive evaluations are conducted on the VirtualHome-Env dataset, showing the advantages of our method over previous work. Key algorithmic designs are validated through ablation studies, and results on other types of inputs are also presented to show the generalizability of our method.\n\nRegistration Category: Full Access, Enhanced Access, Trade Exhibitor, Experience Hall Exhibitor\n\n
URL:https://asia.siggraph.org/2023/full-program?id=papers_316&sess=sess209
END:VEVENT
END:VCALENDAR