BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Australia/Melbourne
X-LIC-LOCATION:Australia/Melbourne
BEGIN:DAYLIGHT
TZOFFSETFROM:+1000
TZOFFSETTO:+1100
TZNAME:AEDT
DTSTART:19721003T020000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=1SU
END:DAYLIGHT
BEGIN:STANDARD
DTSTART:19721003T020000
TZOFFSETFROM:+1100
TZOFFSETTO:+1000
TZNAME:AEST
RRULE:FREQ=YEARLY;BYMONTH=4;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20240214T070241Z
LOCATION:Darling Harbour Theatre\, Level 2 (Convention Centre)
DTSTART;TZID=Australia/Melbourne:20231212T093000
DTEND;TZID=Australia/Melbourne:20231212T124500
UID:siggraphasia_SIGGRAPH Asia 2023_sess209_papers_316@linklings.com
SUMMARY:Scene-aware Activity Program Generation with Language Guidance
DESCRIPTION:Technical Papers\n\nZejia Su (Shenzhen University)\, Qingnan Fan (Vivo)\, Xuelin Chen (Tencent AI Lab)\, Oliver van Kaick (Carleton University)\, and Hui Huang and Ruizhen Hu (Shenzhen University)\n\nWe address the problem of scene-aware activity program generation\, which requires decomposing a given activity task into instructions that can be sequentially performed within a target scene to complete the activity. While existing methods have shown the ability to generate rational or executable programs\, generating programs with both high rationality and executability still remains a challenge. Hence\, we propose a novel method where the key idea is to explicitly combine the language rationality of a powerful language model with dynamic perception of the target scene where instructions are executed\, to generate programs with high rationality and executability. Our method iteratively generates instructions for the activity program. Specifically\, a two-branch feature encoder operates on a language-based and graph-based representation of the current generation progress to extract category-aware language features and instance-aware scene graph features\, respectively. These features are then used by a predictor to generate the next instruction in the program. Subsequently\, another module performs the predicted action and updates the scene for perception in the next iteration. Extensive evaluations are conducted on the VirtualHome-Env dataset\, showing the advantages of our method over previous work. Key algorithmic designs are validated through ablation studies\, and results on other types of inputs are also presented to show the generalizability of our method.\n\nRegistration Category: Full Access\, Enhanced Access\, Trade Exhibitor\, Experience Hall Exhibitor
URL:https://asia.siggraph.org/2023/full-program?id=papers_316&sess=sess209
END:VEVENT
END:VCALENDAR