BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Asia/Tokyo
X-LIC-LOCATION:Asia/Tokyo
BEGIN:STANDARD
TZOFFSETFROM:+0900
TZOFFSETTO:+0900
TZNAME:JST
DTSTART:18871231T000000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20250110T023312Z
LOCATION:Hall B7 (1)\, B Block\, Level 7
DTSTART;TZID=Asia/Tokyo:20241204T110800
DTEND;TZID=Asia/Tokyo:20241204T111900
UID:siggraphasia_SIGGRAPH Asia 2024_sess114_papers_508@linklings.com
SUMMARY:CPoser: An Optimization-after-Parsing Approach for Text-to-Pose Ge
 neration Using Large Language Models.
DESCRIPTION:Technical Papers\n\nYumeng Li\, Bohong Chen\, Zhong Ren\, an
 d Yao-Xiang Ding (Zhejiang University)\; Libin Liu (Peking Univer
 sity)\; and Tianjia Shao and Kun Zhou (Zhejiang University)\n\nText-to-po
 se generation is challenging due to the complexity of natural languag
 e and human posture semantics. Utilizing large language models (LLMs) f
 or text-to-pose generation is appealing due to their strong capabiliti
 es in text understanding and reasoning. However\, as LLMs are design
 ed for general-purpose language processing and not specifically train
 ed for pose generation\, it remains nontrivial to generate precise art
 iculation targets for the full body using LLMs directly. To this e
 nd\, we propose CPoser\, a novel approach to harness the power of LL
 Ms for text-to-pose generation\, featuring a prompt parsing stage an
 d a pose optimization stage. The parsing stage utilizes LLMs to turn te
 xt prompts into pose intermediate representations (Pose-IRs) throu
 gh a set of predefined structured queries. These Pose-IRs explicitly d
 escribe specific pose conditions\, such as squatting depth and knee be
 nding angle\, naturally forming an objective function that a target po
 se should satisfy. The optimization stage solves for expressive pos
 es and hand gestures based on the Pose-IR objective function via robu
 st optimization in a quantized pose prior space. The results are furt
 her refined to enhance naturalness and incorporate facial expressi
 ons. Experiments show that our approach effectively understands diver
 se text prompts for pose generation\, surpassing existing text-to-po
 se methods.\n\nRegistration Category: Full Access\, Full Access Suppor
 ter\n\nLanguage Format: English Language\n\nSession Chair: Kai Wang (Ama
 zon)
URL:https://asia.siggraph.org/2024/program/?id=papers_508&sess=sess114
END:VEVENT
END:VCALENDAR