BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Australia/Melbourne
X-LIC-LOCATION:Australia/Melbourne
BEGIN:DAYLIGHT
TZOFFSETFROM:+1000
TZOFFSETTO:+1100
TZNAME:AEDT
DTSTART:19721003T020000
RRULE:FREQ=YEARLY;BYMONTH=4;BYDAY=1SU
END:DAYLIGHT
BEGIN:STANDARD
DTSTART:19721003T020000
TZOFFSETFROM:+1100
TZOFFSETTO:+1000
TZNAME:AEST
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20260114T163633Z
LOCATION:Darling Harbour Theatre\, Level 2 (Convention Centre)
DTSTART;TZID=Australia/Melbourne:20231212T093000
DTEND;TZID=Australia/Melbourne:20231212T124500
UID:siggraphasia_SIGGRAPH Asia 2023_sess209_papers_509@linklings.com
SUMMARY:What is the Best Automated Metric for Text to Motion Generation?
DESCRIPTION:Jordan Voas, Yili Wang, Qixing Huang, and Raymond Mooney (University of Texas at Austin)\n\nThere is growing interest in generating skeleton-based human motions from natural language descriptions. While most efforts have focused on developing better neural architectures for this task, there has been no significant work on determining the proper evaluation metric. Human evaluation is the ultimate accuracy measure for this task, and automated metrics should correlate well with human quality judgments. Since descriptions are compatible with many motions, determining the right metric is critical for evaluating and designing effective generative models. This paper systematically studies which metrics best align with human evaluations and proposes new metrics that align even better. Our findings indicate that none of the metrics currently used for this task show even a moderate correlation with human judgments on a sample level. However, for assessing average model performance, commonly used metrics such as R-Precision and less-used coordinate errors show strong correlations. Additionally, several recently developed metrics are not recommended due to their low correlation compared to alternatives.
  We also introduce a novel metric based on a multimodal BERT-like model, MoBERT, which offers strongly human-correlated sample-level evaluations while maintaining near-perfect model-level correlation. Our results demonstrate that this new metric exhibits extensive benefits over all current alternatives.\n\nRegistration Category: Full Access, Enhanced Access, Trade Exhibitor, Experience Hall Exhibitor\n\n
URL:https://asia.siggraph.org/2023/full-program?id=papers_509&sess=sess209
END:VEVENT
END:VCALENDAR