BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Asia/Tokyo
X-LIC-LOCATION:Asia/Tokyo
BEGIN:STANDARD
TZOFFSETFROM:+0900
TZOFFSETTO:+0900
TZNAME:JST
DTSTART:18871231T000000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20250110T023312Z
LOCATION:Hall B7 (1)\, B Block\, Level 7
DTSTART;TZID=Asia/Tokyo:20241205T165300
DTEND;TZID=Asia/Tokyo:20241205T170500
UID:siggraphasia_SIGGRAPH Asia 2024_sess138_papers_517@linklings.com
SUMMARY:PersonaTalk: Bring Attention to Your Persona in Visual Dubbing
DESCRIPTION:Technical Papers\n\nLonghao Zhang, Shuang Liang, Zhipeng Ge, and Tianshu Hu (Bytedance)\n\nFor audio-driven visual dubbing, it remains a considerable challenge to uphold and highlight speaker's persona while synthesizing accurate lip synchronization. Existing methods fall short of capturing speaker's unique speaking style or preserving facial details. In this paper, we present PersonaTalk, an attention-based two-stage framework, including geometry construction and face rendering, for high-fidelity and personalized visual dubbing. In the first stage, we propose a style-aware audio encoding module that injects speaking style into audio features through a cross-attention layer. The stylized audio features are then used to drive speaker's template geometry to obtain lip-synced geometries. In the second stage, a dual-attention face renderer is introduced to render textures for the target geometries. It consists of two parallel cross-attention layers, namely lip-attention and face-attention, which respectively sample textures from different reference frames to render the entire face. With our innovative design, intricate facial details can be well preserved. Comprehensive experiments and user studies demonstrate our advantages over other state-of-the-art methods in terms of visual quality, lip-sync accuracy and persona preservation. Furthermore, as a person-generic framework, PersonaTalk can achieve competitive performance as state-of-the-art person-specific methods.\n\nRegistration Category: Full Access, Full Access Supporter\n\nLanguage Format: English Language\n\nSession Chair: Hongbo Fu (Hong Kong University of Science and Technology)
URL:https://asia.siggraph.org/2024/program/?id=papers_517&sess=sess138
END:VEVENT
END:VCALENDAR