Gesture Generation Via Diffusion Model with Attention Mechanism

Lingling Li, Weicong Li, Qiyuan Ding, Chengpei Tang, Keze Wang

ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Abstract
Generating natural and semantically aligned gestures from speech remains a challenging task in human-computer interaction because of the intricate relationship between speech and gesture. Although recent learning-based methods have made progress, they suffer from limited diversity and fidelity, as well as mismatches between the generated gestures and the semantic and emotional context of the speech, which undermines their effectiveness in conveying information. To address these challenges, this study introduces Gesture Diffusion Attention (GDA), an approach for generating gestures from spoken language. Departing from conventional methods, GDA incorporates a denoising diffusion probabilistic module that progressively transforms a simple probability distribution into a more complex one, yielding natural and diverse gestures. In addition, pretrained fastText models are used for textual feature extraction, and attention mechanisms ensure that the generated gestures align with the speech in both semantic content and emotional nuance. A series of objective experiments validates the proposed approach: GDA generates natural and diverse gestures that accurately and coherently convey the intended information, surpassing benchmarks established by traditional methods. Code is released at https://github.com/LEELLL/GDA-icassp2024.
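To make the abstract's two ingredients concrete, the sketch below shows one plausible way a denoising diffusion step over gesture sequences could be conditioned on word-level text features through cross-attention. This is a minimal illustration, not the authors' released implementation (see the linked repository for that): all names and dimensions here are assumptions, including GestureDenoiser, denoise_step, the joint count n_joints=45, the 300-dimensional fastText word vectors, and the precomputed beta/alpha-bar noise schedules.

import torch
import torch.nn as nn

class GestureDenoiser(nn.Module):
    """Predicts the noise added to a gesture sequence, attending to text features.

    Illustrative module, not the paper's architecture: gesture frames act as
    attention queries over fastText-style word embeddings, which is one way an
    attention mechanism can tie generated motion to semantic content.
    """
    def __init__(self, n_joints=45, d_model=256, d_text=300, n_heads=4, n_steps=1000):
        super().__init__()
        self.in_proj = nn.Linear(n_joints, d_model)    # gesture frames -> model dim
        self.text_proj = nn.Linear(d_text, d_model)    # e.g. 300-d fastText vectors
        self.t_embed = nn.Embedding(n_steps, d_model)  # diffusion timestep embedding
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.out_proj = nn.Linear(d_model, n_joints)   # back to joint space

    def forward(self, noisy_gestures, t, text_feats):
        # noisy_gestures: (B, T_frames, n_joints); t: (B,); text_feats: (B, T_words, d_text)
        h = self.in_proj(noisy_gestures) + self.t_embed(t)[:, None, :]
        ctx = self.text_proj(text_feats)
        attn_out, _ = self.cross_attn(h, ctx, ctx)     # frames query word features
        return self.out_proj(h + attn_out)             # predicted noise, same shape as input

def denoise_step(model, x_t, t, text_feats, betas, alphas_cumprod):
    """One DDPM-style reverse step, assuming precomputed 1-D schedules betas / alphas_cumprod."""
    beta_t = betas[t][:, None, None]
    alpha_t = 1.0 - beta_t
    a_bar_t = alphas_cumprod[t][:, None, None]
    eps = model(x_t, t, text_feats)
    # Posterior mean of the standard DDPM reverse process.
    mean = (x_t - beta_t / torch.sqrt(1.0 - a_bar_t) * eps) / torch.sqrt(alpha_t)
    # Add noise except at the final step (t == 0).
    nonzero = (t > 0).float()[:, None, None]
    return mean + nonzero * torch.sqrt(beta_t) * torch.randn_like(x_t)

Under these assumptions, sampling would start from Gaussian noise and apply denoise_step from t = T-1 down to t = 0; that loop is the "progressive transformation of a simple probability distribution into a more complex one" the abstract describes.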
Keywords
Gesture generation, Cross-Modal, Speech-Driven, Neural networks