Frame Level Emotion Guided Dynamic Facial Expression Recognition with Emotion Grouping.

CVPR Workshops (2023)

Abstract
Facial expression recognition (FER) has received considerable attention in computer vision, particularly in "in-the-wild" settings such as human-computer interaction and video understanding. Recognizing dynamic facial expressions in videos is generally considered more practical and reliable than recognizing them in still images. However, dynamic FER in videos poses challenges in both data acquisition and the structure of the learning model. In particular, video frames that deviate from the target facial expression class can significantly degrade the performance of dynamic FER. In this paper, we present an affectivity extraction network (AEN) for dynamic FER. AEN combines features of different semantic levels and, using emotion grouping, classifies both sentiment and specific emotion categories. To address the challenges of dynamic FER, we propose frame-level emotion-guided loss functions and a matching model structure. The AEN has two branches: a bottom-up branch that learns facial expression representations at different semantic levels and outputs pseudo labels of facial expressions for each frame using a 2D FER model, and a top-down branch that learns discriminative representations by combining the feature vectors of each semantic level to recognize facial expressions at the corresponding emotion group. Additionally, the proposed frame-level emotion-guided loss functions encourage AEN to prevent the loss of emotional information and retain the emotional probability of a video clip. Experimental results on various video datasets show that the proposed AEN consistently outperforms the state of the art in Ekman and sentiment FER. Representative results demonstrate the promise of the proposed AEN for dynamic FER in videos.
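To make the ideas of emotion grouping and a clip-level "emotional probability" concrete, here is a minimal sketch. It is not the AEN itself and does not reproduce the paper's loss functions. It assumes that frame-level pseudo labels arrive as probability vectors, that the clip-level distribution is their plain average, and that specific emotions map to sentiment groups through the hypothetical `GROUPS` table below. The label order, the grouping (including placing surprise under positive), and the function names are illustrative assumptions rather than details taken from the abstract.

```python
from typing import Dict, List, Sequence

# Hypothetical label set and grouping, chosen only for illustration.
EMOTIONS: List[str] = [
    "anger", "disgust", "fear", "happiness", "sadness", "surprise", "neutral",
]
GROUPS: Dict[str, List[str]] = {
    "positive": ["happiness", "surprise"],
    "negative": ["anger", "disgust", "fear", "sadness"],
    "neutral": ["neutral"],
}


def clip_emotion_probs(frame_probs: Sequence[Sequence[float]]) -> List[float]:
    """Average per-frame probability vectors into one clip-level distribution."""
    if not frame_probs:
        raise ValueError("at least one frame is required")
    n_frames = len(frame_probs)
    return [
        sum(frame[i] for frame in frame_probs) / n_frames
        for i in range(len(EMOTIONS))
    ]


def group_probs(emotion_probs: Sequence[float]) -> Dict[str, float]:
    """Sum the probabilities of the emotions that belong to each group."""
    index = {name: i for i, name in enumerate(EMOTIONS)}
    return {
        group: sum(emotion_probs[index[name]] for name in members)
        for group, members in GROUPS.items()
    }


def dominant(probs: Dict[str, float]) -> str:
    """Return the key with the highest probability."""
    return max(probs, key=probs.get)
```

In this sketch, a clip in which most frames look happy but one frame drifts toward sadness still yields a clip-level distribution dominated by the positive group, because averaging dilutes the deviating frame. That is one simple way to picture why frame-level information can guide clip-level recognition.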
Keywords
AEN, corresponding emotion group, different semantic levels, dynamic facial expressions, dynamic FER problem, emotion grouping, emotional information, emotional probability, facial expressions representation, frame level emotion guided dynamic facial expression recognition, frame-level emotion-guided loss functions, human-computer interaction, learning model, particular frames, semantic level, sentiment, specific emotion categories, structural aspect, target facial expression class, video clip, video datasets, video frames