Self-Supervised Facial Representation Learning with Facial Region Awareness
CVPR 2024
Abstract
Self-supervised pre-training has proven effective in learning
transferable representations that benefit various visual tasks. This paper asks:
can self-supervised pre-training learn general facial
representations for various facial analysis tasks? Recent efforts toward this
goal are limited to treating each face image as a whole, i.e., learning
consistent facial representations at the image level, which overlooks the
consistency of local facial representations (i.e., facial regions such as the eyes
and nose). In this work, we make a first attempt at a
self-supervised facial representation learning framework, Facial Region
Awareness (FRA), that learns consistent global and local facial representations.
Specifically, we explicitly enforce the consistency of facial regions by
matching the local facial representations across views, which are extracted
with learned heatmaps highlighting the facial regions. Inspired by the mask
prediction in supervised semantic segmentation, we obtain the heatmaps via
cosine similarity between the per-pixel projection of feature maps and facial
mask embeddings computed from learnable positional embeddings, which leverage
the attention mechanism to globally look up the facial image for facial
regions. To learn such heatmaps, we formulate the learning of facial mask
embeddings as a deep clustering problem by assigning the pixel features from
the feature maps to them. Transfer learning results on facial
classification and regression tasks show that FRA outperforms previous
pre-trained models and, more importantly, with ResNet as the unified backbone
across tasks, FRA achieves performance comparable to or better than
state-of-the-art (SOTA) methods in facial analysis tasks.
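
The heatmap step described in the abstract (cosine similarity between per-pixel projections and facial mask embeddings, followed by extraction of local representations) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, tensor shapes, the clipping of negative similarities before pooling, and the hard argmax assignment standing in for the clustering step are all assumptions made for illustration.

```python
import numpy as np

def region_heatmaps(pixel_proj, mask_emb, eps=1e-8):
    """Cosine similarity between every pixel projection and every mask embedding.

    pixel_proj: (H, W, D) per-pixel projection of the feature maps.
    mask_emb:   (K, D) facial mask embeddings, one per region.
    Returns heatmaps of shape (K, H, W) with values in [-1, 1].
    """
    p = pixel_proj / (np.linalg.norm(pixel_proj, axis=-1, keepdims=True) + eps)
    m = mask_emb / (np.linalg.norm(mask_emb, axis=-1, keepdims=True) + eps)
    return np.einsum("hwd,kd->khw", p, m)

def pool_regions(features, heatmaps, eps=1e-8):
    """Heatmap-weighted average of features, giving one vector per region.

    features: (H, W, C); heatmaps: (K, H, W). Returns (K, C).
    Assumption: only positive similarities are used as pooling weights.
    """
    w = np.clip(heatmaps, 0.0, None)
    w = w / (w.sum(axis=(1, 2), keepdims=True) + eps)
    return np.einsum("khw,hwc->kc", w, features)

def assign_pixels(heatmaps):
    """Hard-assign each pixel to its most similar mask embedding.

    Assumption: a simplified stand-in for the clustering assignment.
    Returns an (H, W) array of region indices.
    """
    return heatmaps.argmax(axis=0)
```

Under these assumptions, the local representations returned by `pool_regions` for two augmented views of the same face would be the quantities matched across views, while `assign_pixels` gives a rough picture of which pixels each mask embedding claims.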