Cross-Modal Attentive Recalibration and Dynamic Fusion for Multispectral Pedestrian Detection

Pattern Recognition and Computer Vision, PRCV 2023, Part I (2024)

Abstract
Multispectral pedestrian detection provides accurate and reliable results by combining color and thermal modalities and has drawn much attention. However, how to effectively capture and leverage complementary information across modalities for superior performance remains a core issue. This paper presents a Cross-Modal Attentive Recalibration and Dynamic Fusion Network (CMRF-Net) that adaptively recalibrates and dynamically fuses multi-modal features from multiple perspectives. CMRF-Net consists of a Cross-Modal Attentive Feature Recalibration (CAFR) module and a Multi-Modal Dynamic Feature Fusion (MDFF) module in each feature-extraction stage. The CAFR module recalibrates features by fully leveraging local and global complementary information in the spatial and channel dimensions, leading to better cross-modal feature alignment and extraction. The MDFF module adopts dynamically learned convolutions to further exploit complementary information in kernel space, enabling more efficient multi-modal feature aggregation. Extensive experiments on three multispectral datasets demonstrate the effectiveness and generalization of the proposed method and its state-of-the-art detection performance. In particular, CMRF-Net achieves a 2.3% mAP gain over the baseline on the FLIR dataset.
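The abstract gives no implementation details; as a rough illustration of the channel-wise side of cross-modal attentive recalibration, the sketch below gates each stream's channels with a descriptor pooled from both modalities, so each stream is rescaled using the other's global context. All names, shapes, and the gating form are assumptions for illustration, not the authors' actual CMRF-Net/CAFR design.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cross_modal_recalibrate(f_rgb, f_th, w_rgb, w_th):
    """Hypothetical channel-wise cross-modal recalibration sketch.

    f_rgb, f_th: (C, H, W) feature maps from the color and thermal streams.
    w_rgb, w_th: (C, 2C) weights of assumed learned gating layers.
    """
    # Global average pooling gives one descriptor per channel per modality;
    # concatenating them exposes joint (cross-modal) global context.
    d = np.concatenate([f_rgb.mean(axis=(1, 2)), f_th.mean(axis=(1, 2))])  # (2C,)
    # Sigmoid gates derived from BOTH modalities rescale each channel.
    g_rgb = sigmoid(w_rgb @ d)  # (C,)
    g_th = sigmoid(w_th @ d)    # (C,)
    return f_rgb * g_rgb[:, None, None], f_th * g_th[:, None, None]

# Toy usage: C=4 channels, 8x8 maps, small random gating weights.
rng = np.random.default_rng(0)
f_rgb = rng.standard_normal((4, 8, 8))
f_th = rng.standard_normal((4, 8, 8))
w_rgb = rng.standard_normal((4, 8)) * 0.1
w_th = rng.standard_normal((4, 8)) * 0.1
r_rgb, r_th = cross_modal_recalibrate(f_rgb, f_th, w_rgb, w_th)
print(r_rgb.shape, r_th.shape)
```

A full CAFR module would additionally include a spatial-attention branch and local (per-position) context; this sketch covers only the global channel-wise path.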
Keywords
Multispectral pedestrian detection, Cross-modal attentive feature recalibration, Multi-modal dynamic feature fusion