Text-Driven Traffic Anomaly Detection with Temporal High-Frequency Modeling in Driving Videos
CoRR (2024)
Abstract
Traffic anomaly detection (TAD) in driving videos is critical for ensuring
the safety of autonomous driving and advanced driver assistance systems.
Previous single-stage TAD methods primarily rely on frame prediction, making
them vulnerable to interference from dynamic backgrounds induced by the rapid
movement of the dashboard camera. While two-stage TAD methods appear to be a
natural solution to mitigate such interference by pre-extracting
background-independent features (such as bounding boxes and optical flow) using
perceptual algorithms, they are susceptible to the performance of first-stage
perceptual algorithms and may result in error propagation. In this paper, we
introduce TTHF, a novel single-stage method aligning video clips with text
prompts, offering a new perspective on traffic anomaly detection. Unlike
previous approaches, the supervised signal of our method is derived from
languages rather than orthogonal one-hot vectors, providing a more
comprehensive representation. Further, concerning visual representation, we
propose to model the high frequency of driving videos in the temporal domain.
This modeling captures the dynamic changes of driving scenes, enhances the
perception of driving behavior, and significantly improves the detection of
traffic anomalies. In addition, to better perceive various types of traffic
anomalies, we carefully design an attentive anomaly focusing mechanism that
visually and linguistically guides the model to adaptively focus on the visual
context of interest, thereby facilitating the detection of traffic anomalies.
It is shown that our proposed TTHF achieves promising performance,
outperforming state-of-the-art competitors by +5.4% and
achieving high generalization on the DADA dataset.
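The abstract names two core ideas: modeling temporal high-frequency content of driving videos and scoring clips by aligning them with text prompts. A minimal sketch of both is below, with assumptions made explicit: the high-frequency component is approximated here by consecutive-frame differences (a common stand-in for temporal high-frequency energy; the paper's exact formulation is not given in this abstract), and the alignment step is a generic CLIP-style softmax over cosine similarities between a video embedding and two hypothetical text-prompt embeddings ("normal driving" vs. "traffic anomaly"). The function names and the two-prompt setup are illustrative, not the authors' API.

```python
import numpy as np

def temporal_high_frequency(frames):
    """Approximate the temporal high-frequency component of a clip as
    consecutive-frame differences. This is an illustrative proxy, not
    the paper's exact modeling."""
    frames = np.asarray(frames, dtype=np.float32)
    return frames[1:] - frames[:-1]  # (T-1, H, W) temporal gradients

def anomaly_score(video_emb, normal_text_emb, anomaly_text_emb):
    """CLIP-style scoring: compare a video-clip embedding against
    'normal driving' and 'traffic anomaly' text-prompt embeddings,
    then softmax the cosine similarities. Returns the probability
    mass assigned to the anomaly prompt."""
    def cos(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    sims = np.array([cos(video_emb, normal_text_emb),
                     cos(video_emb, anomaly_text_emb)])
    probs = np.exp(sims) / np.exp(sims).sum()
    return probs[1]
```

In this sketch, a clip whose embedding lies closer to the anomaly prompt than to the normal-driving prompt receives a score above 0.5; the language-derived supervision the abstract describes would replace the one-hot labels of conventional classifiers with such prompt embeddings.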
Keywords
Traffic anomaly detection, multi-modality learning, high frequency, attention