YOLO-based Adaptive Window Two-stream Convolutional Neural Network for Video Classification

Charles Han, Chao Wang, Evelyn Mei, Joseph Redmon, Santosh Divvala, Zuxuan Wu, Xi Wang, Yu-Gang Jiang, Hao Ye, Xiangyang Xue

Semantic Scholar (2017)

Abstract

Convolutional Neural Networks (CNNs) have been widely adopted for image classification. Given their significant success, more and more researchers have started to apply CNNs to video classification. The main challenge is to capture not only the appearance information present in single, static frames, but also the complex temporal evolution across frames. Among video classification tasks, human action recognition is the key problem.
