ABSTRACT
In recent years, methods based on computer vision and deep learning have become the mainstream approaches to fire detection. However, the computation cost of 3D convolutional neural networks (CNNs) is prohibitive, which makes it difficult for them to capture the fire regions of videos in time. In this paper, we design a module named channel shuffle module (CSM), based on 2D CNN, to balance computation cost and accuracy. By fusing the RGB frame and the differential frame, CSM improves the ability of 2D CNNs to extract temporal information at much lower cost than methods based on 3D CNNs. Four different structures of CSM are proposed, and we choose the best one based on experimental results. Experiments also show that the performance of TSN and TSM in sequence classification improves with CSM. TSM with CSM achieves an accuracy of 99.2045%, a false positive rate of 0.7890%, and a false negative rate of 0.4530%, which demonstrates the effectiveness of CSM in temporal feature modeling.
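The abstract does not specify the internal structure of CSM, so the following is only a minimal sketch of the general idea of fusing RGB frames with differential frames and then shuffling channels. It assumes a ShuffleNet-style channel shuffle with two groups and a simple consecutive-frame subtraction; the function names `frame_difference`, `channel_shuffle`, and `fuse_rgb_and_difference` are hypothetical and do not come from the paper.

```python
import numpy as np


def frame_difference(frames):
    """Differential frames for a clip of shape (T, C, H, W).

    Assumption: the differential frame is the difference between
    consecutive frames; the first frame has no predecessor, so its
    difference is set to zero.
    """
    diff = np.zeros_like(frames)
    diff[1:] = frames[1:] - frames[:-1]
    return diff


def channel_shuffle(x, groups):
    """Interleave channels across groups (ShuffleNet-style shuffle).

    x has shape (T, C, H, W) and C must be divisible by groups.
    """
    t, c, h, w = x.shape
    if c % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    x = x.reshape(t, groups, c // groups, h, w)
    x = x.transpose(0, 2, 1, 3, 4)  # swap the group and per-group axes
    return x.reshape(t, c, h, w)


def fuse_rgb_and_difference(frames):
    """Stack RGB and differential channels, then shuffle them.

    With 3 RGB channels the result is ordered R, dR, G, dG, B, dB,
    so a following 2D convolution sees appearance and motion cues
    side by side. This ordering is an illustrative assumption, not
    one of the four structures proposed in the paper.
    """
    diff = frame_difference(frames)
    stacked = np.concatenate([frames, diff], axis=1)  # (T, 2C, H, W)
    return channel_shuffle(stacked, groups=2)


if __name__ == "__main__":
    clip = np.random.rand(8, 3, 224, 224).astype(np.float32)
    fused = fuse_rgb_and_difference(clip)
    print(fused.shape)
```

The shuffle itself is only a reshape and a transpose, so it adds no learnable parameters; in this sketch the only extra compute downstream comes from the wider 6-channel input to the 2D backbone.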