Learning spatial-temporal representation for smoke vehicle detection

Cao, Yichao; Lu, Xiaobo

doi:10.1007/s11042-019-07926-1

Published: 28 June 2019

Volume 78, pages 27871–27889, (2019)

Multimedia Tools and Applications

Abstract

Vehicle exhaust emissions are notoriously harmful to both human health and the environment. Smoke vehicles, which emit excessive levels of visible black smoke, are representative heavily polluting vehicles. Recognizing smoke vehicles in traffic surveillance footage is challenging because smoke varies widely in color and texture and is easily confused with other interference. To address this problem, this paper proposes smoke vehicle detection methods that learn a spatial-temporal representation from image sequences. First, a motion detection algorithm is used to obtain the rear region of each vehicle that needs to be examined. Then, the spatial information of each suspected frame is captured by an Inception V3 convolutional neural network (CNN), and a temporal Multi-Layer Perceptron (MLP) or Long Short-Term Memory (LSTM) network is used to train the smoke vehicle model. The first method jointly models spatial-temporal cues for smoke vehicle detection in video with fully connected layers. The second method learns temporal dependencies between video frames with an LSTM, which can combine image information over a longer span of the video. Experimental results on our dataset show that the LSTM-based model achieves a high accuracy of 97.6875%, a 9.25% improvement over the single-frame model.
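The two temporal heads described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the random `features` array merely stands in for per-frame Inception V3 outputs, the weights are untrained, and the frame count, hidden sizes, and the function names `mlp_head` and `lstm_head` are assumptions chosen for illustration. Only the forward pass is shown.

```python
import numpy as np


def sigmoid(x):
    """Logistic function, used for the gates and the output probability."""
    return 1.0 / (1.0 + np.exp(-x))


def mlp_head(features, w1, b1, w2, b2):
    """Temporal MLP: concatenate all frame features, then apply two dense layers."""
    x = features.reshape(-1)               # (T * D,) joint spatial-temporal vector
    hidden = np.maximum(0.0, w1 @ x + b1)  # ReLU hidden layer
    return sigmoid(w2 @ hidden + b2)       # scalar smoke probability


def lstm_head(features, wx, wh, b, w_out, b_out):
    """LSTM: read frame features in order, classify from the final hidden state."""
    hidden_size = wh.shape[1]
    h = np.zeros(hidden_size)              # hidden state
    c = np.zeros(hidden_size)              # cell state
    for x_t in features:                   # one step per video frame
        gates = wx @ x_t + wh @ h + b      # (4 * hidden_size,)
        i, f, o, g = np.split(gates, 4)    # input, forget, output gates; candidate
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
        g = np.tanh(g)
        c = f * c + i * g                  # update the cell memory
        h = o * np.tanh(c)                 # expose the gated memory
    return sigmoid(w_out @ h + b_out)      # scalar smoke probability


# Hypothetical sizes: 8 frames, 2048-dimensional features per frame.
rng = np.random.default_rng(0)
T, D, H_MLP, H_LSTM = 8, 2048, 32, 16
features = rng.standard_normal((T, D))     # placeholder for CNN features

p_mlp = float(mlp_head(
    features,
    0.01 * rng.standard_normal((H_MLP, T * D)), np.zeros(H_MLP),
    0.01 * rng.standard_normal(H_MLP), 0.0,
))
p_lstm = float(lstm_head(
    features,
    0.01 * rng.standard_normal((4 * H_LSTM, D)),
    0.01 * rng.standard_normal((4 * H_LSTM, H_LSTM)),
    np.zeros(4 * H_LSTM),
    0.01 * rng.standard_normal(H_LSTM), 0.0,
))
print(f"MLP head: {p_mlp:.3f}, LSTM head: {p_lstm:.3f}")
```

The MLP head sees all frames at once as a single long vector, so its input size grows with the clip length. The LSTM head reuses the same weights at every step and carries a cell state forward, which is what lets it combine information across a longer run of frames.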

Acknowledgments

This work was supported by the National Natural Science Foundation of China (No. 61871123), the Key Research and Development Program in Jiangsu Province (No. BE2016739), and a Project Funded by the Priority Academic Program Development of Jiangsu Higher Education Institutions.

Author information

Authors and Affiliations

School of Automation, Southeast University, Nanjing, 210096, China
Yichao Cao & Xiaobo Lu
Key Laboratory of Measurement and Control of Complex Systems of Engineering, Ministry of Education, Southeast University, Nanjing, 210096, China
Yichao Cao & Xiaobo Lu

Authors

Yichao Cao
Xiaobo Lu

Corresponding author

Correspondence to Xiaobo Lu.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Cao, Y., Lu, X. Learning spatial-temporal representation for smoke vehicle detection. Multimed Tools Appl 78, 27871–27889 (2019). https://doi.org/10.1007/s11042-019-07926-1

Received: 23 August 2018
Revised: 07 June 2019
Accepted: 21 June 2019
Published: 28 June 2019
Issue Date: 15 October 2019
DOI: https://doi.org/10.1007/s11042-019-07926-1
