Abstract
Background subtraction is a fundamental and challenging task in computer vision that aims to segment moving objects from the background. Recently, the attention mechanism has become a popular topic in neural network research, and algorithms based on encoder-decoder and multi-scale networks have achieved impressive results in background subtraction. In this paper, we propose a multi-scale inputs and labels (MSIL) model based on an encoder-decoder network and channel attention. The multi-scale fusion encoding (MSFE) module exploits multi-scale inputs effectively, fusing high-level and low-level feature details. The channel attention (CA) module connects the encoder and decoder to model channel-wise attention. The multi-label supervision decoding (MLSD) module learns richer hierarchical features and achieves better performance through the new multi-label supervision. The proposed model is evaluated on the CDnet-2014 and LASIESTA datasets, where it demonstrates its effectiveness and superiority with average F-Measures of 0.9851 and 0.9633, respectively. In addition, scene-independent evaluation experiments on the CDnet-2014 dataset demonstrate the effectiveness of the model on unseen videos.
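A common realization of the channel attention idea mentioned above is the squeeze-and-excitation pattern: globally pool each channel, pass the pooled vector through a small bottleneck, and use a sigmoid gate to rescale the channels. The NumPy sketch below illustrates that pattern only; the weight shapes, reduction ratio, and function name `channel_attention` are illustrative assumptions, not the paper's exact CA module.

```python
import numpy as np

def channel_attention(features, w1, w2):
    """Squeeze-and-excitation style channel attention (illustrative sketch).

    features: array of shape (C, H, W)
    w1: squeeze weights of shape (C, C // r)  -- r is a reduction ratio
    w2: excitation weights of shape (C // r, C)
    Returns the input with each channel rescaled by a learned gate in (0, 1).
    """
    # Squeeze: global average pooling over the spatial dimensions -> (C,)
    z = features.mean(axis=(1, 2))
    # Excitation: bottleneck MLP, ReLU then sigmoid gate -> (C,)
    s = np.maximum(z @ w1, 0.0)
    gate = 1.0 / (1.0 + np.exp(-(s @ w2)))
    # Recalibrate: scale each channel map by its gate value
    return features * gate[:, None, None]

# Minimal usage with random weights (C=8, H=W=4, reduction ratio r=4)
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 4, 4))
w1 = rng.standard_normal((8, 2))
w2 = rng.standard_normal((2, 8))
y = channel_attention(x, w1, w2)
```

In the paper's architecture this gating sits between encoder and decoder, letting the network emphasize the feature channels most informative for separating foreground from background.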
Data availability
All data generated or analyzed during this study are included in this article.
Acknowledgements
The authors would like to thank the providers of the CDnet-2014 and LASIESTA datasets, which allowed us to train and evaluate the proposed model.
Funding
This work was supported by the National Natural Science Foundation of China under Grants 61674049 and U19A2053, and by the Fundamental Research Funds for the Central Universities of China under Grant JZ2021HGQA0262.
Author information
Authors and Affiliations
Contributions
YY supervised the project; DL and XL mainly conducted experiments, and collected and analyzed the data; ZZ and GX provided guidance in the algorithms and experiments; YY, DL and XL wrote the main manuscript; All authors discussed the results, commented on and revised the manuscript.
Corresponding authors
Ethics declarations
Conflict of interest
The authors declare that they have no competing interests.
Ethical approval
Not applicable.
Consent to participate
Not applicable.
Consent for publication
Not applicable.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Yang, Y., Li, D., Li, X. et al. A multi-scale inputs and labels model for background subtraction. SIViP 17, 4133–4141 (2023). https://doi.org/10.1007/s11760-023-02645-5