Abstract
Video foreground extraction has been widely applied in quantitative fields and attracts considerable attention worldwide. Nevertheless, the performance of such a method can easily degrade in cluttered environments. To tackle this problem, global semantics (e.g., background statistics) and local semantics (e.g., boundary areas) can be exploited to better distinguish foreground objects from a complex background. In this paper, we investigate how to effectively leverage these two kinds of semantics. For global semantics, two convolutional modules are designed to take advantage of data-level background priors and feature-level multi-scale characteristics, respectively; for local semantics, a third module is introduced to attend to the semantic edges between foreground and background. The three modules are intertwined with each other, yielding a simple yet effective deep framework named the g\(\mathcal{L}\mathcal{O}\)bal–\(\mathcal{L}\mathcal{O}\)cal Semantics Coupled Network (\(\mathcal{L}\mathcal{O}^2\)Net), which is end-to-end trainable in a scene-specific manner. Benefiting from the \(\mathcal{L}\mathcal{O}^2\)Net, we achieve superior performance on multiple public datasets while requiring less supervision than several state-of-the-art methods.
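The abstract only names the three modules; as a rough illustration of the described coupling (a data-level background prior, a feature-level multi-scale module, and an edge-aware head), here is a minimal PyTorch sketch. All module names, channel widths, and wiring are our own assumptions for exposition, not the authors' actual \(\mathcal{L}\mathcal{O}^2\)Net design.

```python
# Hypothetical sketch of the global-local coupling described in the abstract.
# Module names, channel sizes, and wiring are illustrative assumptions only.
import torch
import torch.nn as nn

class BackgroundPriorModule(nn.Module):
    """Global semantics (data level): fuse the frame with a background prior."""
    def __init__(self, out_ch=32):
        super().__init__()
        # 6 input channels: RGB frame concatenated with an RGB background estimate
        self.conv = nn.Sequential(
            nn.Conv2d(6, out_ch, 3, padding=1), nn.ReLU(inplace=True))

    def forward(self, frame, background):
        return self.conv(torch.cat([frame, background], dim=1))

class MultiScaleModule(nn.Module):
    """Global semantics (feature level): parallel atrous convolutions."""
    def __init__(self, in_ch=32, out_ch=32):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(in_ch, out_ch, 3, padding=d, dilation=d) for d in (1, 2, 4)])
        self.fuse = nn.Conv2d(3 * out_ch, out_ch, 1)

    def forward(self, x):
        return self.fuse(torch.cat([b(x) for b in self.branches], dim=1))

class EdgeAwareHead(nn.Module):
    """Local semantics: jointly predict the foreground mask and its edges."""
    def __init__(self, in_ch=32):
        super().__init__()
        self.mask_head = nn.Conv2d(in_ch, 1, 1)
        self.edge_head = nn.Conv2d(in_ch, 1, 1)

    def forward(self, x):
        return torch.sigmoid(self.mask_head(x)), torch.sigmoid(self.edge_head(x))

class LO2NetSketch(nn.Module):
    """Toy end-to-end composition of the three modules."""
    def __init__(self):
        super().__init__()
        self.prior = BackgroundPriorModule()
        self.multi_scale = MultiScaleModule()
        self.head = EdgeAwareHead()

    def forward(self, frame, background):
        feat = self.multi_scale(self.prior(frame, background))
        return self.head(feat)  # (foreground mask, edge map)

# Scene-specific usage: one such model would be trained per video scene.
model = LO2NetSketch()
mask, edges = model(torch.rand(1, 3, 240, 320), torch.rand(1, 3, 240, 320))
```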
Data Availability Statement
The datasets analyzed during the current study are available at: (1) the CDNet2014 repository, http://changedetection.net/; (2) the SBI2015 repository, https://sbmi2015.na.icar.cnr.it/SBIdataset.html; (3) the UCSD repository, http://www.svcl.ucsd.edu/projects/background_subtraction/.
Funding
This work was supported in part by the Talent Fund of Beijing Jiaotong University (2022RC012), the National Natural Science Foundation of China (52202486, 52072026), and the Science and Technology Innovation Project of Shuohuang Railway Development Co., Ltd. under China Energy (GJNY-21-65).
Ethics declarations
Competing Interests
The authors declare that they have no competing interests.
Financial Interests
The authors declare that they have no financial interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Ruan, T., Wei, S., Zhao, Y. et al. \(\mathcal{L}\mathcal{O}^2\)net: Global–Local Semantics Coupled Network for scene-specific video foreground extraction with less supervision. Pattern Anal Applic 26, 1671–1683 (2023). https://doi.org/10.1007/s10044-023-01193-5