Neurocomputing

Volume 394, 21 June 2020, Pages 178-200
A novel background subtraction algorithm based on parallel vision and Bayesian GANs

https://doi.org/10.1016/j.neucom.2019.04.088

Abstract

To address the challenges of change detection in the wild, we present a novel background subtraction algorithm based on parallel vision and Bayesian generative adversarial networks (GANs). First, we use the median filtering algorithm for background image extraction. Then, we build the background subtraction model by using Bayesian GANs to classify all pixels into foreground and background, and use parallel vision theory to improve the background subtraction results in complex scenes. The proposed algorithm has been evaluated on the well-known, publicly available changedetection.net dataset. Experimental results show that the proposed algorithm achieves better performance than many state-of-the-art methods. In addition, our model trained on the CDnet dataset generalizes very well to unseen datasets, outperforming multiple state-of-the-art methods. The major contribution of this work is to apply parallel vision and Bayesian GANs to solve the difficulties in background subtraction, achieving high detection accuracy.

Introduction

Background subtraction is one of the key technologies in computer vision and video analysis. It is widely used in video monitoring [1], intelligent transportation [2], [3], [4], sports video analysis [5], industrial vision [6] and other fields. Background subtraction aims to extract the foreground from a video sequence. The main problem of background subtraction is to set up a self-adaptive background model that accurately describes the background information [7]. A well-performing background model can effectively describe background changes in both the spatial and temporal domains. This is very challenging, because real background changes are complicated: varying brightness, sudden illumination changes, shadows, swaying leaves, water ripples, intermittent object motion, and so on.

Many researchers have proposed methods for addressing the challenges of background subtraction, mainly based on statistical background modeling. Stauffer and Grimson [8] and Zivkovic [9] first proposed background subtraction based on Gaussian Mixture Models (GMM). GMM-based methods handle background subtraction under some complex conditions and can adapt to slow illumination changes, but they cannot model rapidly changing backgrounds. St-Charles and Bilodeau [10] improved local binary patterns (LBP) by thresholding the comparison between the grey value of the center pixel and its neighboring pixels, yielding local binary similarity patterns (LBSP). The LBSP-based method uses only local texture information; it handles shadows in most cases, but performance degrades when texture information is absent, e.g., in uniformly colored regions. Both the LBSP-based and GMM-based methods give satisfactory results on static backgrounds, but perform poorly on dynamic backgrounds. To deal with dynamic backgrounds, Bianco et al. [11] proposed IUTIS (In Unity There Is Strength), which combines the results of several single-feature algorithms according to their characteristics. Sajid and Cheung [12] presented a complete change detection system named Multimode Background Subtraction (MBS), which creates multiple background models of the scene followed by an initial foreground/background probability estimation for each pixel. Martins et al. [13] proposed BMOG, a background subtraction algorithm based on a boosted, adaptive Gaussian mixture background model. Chen et al. [14] presented ShareModel; by dynamically establishing a many-to-one relationship between pixels and models, this method allows pixels with similar features to share the same model. Wang et al. [15] presented a moving object detection system named Flux Tensor with Split Gaussian models (FTSG), which fuses a motion computation method based on a spatio-temporal tensor formulation, a novel foreground and background modeling scheme, and a multi-cue appearance comparison. These methods share a common limitation: their multi-feature processing is not comprehensive, which leads to poor performance in certain types of scenarios. Sparse signal recovery is another popular approach to background subtraction [16], [17], [18], [19]; it assumes that a new frame can be modeled as a sparse linear combination of several previous frames plus a sparse outlier. Under this principle, the background should be modeled as a sparse linear combination of previous frames, but real-world surveillance video is often corrupted by foreground objects and noise [20].
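As a concrete illustration of the LBSP idea above, the following sketch computes a binary similarity code over a 5×5 neighbourhood: a neighbour contributes bit 1 when its intensity is within a threshold of the centre pixel. The neighbourhood size and threshold value here are illustrative assumptions, not the exact settings of [10]:

```python
import numpy as np

def lbsp(patch, center_rc=(2, 2), threshold=30):
    """Illustrative LBSP-style code over a small patch.

    Each neighbour contributes bit 1 if its absolute intensity difference
    from the centre pixel is within `threshold` (i.e. "similar"), else 0.
    """
    cy, cx = center_rc
    center = int(patch[cy, cx])
    bits = []
    for y in range(patch.shape[0]):
        for x in range(patch.shape[1]):
            if (y, x) == (cy, cx):
                continue  # the centre pixel itself contributes no bit
            bits.append(1 if abs(int(patch[y, x]) - center) <= threshold else 0)
    # Pack the 24 neighbour bits into a single integer descriptor
    code = 0
    for b in bits:
        code = (code << 1) | b
    return code

flat = np.full((5, 5), 100, dtype=np.uint8)  # uniform patch: all neighbours similar
print(lbsp(flat))  # 16777215, i.e. all 24 bits set
```

On a textureless (uniform) patch every bit is set, which hints at why LBSP alone struggles in regions with no texture, as noted above.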

With the development of machine learning, learning-based methods have been widely used for background subtraction. Yan et al. [21] proposed a background modeling method based on local fusion features and variational Bayesian learning. It uses the U-LBSP (uniform local binary similarity patterns) texture feature and updates an LFGMM (Gaussian mixture model based on local fusion features) learned by variational Bayes. Although this makes the background model more reliable and adaptive to most dynamic scenes, its fixed update rate still limits target detection in some special scenarios. Algorithms based on traditional neural networks [22], [23] require the model to be designed to follow a particular factorization, which is useful only for specific scenarios and is therefore not practical.

Bayesian generative adversarial networks (Bayesian GANs) [24] can represent complex nonlinear mapping functions and learn them automatically. Compared with traditional neural network algorithms, Bayesian GANs do not require the model to obey any particular factorization [25]; any generator network and any discriminator network will work [26]. Consequently, Bayesian GANs require neither repeated sampling nor explicit inference during learning, avoiding intractable approximate probability computations [27], [28].
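For reference, the standard GAN minimax objective underlying this discussion is

```latex
\min_G \max_D \; V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\big[\log D(x)\big] +
  \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

where $G$ maps noise $z$ to samples and $D$ scores how likely an input is to be real. Bayesian GANs [24] additionally place priors over the generator and discriminator weights and marginalize them by posterior sampling, rather than fitting a single point estimate of each network.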

Wang et al. [29], [30] presented parallel vision, an extension of the ACP approach [31] to the vision computing field. Parallel vision aims to establish systematic theories and methods for visual perception and understanding of complex scenes. In parallel-vision-based background subtraction, Bayesian GANs generate virtual foreground/background segmentation images, which are combined with source images and real foreground/background segmentation images to train the vision models. This improves the generalization ability of the vision models. Background subtraction involves many complex scenes, such as shadow interference, illumination changes, poor texture and camera jitter; we use parallel vision theory to improve the background subtraction results in these scenes.

To resist the adverse effects of shadow interference, illumination changes, poor texture and camera jitter in object detection, and to improve performance, we propose a background subtraction algorithm based on Bayesian generative adversarial networks and parallel vision theory (BSPVGAN). We first use the median filtering algorithm for background image extraction and then train the network based on a Bayesian GAN. Our work uses the Bayesian GAN to classify each pixel effectively, thereby addressing sudden and slow illumination changes, non-stationary backgrounds, and ghosting. We adopt deep convolutional neural networks to construct the generator and the discriminator of the Bayesian GAN. Experimental results show that the proposed algorithm outperforms other methods in most cases.
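The first stage of this pipeline (median-filter background extraction followed by per-pixel foreground/background classification) can be sketched in simplified form. Here a plain intensity threshold stands in for the Bayesian GAN classifier, and the threshold value is an assumption for illustration:

```python
import numpy as np

def median_background(frames):
    """Temporal median over a stack of frames (T, H, W): a common way to
    recover a static background when each pixel shows background in the
    majority of frames."""
    return np.median(np.stack(frames), axis=0)

def subtract(frame, background, threshold=25):
    """Label a pixel as foreground when it deviates from the background
    model by more than `threshold` grey levels. In the paper this per-pixel
    decision is made by a Bayesian GAN, not a fixed threshold."""
    diff = np.abs(frame.astype(np.int16) - background.astype(np.int16))
    return (diff > threshold).astype(np.uint8)

# Toy sequence: a static background of intensity 50 with a transient object
frames = [np.full((4, 4), 50, dtype=np.uint8) for _ in range(5)]
frames[2][1, 1] = 255          # object visible in one frame only
bg = median_background(frames)  # the median ignores the transient outlier
mask = subtract(frames[2], bg)
print(int(mask.sum()))  # 1: only the object pixel is labelled foreground
```

The temporal median is robust to transient foreground because each pixel only needs to be background in more than half of the sampled frames.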

In short, the contributions of our work are threefold.

  • (1)

    We apply Bayesian GANs to the background subtraction task, which achieves strong robustness when the surveillance scene changes.

  • (2)

    We use a deep convolutional generative adversarial network (DCGAN) [32] to build the Bayesian GAN-based background subtraction model and obtain robust results by training the generative adversarial networks.

  • (3)

    The algorithm is evaluated on the CDnet 2014 [33] dataset and achieves better performance than many existing methods, including GMM-Stauffer [8], GMM-Zivkovic [9], LBSP [10], IUTIS [11], MBS [12], FTSG [15], LFGMM [21], LFVBGM [21], ArunVarghese [34], BMOG [13], DeepBS [22], ShareModel [14], SSOBS [23], WeSamBE [35] and Cascade CNN [36].

To better present our algorithm, the rest of this paper is arranged as follows. Section 2 reviews the theory of Bayesian generative adversarial networks. Section 3 presents our algorithm based on Bayesian generative adversarial networks and parallel vision. Section 4 provides experimental results. Section 5 verifies the generalization performance of the proposed algorithm on unseen datasets. Finally, Section 6 concludes the paper.

Bayesian generative adversarial networks

Deep learning clearly depends on a large amount of labeled data, which has become one of the factors inhibiting its development [37], [38]. For a long time, many researchers have explored using as little labeled data as possible to train vision models, working on a transition from supervised learning to semi-supervised learning, and later to unsupervised learning. In general, we do not have big labeled data as it

Proposed algorithm based on Bayesian generative adversarial networks and parallel vision

Background subtraction is a challenging task in complex scenes. There are many knotty scenes, such as shadow interference, illumination changes, poor texture and camera jitter. We introduce parallel vision theory to address this. Parallel vision is an extension of the ACP approach [31] to the vision computing field. Fig. 1 shows the basic framework and architecture of parallel vision in background subtraction. Parallel vision first constructs virtual foreground/background segmentation images

Experimental evaluation

We evaluate the performance of our algorithm both in terms of its accuracy and runtime. All experiments are conducted using a 4-core PC with an NVIDIA GTX 970 GPU, 16GB of RAM, and Ubuntu 16.

We design the experiments on the CDnet datasets [33]. Our algorithm is compared with some state-of-the-art methods including GMM-Stauffer [8], GMM-Zivkovic [9], LBSP [10], IUTIS [11], MBS [12], FTSG [15], LFGMM [21], LFVBGM [21], ArunVarghese [34], BMOG [13], DeepBS [22], ShareModel [14], SSOBS [23],
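The comparison relies on standard change detection metrics. A minimal sketch of precision, recall and F-measure over binary masks follows (CDnet-style evaluation also reports further measures, such as specificity and percentage of wrong classifications, omitted here):

```python
import numpy as np

def f_measure(pred, gt):
    """Precision, recall and F-measure between binary foreground masks
    (1 = foreground, 0 = background)."""
    tp = np.logical_and(pred == 1, gt == 1).sum()  # true positives
    fp = np.logical_and(pred == 1, gt == 0).sum()  # false positives
    fn = np.logical_and(pred == 0, gt == 1).sum()  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

gt = np.array([[1, 1, 0, 0]])    # ground-truth mask
pred = np.array([[1, 0, 1, 0]])  # one hit, one miss, one false alarm
p, r, f = f_measure(pred, gt)
print(round(p, 2), round(r, 2), round(f, 2))  # 0.5 0.5 0.5
```

The F-measure balances false alarms against missed foreground pixels, which is why it is the headline metric in CDnet comparisons.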

Verification experiment of generalization performance of proposed algorithm

To verify the generalization performance of our proposed algorithm when applied to other complex background environments, we fix the model trained on the CDnet dataset (i.e., without any extra training or fine-tuning) and test its performance on the scenebackgroundmodeling.net dataset [57], UCSD dataset [58], [59] and SBI2015 dataset [60], [61].

Our first experiment is conducted on the scenebackgroundmodeling.net dataset [57]. The dataset includes 79 video sequences with resolution varying from

Conclusion

A new background subtraction algorithm is proposed in this paper. First, we use the median filtering algorithm for background image extraction. Then, we build the background subtraction model by using Bayesian GANs to classify all pixels into foreground and background, and use parallel vision theory to improve the background subtraction results in complex scenes. Experimental results show that the method achieves good change detection performance. More interestingly, our model trained on the CDnet dataset generalizes well to unseen datasets.

Conflict of Interest

None.

Acknowledgments

This work was supported by National Natural Science Foundation of China (61533019, U1811463).

Wenbo Zheng received his bachelor degree in software engineering from Wuhan University of Technology, Wuhan, China, in 2017. He is currently a Ph.D. student in the School of Software Engineering, Xi’an Jiaotong University as well as the State Key Laboratory for Management and Control of Complex Systems, Institute of Automation, Chinese Academy of Sciences. His research interests include computer vision and machine learning.

References (66)

  • A. Khan et al.

    Multilevel Chinese takeaway process and label-based processes for rule induction in the context of automated sports video annotation

    IEEE Trans. Cybern.

    (2014)
  • K. Wang et al.

    A multi-view learning approach to foreground detection for traffic surveillance applications

    IEEE Trans. Veh. Technol.

    (2016)
  • Y. Liu et al.

    Visual tracking based on dynamic coupled conditional random field model

    IEEE Trans. Intell. Transp. Syst.

    (2016)
  • C. Gou et al.

    Vehicle license plate recognition based on class-specific ers and sae-elm

    Proceedings of the 17th International IEEE Conference on Intelligent Transportation Systems (ITSC)

    (2014)
  • P. Napoletano et al.

    Attentive monitoring of multiple video streams driven by a Bayesian foraging strategy

    IEEE Trans. Image Process.

    (2015)
  • J.P. Heather

    Calibration of an industrial vision system using an ellipsoid

    Proceedings of the 2014 IEEE International Conference on Image Processing (ICIP)

    (2014)
  • K. Wang et al.

    M4CD: a robust change detection method for intelligent visual surveillance

    IEEE Access

    (2018)
  • C. Stauffer et al.

    Adaptive background mixture models for real-time tracking

    Proceedings. 1999 IEEE Computer Society Conference on Computer Vision and Pattern Recognition

    (1999)
  • Z. Zivkovic

    Improved adaptive gaussian mixture model for background subtraction

    Proceedings of the 17th International Conference on Pattern Recognition, ICPR

    (2004)
  • P.L. St-Charles et al.

    Improving background subtraction using local binary similarity patterns

    Proceedings of IEEE Winter Conference on Applications of Computer Vision

    (2014)
  • S. Bianco, G. Ciocca, R. Schettini, How far can you get by combining change detection Algorithms? CoRR...
  • H. Sajid et al.

    Universal multimode background subtraction

    IEEE Trans. Image Process.

    (2017)
  • I. Martins, P. Carvalho, L. Corte-Real, J.L. Alba-Castro, BMOG: boosted Gaussian mixture model with controlled...
  • Y. Chen et al.

    Learning sharable models for robust background subtraction

    Proceedings of 2015 IEEE International Conference on Multimedia and Expo (ICME)

    (2015)
  • E.J. Candès et al.

    Robust principal component analysis?

    J. ACM

    (2011)
  • X. Zhou et al.

    Moving object detection by detecting contiguous outliers in the low-rank representation

    IEEE Trans. Pattern Anal. Mach. Intell.

    (2013)
  • B. Xin et al.

    Background subtraction via generalized fused lasso foreground modeling

    Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

    (2015)
  • X. Liu et al.

    Background subtraction based on low-rank and structured sparse decomposition

    IEEE Trans. Image Process.

    (2015)
  • X. Liu et al.

    Background subtraction using spatio-temporal group sparsity recovery

    IEEE Trans. Circuits Syst. Video Technol.

    (2018)
  • J. Yan et al.

    Variational Bayesian learning for background subtraction based on local fusion feature

    IET Comput. Vis.

    (2016)
  • K. Sehairi et al.

    Comparative study of motion detection methods for video surveillance systems

    J. Electron. Imaging

    (2017)
  • M. Babaee et al.

    A deep convolutional neural network for video sequence background subtraction

    Pattern Recognit.

    (2017)
  • Y. Saatchi et al.

    Bayesian GAN

    Proceedings of the 31st International Conference on Neural Information Processing Systems

    (2017)
    Kunfeng Wang received his Ph.D. degree in control theory and control engineering from the Graduate University of Chinese Academy of Sciences, Beijing, China, in 2008. From December 2015 to January 2017, he was a Visiting Scholar at the School of Interactive Computing, Georgia Institute of Technology, Atlanta, GA, USA. He is currently an Associate Professor at the State Key Laboratory for Management and Control of Complex Systems, Institute of Automation, Chinese Academy of Sciences. His research interests include intelligent transportation systems, intelligent vision computing, and machine learning.

    Fei-Yue Wang received his Ph.D. in computer and systems engineering from Rensselaer Polytechnic Institute, Troy, New York in 1990. He joined the University of Arizona in 1990 and became a Professor and Director of the Robotics and Automation Lab (RAL) and Program in Advanced Research for Complex Systems (PARCS). In 1999, he founded the Intelligent Control and Systems Engineering Center at the Institute of Automation, Chinese Academy of Sciences (CAS), Beijing, China, under the support of the Outstanding Oversea Chinese Talents Program from the State Planning Council and “100 Talent Program” from CAS, and in 2002, was appointed as the Director of the Key Lab of Complex Systems and Intelligence Science, CAS. In 2011, he became the State Specially Appointed Expert and the Director of The State Key Laboratory for Management and Control of Complex Systems. Dr. Wang’s current research focuses on methods and applications for parallel systems, social computing, and knowledge automation. He was the Founding Editor-in-Chief of the International Journal of Intelligent Control and Systems (1995–2000), Founding EiC of IEEE ITS Magazine (2006- 2007), EiC of IEEE Intelligent Systems (2009–2012), and EiC of IEEE Transactions on ITS (2009–2016). Currently he is EiC of China’s Journal of Command and Control. Since 1997, he has served as General or Program Chair of more than 20 IEEE, INFORMS, ACM, ASME conferences. He was the President of IEEE ITS Society (2005–2007), Chinese Association for Science and Technology (CAST, USA) in 2005, the American Zhu Kezhen Education Foundation (2007–2008), and the Vice President of the ACM China Council (2010–2011). Since 2008, he is the Vice President and Secretary General of Chinese Association of Automation. Dr. Wang is elected Fellow of IEEE, INCOSE, IFAC, ASME, and AAAS. 
In 2007, he received the 2nd Class National Prize in Natural Sciences of China and awarded the Outstanding Scientist by ACM for his work in intelligent control and social computing. He received IEEE ITS Outstanding Application and Research Awards in 2009 and 2011, and IEEE SMC Norbert Wiener Award in 2014.
