A novel background subtraction algorithm based on parallel vision and Bayesian GANs
Introduction
Background subtraction is one of the key technologies in computer vision and video analysis. It is widely used in video monitoring [1], intelligent transportation [2], [3], [4], sports video analysis [5], industrial vision [6] and other fields. Background subtraction aims to extract the foreground from a video sequence. The main problem of background subtraction is to set up a self-adaptive background model that accurately describes the background information [7]. A well-performing background model can effectively describe background changes in both the spatial and temporal domains. This is very challenging, because real background changes are complicated: varying brightness, sudden illumination changes, shadows, swaying leaves, water ripples, intermittent motion, and so on.
Many researchers have proposed methods to address the challenges of background subtraction, mainly based on statistical background modeling. Stauffer and Grimson [8] and Zivkovic [9] first proposed background subtraction based on Gaussian Mixture Models (GMM). GMM-based methods handle background subtraction under some complex conditions and can adapt to slow illumination changes, but they cannot model rapidly changing backgrounds. St-Charles and Bilodeau [10] improved local binary patterns (LBP) by thresholding the comparison between the grey value of the center pixel and its neighboring pixels, yielding local binary similarity patterns (LBSP). The LBSP-based method uses only local texture information; it handles shadows in most cases, but its performance degrades when texture information is absent, such as in regions of uniform color. Both the LBSP-based and GMM-based methods give satisfactory results on static backgrounds, but perform poorly on dynamic backgrounds. To handle dynamic backgrounds, Bianco et al. [11] proposed IUTIS (In Unity There Is Strength), which combines the results of several single-feature algorithms according to their characteristics. Sajid and Cheung [12] presented a complete change detection system named Multimode Background Subtraction (MBS), which creates multiple background models of the scene followed by an initial foreground/background probability estimation for each pixel. Martins et al. [13] proposed BMOG, a background subtraction algorithm built on an adaptive Mixture-of-Gaussians background model. Chen et al. [14] presented a background subtraction algorithm named ShareModel; by dynamically establishing a many-to-one relationship between pixels and models, it allows pixels with similar features to share the same model. Wang et al. [15] presented a moving object detection system named Flux Tensor with Split Gaussian models (FTSG), which fuses a motion computation method based on a spatio-temporal tensor formulation, a novel foreground and background modeling scheme, and a multi-cue appearance comparison. A common limitation of these methods is that their multi-feature processing is not comprehensive, so each performs poorly in certain types of scenes. Sparse signal recovery is another popular approach to background subtraction [16], [17], [18], [19]; it assumes that a new frame can be modeled as a sparse linear combination of several previous frames plus a sparse outlier. According to this principle, the background should be modeled as a sparse linear combination of previous frames, but real-world surveillance video is often corrupted by foreground and noise [20].
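The per-pixel Gaussian modeling idea behind the GMM family can be sketched with a simplified single-Gaussian variant: each pixel keeps a running mean and variance, flags large deviations as foreground, and updates its statistics only where the pixel still matches the model. This is a pared-down illustration of the principle in [8], [9], not the full mixture algorithm; the learning rate `alpha` and threshold `k` are illustrative choices.

```python
import numpy as np

def update_background(mean, var, frame, alpha=0.05, k=2.5):
    """One step of a per-pixel running-Gaussian background model.

    A pixel is foreground when it deviates from the background mean by
    more than k standard deviations; background statistics are updated
    with learning rate alpha (simplified single-Gaussian variant of the
    mixture models in [8], [9]).
    """
    frame = frame.astype(np.float64)
    # Foreground mask: pixels far from the modeled background.
    fg = np.abs(frame - mean) > k * np.sqrt(var)
    # Update mean and variance only where the pixel matched the model.
    bg = ~fg
    mean = np.where(bg, (1 - alpha) * mean + alpha * frame, mean)
    var = np.where(bg, (1 - alpha) * var + alpha * (frame - mean) ** 2, var)
    return mean, np.maximum(var, 1e-6), fg

# Toy sequence: static background of value 100 with one bright 2x2 blob.
h, w = 8, 8
mean = np.full((h, w), 100.0)
var = np.full((h, w), 4.0)
frame = np.full((h, w), 100.0)
frame[2:4, 2:4] = 200.0          # foreground object
mean, var, fg = update_background(mean, var, frame)
print(fg.sum())                  # 4 pixels flagged as foreground
```

The full GMM keeps several such Gaussians per pixel and ranks them by weight, which is what lets it absorb multimodal backgrounds such as swaying leaves.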
With the development of machine learning, machine learning methods have been widely used for background subtraction. Yan et al. [21] proposed a background modeling method based on local fusion features and variational Bayes. It uses the U-LBSP (uniform local binary similarity patterns) texture feature and updates an LFGMM (Gaussian mixture model based on local fusion features) learned by variational Bayes. Although this makes background modeling more reliable and adaptive to most dynamic scenes, the fixed background update rate still limits target detection in some special scenarios. Algorithms based on traditional neural networks [22], [23] require a model that follows a specific factorization, which is useful only for particular scenarios and is therefore not practical.
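The LBSP feature mentioned above differs from classic LBP in that a bit is set by thresholding the absolute difference to the center pixel rather than its sign, which makes the code more stable under illumination changes. The following is a simplified 8-neighbour sketch (the original LBSP of [10] uses a 16-point 5x5 pattern; the threshold value here is illustrative).

```python
import numpy as np

def lbsp_descriptor(patch, threshold=30):
    """Simplified 8-neighbour LBSP code for the centre of a 3x3 patch.

    A bit is set when the absolute difference between a neighbour and
    the centre pixel exceeds the similarity threshold.
    """
    assert patch.shape == (3, 3)
    center = int(patch[1, 1])
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2),
               (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (r, c) in enumerate(offsets):
        # Cast to int to avoid uint8 wrap-around in the subtraction.
        if abs(int(patch[r, c]) - center) > threshold:
            code |= 1 << bit
    return code

# Uniform patch -> no bits set; one bright corner -> exactly one bit set.
flat = np.full((3, 3), 50, dtype=np.uint8)
edge = flat.copy()
edge[0, 0] = 200
print(lbsp_descriptor(flat), lbsp_descriptor(edge))  # 0 1
```

A uniform-color region yields the all-zero code regardless of brightness, which is exactly the failure mode noted above: without texture, the descriptor carries no discriminative information.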
Bayesian generative adversarial networks (Bayesian GANs) [24] can represent arbitrarily complex nonlinear mapping functions and have the ability to learn automatically. Compared with traditional neural network algorithms, Bayesian GANs do not require the model to obey any particular factorization [25]; any generator network and discriminator network will work [26]. Consequently, Bayesian GANs need neither repeated sampling nor inference during learning, which avoids the problem of approximate probability computation [27], [28].
Wang et al. [29], [30] presented parallel vision, an extension of the ACP approach [31] into the vision computing field. Parallel vision aims to establish systematic theories and methods for visual perception and understanding of complex scenes. In parallel vision based background subtraction, Bayesian GANs can generate virtual foreground/background segmentation images, which are combined with source images and real foreground/background segmentation images to train vision models; this improves the generalization ability of the vision models. Background subtraction involves many complex scenes, such as shadow interference, illumination changes, poor texture and camera jitter. We use the theory of parallel vision to improve background subtraction results in these complex scenes.
To resist the adverse effects of shadow interference, illumination changes, poor texture and camera jitter in object detection and to improve performance, we propose a background subtraction algorithm based on Bayesian generative adversarial networks and parallel vision theory (BSPVGAN). We first use the median filtering algorithm for background image extraction and then train the network based on Bayesian GANs. Our work uses Bayesian GANs to classify each pixel effectively, thereby addressing sudden and slow illumination changes, non-stationary backgrounds, and ghosts. We adopt deep convolutional neural networks to construct the generator and the discriminator of the Bayesian GAN. Experimental results show that the proposed algorithm outperforms other methods in most cases.
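The median-filtering background extraction step can be sketched as a per-pixel temporal median over a stack of frames; the paper does not specify the window size or variant here, so this is one plausible reading of that step.

```python
import numpy as np

def extract_background(frames):
    """Temporal median over a stack of frames.

    Assuming each pixel shows the true background in more than half of
    the frames, the per-pixel median suppresses transient foreground
    objects and yields a clean background image.
    """
    stack = np.stack(frames).astype(np.float64)
    return np.median(stack, axis=0)

# Background value 10; a foreground object (value 255) covers a pixel
# in only 2 of 5 frames, so the median recovers the background.
frames = [np.full((4, 4), 10.0) for _ in range(5)]
frames[1][1, 1] = 255.0
frames[3][1, 1] = 255.0
bg = extract_background(frames)
print(bg[1, 1])  # 10.0
```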
In short, the contributions of our work are threefold.
- (1) We apply Bayesian GANs to the background subtraction task, achieving strong robustness when the surveillance scene changes.
- (2) We use a deep convolutional generative adversarial network (DCGAN) [32] to build the Bayesian GANs-based background subtraction model and obtain robust results by training the generative adversarial networks.
- (3) The algorithm is evaluated on the CDnet 2014 [33] dataset and outperforms many existing methods, including GMM-Stauffer [8], GMM-Zivkovic [9], LBSP [10], IUTIS [11], MBS [12], FTSG [15], LFGMM [21], LFVBGM [21], ArunVarghese [34], BMOG [13], DeepBS [22], ShareModel [14], SSOBS [23], WeSamBE [35] and Cascade CNN [36].
To better present our algorithm, the rest of this paper is arranged as follows. Section 2 reviews the theory of Bayesian generative adversarial networks. Section 3 presents our algorithm based on Bayesian generative adversarial networks and parallel vision. Section 4 provides experimental results. Section 5 verifies the generalization performance of the proposed algorithm. Finally, Section 6 concludes the paper.
Bayesian generative adversarial networks
It is obvious that deep learning depends on a large amount of labeled data, which has become one of the factors inhibiting its development [37], [38]. For a long time, many scientists have explored using as little labeled data as possible to train vision models, working on a transition from supervised learning to semi-supervised learning and later to unsupervised learning. In general, we do not have big labeled data as it
Proposed algorithm based on Bayesian generative adversarial networks and parallel vision
Background subtraction is a challenging task in complex scenes, such as those with shadow interference, illumination changes, poor texture and camera jitter. We introduce parallel vision theory to address this. Parallel vision is an extension of the ACP approach [31] into the vision computing field. Fig. 1 shows the basic framework and architecture of parallel vision in background subtraction. Parallel vision first constructs virtual foreground/background segmentation images
Experimental evaluation
We evaluate the performance of our algorithm in terms of both accuracy and runtime. All experiments are conducted on a 4-core PC with an NVIDIA GTX 970 GPU, 16 GB of RAM, and Ubuntu 16.
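Accuracy on CDnet is typically reported with pixel-level recall, precision and F-measure computed from the predicted foreground masks; the helper below shows those standard definitions (the example counts are made up for illustration).

```python
def cdnet_metrics(tp, fp, fn):
    """Recall, precision and F-measure as used in CDnet-style
    evaluation, from pixel-level true positives, false positives and
    false negatives of the predicted foreground masks."""
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    f_measure = 2 * precision * recall / (precision + recall)
    return recall, precision, f_measure

# Example: 90 correctly detected foreground pixels, 10 false alarms,
# 10 missed pixels.
r, p, f = cdnet_metrics(tp=90, fp=10, fn=10)
print(round(r, 2), round(p, 2), round(f, 2))  # 0.9 0.9 0.9
```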
We design the experiments on the CDnet datasets [33]. Our algorithm is compared with some state-of-the-art methods including GMM-Stauffer [8], GMM-Zivkovic [9], LBSP [10], IUTIS [11], MBS [12], FTSG [15], LFGMM [21], LFVBGM [21], ArunVarghese [34], BMOG [13], DeepBS [22], ShareModel [14], SSOBS [23],
Verification experiment of generalization performance of proposed algorithm
To verify the generalization performance of our proposed algorithm in other complex background environments, we fix the model trained on the CDnet dataset (i.e., without any extra training or fine-tuning) and test it on the scenebackgroundmodeling.net dataset [57], the UCSD dataset [58], [59] and the SBI2015 dataset [60], [61].
Our first experiment is conducted on the scenebackgroundmodeling.net dataset [57]. The dataset includes 79 video sequences with resolution varying from
Conclusion
A new background subtraction algorithm is proposed in this paper. First, we use the median filtering algorithm for background image extraction. Then, we build the background subtraction model using Bayesian GANs to classify all pixels into foreground and background, and use parallel vision theory to improve the background subtraction results in complex scenes. Experimental results show that the method achieves good change detection performance. More interestingly, our model trained on
Conflict of Interest
None.
Acknowledgments
This work was supported by National Natural Science Foundation of China (61533019, U1811463).
Wenbo Zheng received his bachelor degree in software engineering from Wuhan University of Technology, Wuhan, China, in 2017. He is currently a Ph.D. student in the School of Software Engineering, Xi’an Jiaotong University as well as the State Key Laboratory for Management and Control of Complex Systems, Institute of Automation, Chinese Academy of Sciences. His research interests include computer vision and machine learning.
References (66)
- Static and moving object detection using flux tensor with split Gaussian models, Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2014.
- Interactive deep learning method for segmenting moving objects, Pattern Recognit. Lett., 2017.
- Online learning control using adaptive critic designs with sparse kernel machines, IEEE Trans. Neural Netw. Learn. Syst., 2013.
- Max-margin deep generative models for (semi-)supervised learning, IEEE Trans. Pattern Anal. Mach. Intell., 2017.
- Z.M. Erickson, S. Chernova, C.C. Kemp, Semi-supervised haptic material recognition for robots using generative...
- An end-to-end generative adversarial network for crowd counting under complicated scenes, Proceedings of the 2017 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB), 2017.
- Convolutional neural networks at constrained time cost, Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.
- Foreground segmentation using convolutional neural networks for multiscale feature encoding, Pattern Recognit. Lett., 2018.
- ViBe: a powerful random technique to estimate the background in video sequences, Proceedings of the 2009 IEEE International Conference on Acoustics, Speech and Signal Processing, 2009.
- Semantic background subtraction, Proceedings of the 2017 IEEE International Conference on Image Processing (ICIP), 2017.
- Multilevel Chinese takeaway process and label-based processes for rule induction in the context of automated sports video annotation, IEEE Trans. Cybern.
- A multi-view learning approach to foreground detection for traffic surveillance applications, IEEE Trans. Veh. Technol.
- Visual tracking based on dynamic coupled conditional random field model, IEEE Trans. Intell. Transp. Syst.
- Vehicle license plate recognition based on class-specific ERs and SAE-ELM, Proceedings of the 17th International IEEE Conference on Intelligent Transportation Systems (ITSC).
- Attentive monitoring of multiple video streams driven by a Bayesian foraging strategy, IEEE Trans. Image Process.
- Calibration of an industrial vision system using an ellipsoid, Proceedings of the 2014 IEEE International Conference on Image Processing (ICIP).
- M4CD: a robust change detection method for intelligent visual surveillance, IEEE Access.
- Adaptive background mixture models for real-time tracking, Proceedings of the 1999 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.
- Improved adaptive Gaussian mixture model for background subtraction, Proceedings of the 17th International Conference on Pattern Recognition (ICPR).
- Improving background subtraction using local binary similarity patterns, Proceedings of the IEEE Winter Conference on Applications of Computer Vision.
- Universal multimode background subtraction, IEEE Trans. Image Process.
- Learning sharable models for robust background subtraction, Proceedings of the 2015 IEEE International Conference on Multimedia and Expo (ICME).
- Robust principal component analysis?, J. ACM.
- Moving object detection by detecting contiguous outliers in the low-rank representation, IEEE Trans. Pattern Anal. Mach. Intell.
- Background subtraction via generalized fused lasso foreground modeling, Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
- Background subtraction based on low-rank and structured sparse decomposition, IEEE Trans. Image Process.
- Background subtraction using spatio-temporal group sparsity recovery, IEEE Trans. Circuits Syst. Video Technol.
- Variational Bayesian learning for background subtraction based on local fusion feature, IET Comput. Vis.
- Comparative study of motion detection methods for video surveillance systems, J. Electron. Imaging.
- A deep convolutional neural network for video sequence background subtraction, Pattern Recognit.
- Proceedings of the 31st International Conference on Neural Information Processing Systems.
Kunfeng Wang received his Ph.D. degree in control theory and control engineering from the Graduate University of Chinese Academy of Sciences, Beijing, China, in 2008. From December 2015 to January 2017, he was a Visiting Scholar at the School of Interactive Computing, Georgia Institute of Technology, Atlanta, GA, USA. He is currently an Associate Professor at the State Key Laboratory for Management and Control of Complex Systems, Institute of Automation, Chinese Academy of Sciences. His research interests include intelligent transportation systems, intelligent vision computing, and machine learning.
Fei-Yue Wang received his Ph.D. in computer and systems engineering from Rensselaer Polytechnic Institute, Troy, New York in 1990. He joined the University of Arizona in 1990 and became a Professor and Director of the Robotics and Automation Lab (RAL) and Program in Advanced Research for Complex Systems (PARCS). In 1999, he founded the Intelligent Control and Systems Engineering Center at the Institute of Automation, Chinese Academy of Sciences (CAS), Beijing, China, under the support of the Outstanding Overseas Chinese Talents Program from the State Planning Council and the "100 Talent Program" from CAS, and in 2002 was appointed Director of the Key Lab of Complex Systems and Intelligence Science, CAS. In 2011, he became a State Specially Appointed Expert and Director of the State Key Laboratory for Management and Control of Complex Systems. Dr. Wang's current research focuses on methods and applications for parallel systems, social computing, and knowledge automation. He was the Founding Editor-in-Chief of the International Journal of Intelligent Control and Systems (1995-2000), Founding EiC of IEEE ITS Magazine (2006-2007), EiC of IEEE Intelligent Systems (2009-2012), and EiC of IEEE Transactions on ITS (2009-2016). Currently he is EiC of China's Journal of Command and Control. Since 1997, he has served as General or Program Chair of more than 20 IEEE, INFORMS, ACM, and ASME conferences. He was President of the IEEE ITS Society (2005-2007), the Chinese Association for Science and Technology (CAST, USA) in 2005, and the American Zhu Kezhen Education Foundation (2007-2008), and Vice President of the ACM China Council (2010-2011). Since 2008, he has been Vice President and Secretary General of the Chinese Association of Automation. Dr. Wang is an elected Fellow of IEEE, INCOSE, IFAC, ASME, and AAAS.
In 2007, he received the 2nd Class National Prize in Natural Sciences of China and awarded the Outstanding Scientist by ACM for his work in intelligent control and social computing. He received IEEE ITS Outstanding Application and Research Awards in 2009 and 2011, and IEEE SMC Norbert Wiener Award in 2014.