Abstract
Foreground segmentation is a fundamental problem in many artificial intelligence and computer vision applications. However, robust, high-precision foreground segmentation remains challenging in complex scenes. Many existing algorithms process the input in RGB space only, where segmentation performance degrades under challenges such as shadows, color camouflage, illumination changes, out-of-range camera sensors, and bootstrapping. RGBD cameras are attractive active visual sensors because they provide depth information along with the RGB channels of the input images. To address these challenges, we therefore propose a foreground segmentation algorithm based on conditional generative adversarial networks that uses both RGB and depth data. The goal of our model is to perform robust, accurate foreground segmentation across a variety of complex scenes. To this end, we train our GAN-based CNN model on RGBD input conditioned on ground-truth information in an adversarial fashion. During training, the model learns foreground segmentation by minimizing a cross-entropy loss together with a Euclidean distance loss, while the discriminator learns to distinguish real from fake samples. At test time, the RGBD input is fed to the trained generator network, which performs robust foreground segmentation. Our method is evaluated on two RGBD benchmark datasets, SBM-RGBD and MULTIVISION Kinect. Extensive experimental evaluations and comparisons with eleven existing methods confirm its superior performance.
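The training objective described above (an adversarial cross-entropy term combined with a Euclidean distance term between the generated and ground-truth foreground masks) can be sketched roughly as follows. This is a minimal illustrative sketch, not the paper's implementation: the loss weight `lam`, the use of a squared-Euclidean pixel distance, and the function names are all assumptions.

```python
import numpy as np

def bce(p, target, eps=1e-12):
    """Binary cross-entropy between discriminator scores p and labels."""
    p = np.clip(p, eps, 1.0 - eps)
    return -np.mean(target * np.log(p) + (1 - target) * np.log(1 - p))

def discriminator_loss(d_real, d_fake):
    """D learns to score (RGBD, ground-truth mask) pairs as real (1)
    and (RGBD, generated mask) pairs as fake (0)."""
    return bce(d_real, np.ones_like(d_real)) + bce(d_fake, np.zeros_like(d_fake))

def generator_loss(d_fake, pred_mask, gt_mask, lam=100.0):
    """G tries to fool D (adversarial BCE term) while keeping the
    generated mask close to the ground truth (squared-Euclidean
    pixel distance).  The weight `lam` is an assumed hyperparameter,
    not a value taken from the paper."""
    adv = bce(d_fake, np.ones_like(d_fake))       # push D(G(x)) toward 1
    dist = np.mean((pred_mask - gt_mask) ** 2)    # per-pixel distance to GT
    return adv + lam * dist
```

In a conditional-GAN setup of this kind, the distance term keeps the generated mask pixel-wise faithful to the ground truth, while the adversarial term sharpens the mask boundaries that a pure reconstruction loss tends to blur.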
Acknowledgment
This study was supported by the BK21 FOUR project (AI-driven Convergence Software Education Research Program) funded by the Ministry of Education, School of Computer Science and Engineering, Kyungpook National University, Korea (4199990214394).
Copyright information
© 2021 Springer Nature Switzerland AG
Cite this paper
Sultana, M., Bouwmans, T., Giraldo, J.H., Jung, S.K. (2021). Robust Foreground Segmentation in RGBD Data from Complex Scenes Using Adversarial Networks. In: Jeong, H., Sumi, K. (eds) Frontiers of Computer Vision. IW-FCV 2021. Communications in Computer and Information Science, vol 1405. Springer, Cham. https://doi.org/10.1007/978-3-030-81638-4_1
Print ISBN: 978-3-030-81637-7
Online ISBN: 978-3-030-81638-4