Abstract
Background subtraction from color and depth data is a fundamental task for video surveillance applications that use data acquired by RGBD sensors. We present a method that uses a self-organizing neural background model, previously adopted for RGB videos, to model the color and depth backgrounds separately. The resulting color and depth detection masks are combined both to guide the selective model update procedure and to produce the final result. Extensive experimental results and comparisons with several state-of-the-art methods on a publicly available dataset show that exploiting depth information yields much higher performance than using color alone, and that the method accurately handles color and depth background maintenance challenges.
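The pipeline outlined in the abstract (two separate background models, mask combination, selective update) can be illustrated with a minimal sketch. This is not the authors' implementation: a per-pixel running average stands in for the self-organizing neural model, and the combination rule (trust depth where it is valid, fall back to color elsewhere) is an assumption made only for illustration. The names `RunningBackground`, `combine_masks`, and `process_frame`, the convention that a depth of 0 means "no measurement", and all thresholds are hypothetical. Frames are flat lists of per-pixel values to keep the example dependency-free.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RunningBackground:
    """Per-pixel running-average background.

    A simple stand-in (assumption), NOT the self-organizing neural model.
    """
    values: List[float]   # current background estimate, one value per pixel
    threshold: float      # distance above which a pixel counts as foreground
    rate: float = 0.05    # learning rate of the running average

    def detect(self, frame: List[float]) -> List[bool]:
        # True marks foreground: the sample is far from the modeled background.
        return [abs(x - b) > self.threshold for x, b in zip(frame, self.values)]

    def update(self, frame: List[float], allow: List[bool]) -> None:
        # Selective update: only pixels flagged in `allow` are blended in.
        self.values = [
            b + self.rate * (x - b) if ok else b
            for x, b, ok in zip(frame, self.values, allow)
        ]


def combine_masks(color_fg: List[bool], depth_fg: List[bool],
                  depth_valid: List[bool]) -> List[bool]:
    # Hypothetical rule: use the depth decision where depth was measured,
    # otherwise fall back to the color decision.
    return [d if v else c for c, d, v in zip(color_fg, depth_fg, depth_valid)]


def process_frame(color_bg: RunningBackground, depth_bg: RunningBackground,
                  color: List[float], depth: List[float]) -> List[bool]:
    depth_valid = [z > 0 for z in depth]       # assumption: 0 means no depth
    color_fg = color_bg.detect(color)
    depth_fg = depth_bg.detect(depth)
    final_fg = combine_masks(color_fg, depth_fg, depth_valid)

    # The combined mask guides the update: foreground pixels never enter
    # either model, and invalid depth samples never enter the depth model.
    background = [not f for f in final_fg]
    color_bg.update(color, background)
    depth_bg.update(depth, [b and v for b, v in zip(background, depth_valid)])
    return final_fg
```

In this sketch a pixel whose color matches the background but whose depth differs is still detected, while a color change at unchanged depth is treated as background and absorbed into the color model. Both behaviors follow from the assumed combination rule, not from anything stated in the abstract.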
Acknowledgements
L. Maddalena acknowledges the GNCS (Gruppo Nazionale di Calcolo Scientifico) and the INTEROMICS Flagship Project funded by MIUR, Italy. A. Petrosino wishes to acknowledge Project VIRTUALOG Horizon 2020-PON 2014/2020.
Cite this article
Maddalena, L., Petrosino, A. Self-organizing background subtraction using color and depth data. Multimed Tools Appl 78, 11927–11948 (2019). https://doi.org/10.1007/s11042-018-6741-7
DOI: https://doi.org/10.1007/s11042-018-6741-7