Abstract
Image segmentation is one of the most important topics in computer vision. Many segmentation approaches have been proposed, and interactive methods based on energy minimization, such as GrabCut, have shown successful results. Fully automating the segmentation process remains very difficult, however, and virtually all interactive methods require a considerable amount of user interaction. We believe that the amount of interaction required can be reduced if users are given additional information that guides them effectively. In this paper we therefore propose an efficient foreground extraction algorithm that uses depth information from RGB-D sensors such as Microsoft Kinect and offers users guidance during foreground extraction. Our approach can be applied as a pre-processing step for interactive, energy-minimization-based segmentation approaches: it segments the foreground from images and provides hints that reduce the interaction required from users. The method exploits the characteristics of depth information captured by RGB-D sensors and describes them using information from the structure tensor. We show experimentally that the proposed method separates foreground from background sufficiently well for real-world images.
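The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a depth map given as a 2-D list of distances in metres (at least 2 x 2 pixels), computes the largest eigenvalue of a box-averaged structure tensor of the depth gradients, and derives a coarse hint map of the kind that could seed an interactive segmenter. The function names (`gradients`, `structure_tensor_max_eig`, `hint_map`), the labels `F`/`B`/`U`, and the parameters `near_limit` and `edge_thr` are hypothetical and chosen for illustration.

```python
import math


def gradients(depth):
    """Depth gradients: central differences inside, one-sided at the borders."""
    h, w = len(depth), len(depth[0])
    gx = [[0.0] * w for _ in range(h)]
    gy = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            x0, x1 = max(x - 1, 0), min(x + 1, w - 1)
            y0, y1 = max(y - 1, 0), min(y + 1, h - 1)
            gx[y][x] = (depth[y][x1] - depth[y][x0]) / (x1 - x0)
            gy[y][x] = (depth[y1][x] - depth[y0][x]) / (y1 - y0)
    return gx, gy


def structure_tensor_max_eig(depth, radius=1):
    """Largest eigenvalue of the box-averaged 2x2 structure tensor per pixel."""
    gx, gy = gradients(depth)
    h, w = len(depth), len(depth[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            a = b = c = 0.0  # tensor entries: sum gx*gx, sum gx*gy, sum gy*gy
            n = 0
            for yy in range(max(y - radius, 0), min(y + radius, h - 1) + 1):
                for xx in range(max(x - radius, 0), min(x + radius, w - 1) + 1):
                    a += gx[yy][xx] * gx[yy][xx]
                    b += gx[yy][xx] * gy[yy][xx]
                    c += gy[yy][xx] * gy[yy][xx]
                    n += 1
            a, b, c = a / n, b / n, c / n
            # Closed-form largest eigenvalue of the symmetric matrix [[a, b], [b, c]].
            out[y][x] = 0.5 * (a + c) + math.sqrt((0.5 * (a - c)) ** 2 + b * b)
    return out


def hint_map(depth, near_limit, edge_thr):
    """Hypothetical labelling: 'U' near depth discontinuities, otherwise
    'F' (closer than near_limit) or 'B' (at or beyond near_limit)."""
    lam = structure_tensor_max_eig(depth)
    return [
        ["U" if lam[y][x] > edge_thr else ("F" if d < near_limit else "B")
         for x, d in enumerate(row)]
        for y, row in enumerate(depth)
    ]


# Synthetic scene: an object at 1 m (left half) in front of a wall at 3 m.
depth = [[1.0] * 4 + [3.0] * 4 for _ in range(6)]
for row in hint_map(depth, near_limit=2.0, edge_thr=0.1):
    print("".join(row))
```

In this sketch the flat regions on either side of the depth step receive confident labels, while the band around the discontinuity is left undecided. That band is where an energy-minimization method such as GrabCut, or the user, would still need to resolve the boundary.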
Acknowledgments
This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Science, ICT and Future Planning (2013R1A1A2064233 and 2011-0013776), and by the IT R&D program of MKE & KEIT [10041610, The development of the recognition technology for user identity, behavior and location that has a performance approaching recognition rates of 99% on 30 people by using perception sensor network in the real environment].
Cite this article
Lee, SW., Seo, YH. & Yang, H.S. Efficient foreground extraction using RGB-D imaging. Multimed Tools Appl 75, 4969–4980 (2016). https://doi.org/10.1007/s11042-013-1789-x
DOI: https://doi.org/10.1007/s11042-013-1789-x