Unsupervised Segmentation of Stereoscopic Video Objects: Constrained Segmentation Fusion Versus Greedy Active Contours

Ntalianis, Klimis S.; Doulamis, Anastasios D.; Doulamis, Nikolaos D.; Mastorakis, Nikos E.; Drigas, Athanasios S.

doi:10.1007/s11265-014-0921-0

Published: 09 July 2014

Volume 81, pages 153–181 (2015)

Journal of Signal Processing Systems

Klimis S. Ntalianis¹,
Anastasios D. Doulamis²,
Nikolaos D. Doulamis³,
Nikos E. Mastorakis⁴ &
Athanasios S. Drigas⁵

Abstract

In this paper, two efficient unsupervised video object segmentation approaches are proposed and thoroughly compared. Both methods exploit depth information estimated from stereoscopic pairs. Depth is a more efficient semantic descriptor of visual content, since an object is usually located on a single depth plane. However, depth information fails to represent object contours accurately, mainly because of erroneous disparity estimation and occlusion. For this reason, the first approach projects color segments onto depth information to address the limitations of both depth and color segmentation: color segmentation usually over-partitions an object into several regions, while depth fails to represent object contours precisely. Depth information is obtained by estimating an occlusion-compensated disparity field, from which a depth map is generated. Color segmentation, in turn, is performed with a modified version of the Multiresolution Recursive Shortest Spanning Tree (M-RSST) segmentation algorithm. In the first approach, “Constrained Fusion of Color Segments” (CFCS), a color segments map is created by applying the M-RSST to one of the stereoscopic channels, and video objects are extracted by fusing color segments according to depth similarity criteria. The second method also uses the depth segments map. In particular, an active contour is automatically initialized on the boundary of each depth segment, which usually differs from the boundary of the video object. Initialization is performed by a fitness function that considers different color areas and preserves the shapes of the depth segments’ boundaries. To speed up the process, each point of the active contour is associated with an “attractive edge” point, and a greedy approach is used so that the active contour converges to its final position. Several experiments on real-life stereoscopic sequences are performed, and extensive comparisons in terms of speed and accuracy indicate the promising performance of both methods.
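The fusion step of the first approach can be illustrated with a minimal sketch. The abstract does not state the depth similarity criteria, so the code below assumes a simple majority-overlap rule: each color segment is assigned to the depth segment that covers most of its pixels. The function name `fuse_color_segments`, the label-map representation, and the tie-breaking rule are hypothetical choices made for illustration and are not taken from the paper.

```python
from collections import Counter, defaultdict


def fuse_color_segments(color_labels, depth_labels):
    """Relabel each color segment with the depth segment it overlaps most.

    color_labels, depth_labels: equally sized 2-D lists of integer labels.
    Returns a 2-D list in which every pixel carries the depth label chosen
    for its color segment, so object contours follow color boundaries.
    """
    if len(color_labels) != len(depth_labels):
        raise ValueError("label maps must have the same number of rows")

    # For every color segment, count how many of its pixels fall in each
    # depth segment.
    overlap = defaultdict(Counter)
    for color_row, depth_row in zip(color_labels, depth_labels):
        if len(color_row) != len(depth_row):
            raise ValueError("label maps must have the same row lengths")
        for c, d in zip(color_row, depth_row):
            overlap[c][d] += 1

    # Majority vote (an assumed criterion, not the paper's); ties go to the
    # smallest depth label so the result is deterministic.
    assignment = {
        c: min(counts, key=lambda d: (-counts[d], d))
        for c, counts in overlap.items()
    }

    return [[assignment[c] for c in row] for row in color_labels]


if __name__ == "__main__":
    # Toy 4x6 frame: color segments 0 and 1 form the background,
    # color segments 2 and 3 form the object (over-partitioned by color).
    color = [
        [0, 0, 0, 2, 2, 2],
        [0, 0, 0, 2, 2, 2],
        [1, 1, 1, 3, 3, 3],
        [1, 1, 1, 3, 3, 3],
    ]
    # Depth segments 10 and 20: the border is off by one column in each
    # half, mimicking inaccurate contours from disparity errors.
    depth = [
        [10, 10, 20, 20, 20, 20],
        [10, 10, 20, 20, 20, 20],
        [10, 10, 10, 10, 20, 20],
        [10, 10, 10, 10, 20, 20],
    ]
    for row in fuse_color_segments(color, depth):
        print(row)
```

In this toy frame the fused map places the object boundary on the color boundary (between columns 2 and 3) in every row, even though the depth boundary is misplaced in each half. This is the complementary behavior the abstract describes: depth groups the over-partitioned color segments, while color supplies the contour.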

Acknowledgments

The authors wish to thank Dr. Chas Girdwood, project manager at the ITC (Winchester), for providing the 3D video sequence “Eye to Eye”, which was produced in the framework of the ACTS MIRAGE project. The authors also thank Dr. Siegmund Pastoor of the HHI (Berlin) for providing the video sequences of the DISTIMA project.

Author information

Authors and Affiliations

Technological Educational Institute of Athens, Department of Marketing (Chair of Computing), Athens, Greece
Klimis S. Ntalianis
Department of Production Engineering and Management, Technical University of Crete, Chania, Greece
Anastasios D. Doulamis
School of Electrical & Computer Engineering, National Technical University of Athens, Athens, Greece
Nikolaos D. Doulamis
Industrial Engineering Department, Technical University of Sofia, Sofia, Bulgaria & Hellenic Naval Academy, Athens, Greece
Nikos E. Mastorakis
Net Media Lab, NCSR Demokritos, Athens, Greece
Athanasios S. Drigas

Corresponding author

Correspondence to Klimis S. Ntalianis.

About this article

Cite this article

Ntalianis, K.S., Doulamis, A.D., Doulamis, N.D. et al. Unsupervised Segmentation of Stereoscopic Video Objects: Constrained Segmentation Fusion Versus Greedy Active Contours. J Sign Process Syst 81, 153–181 (2015). https://doi.org/10.1007/s11265-014-0921-0

Received: 24 May 2013
Revised: 28 February 2014
Accepted: 20 June 2014
Published: 09 July 2014
Issue Date: November 2015
DOI: https://doi.org/10.1007/s11265-014-0921-0
