E-GrabCut: an economic method of iterative video object extraction

Dong, Le; Feng, Ning; Mao, Mengdie; He, Ling; Wang, Jingjing

doi:10.1007/s11704-016-5558-7

Research Article
Published: 29 April 2017

Volume 11, pages 649–660, (2017)

Frontiers of Computer Science

Abstract

Efficient, interactive foreground/background segmentation in video is of great practical importance in video editing. This paper proposes an interactive and unsupervised video object segmentation algorithm, named E-GrabCut, that aims to deliver both segmentation quality and time efficiency, the two properties most in demand in the field. The proposed algorithm has three features. First, we develop a powerful, non-iterative version of the optimization process for each frame. Second, additional user interaction in the first frame is used to improve the Gaussian Mixture Model (GMM). Third, we develop a robust algorithm for segmenting the following frames by reusing the previous GMM. Extensive experiments demonstrate that our method outperforms the state-of-the-art video segmentation algorithm in its combination of time efficiency and segmentation quality.
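The idea of reusing a colour model across frames can be illustrated with a deliberately small sketch. This is not the authors' implementation, whose details are not given in the abstract. It assumes grayscale intensities, a single Gaussian per class in place of a full GMM, and a pure per-pixel likelihood test with no graph-cut smoothness term. All function names are hypothetical. The models are fitted once from a user-labelled first frame and then applied to the next frame without refitting.

```python
import math

def fit_gaussian(values):
    """Return (mean, variance) of a list of intensities; variance is floored."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, max(var, 1e-6)

def log_likelihood(value, model):
    """Log density of a 1-D Gaussian model (mean, variance) at value."""
    mean, var = model
    return -0.5 * (math.log(2 * math.pi * var) + (value - mean) ** 2 / var)

def fit_models(frame, mask):
    """Fit foreground/background models from a labelled frame (1 = fg, 0 = bg)."""
    pairs = [(p, m) for row, mrow in zip(frame, mask) for p, m in zip(row, mrow)]
    fg = [p for p, m in pairs if m == 1]
    bg = [p for p, m in pairs if m == 0]
    return fit_gaussian(fg), fit_gaussian(bg)

def segment_with_models(frame, fg_model, bg_model):
    """Label each pixel by whichever stored model explains it better (no refit)."""
    return [
        [1 if log_likelihood(p, fg_model) > log_likelihood(p, bg_model) else 0
         for p in row]
        for row in frame
    ]

# First frame: the user-provided mask plays the role of the interaction step.
frame0 = [[10, 12, 200], [11, 205, 210], [9, 13, 198]]
mask0 = [[0, 0, 1], [0, 1, 1], [0, 0, 1]]
fg_model, bg_model = fit_models(frame0, mask0)

# Following frame: the object has moved, but the stored models are reused as-is.
frame1 = [[11, 201, 207], [10, 12, 203], [12, 9, 199]]
mask1 = segment_with_models(frame1, fg_model, bg_model)
print(mask1)
```

Skipping the refit on later frames is what saves time in this toy version. A real system would still need a spatial smoothness term, and some way to update the models when the appearance of the object or background drifts.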

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China (Grant No. 61370149), in part by the Fundamental Research Funds for the Central Universities (ZYGX2013J083), and in part by the Scientific Research Foundation for the Returned Overseas Chinese Scholars, State Education Ministry.

Author information

Authors and Affiliations

School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China
Le Dong, Ning Feng, Mengdie Mao, Ling He & Jingjing Wang

Corresponding author

Correspondence to Le Dong.

Additional information

Le Dong received her PhD degree in electronic engineering and computer science from Queen Mary University of London, UK in 2009. She is now an associate professor at the University of Electronic Science and Technology of China, China. Her research interests include computer vision, big data analysis, and biologically inspired systems.

Ning Feng is a PhD student at the University of Electronic Science and Technology of China, China. His major is computer science and technology, and his research interests are computer vision and image segmentation.

Mengdie Mao is an undergraduate at the University of Electronic Science and Technology of China, China. Her major is computer science and technology, and her research interests focus on image retrieval and deep learning.

Ling He is an undergraduate at the University of Electronic Science and Technology of China, China. Her major is computer science and technology, and her main research interest is image classification.

Jingjing Wang received his master's degree from the University of Electronic Science and Technology of China, China in 2013. He is now working at Beijing Qihu Technology Company, China. His research interests are computer vision and image segmentation.

Electronic supplementary material

Supplementary material, approximately 674 KB.

About this article

Cite this article

Dong, L., Feng, N., Mao, M. et al. E-GrabCut: an economic method of iterative video object extraction. Front. Comput. Sci. 11, 649–660 (2017). https://doi.org/10.1007/s11704-016-5558-7

Received: 26 December 2015
Accepted: 17 June 2016
Published: 29 April 2017
Issue Date: August 2017
DOI: https://doi.org/10.1007/s11704-016-5558-7
