E-GrabCut: an economic method of iterative video object extraction

  • Research Article
Frontiers of Computer Science

Abstract

Efficient interactive foreground/background segmentation of video is of great practical importance in video editing. This paper proposes an interactive, unsupervised video object segmentation algorithm named E-GrabCut, which aims to achieve both the segmentation quality and the time efficiency demanded in the field. The proposed algorithm has three features. First, we have developed a powerful, non-iterative version of the optimization process for each frame. Second, more user interaction in the first frame is used to improve the Gaussian Mixture Model (GMM). Third, a robust algorithm for segmenting the following frames has been developed by reusing the previous GMM. Extensive experiments demonstrate that our method outperforms the state-of-the-art video segmentation algorithm in terms of the combination of time efficiency and segmentation quality.
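
The third feature, reusing the previous frame's color model to segment the next frame, can be illustrated with a deliberately simplified sketch. This is not the E-GrabCut implementation: it assumes grayscale pixels stored in flat lists, replaces each GMM with a single Gaussian per class, and omits the graph-cut optimization entirely. All function names (fit_gaussian, fit_models, propagate, and so on) are hypothetical and chosen only for this illustration.

```python
import math

def fit_gaussian(values):
    """Fit a mean and variance to a list of intensities (variance is floored)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    return mean, max(var, 1e-6)

def log_likelihood(value, model):
    """Log-density of a value under a 1-D Gaussian given as (mean, variance)."""
    mean, var = model
    return -0.5 * math.log(2 * math.pi * var) - (value - mean) ** 2 / (2 * var)

def fit_models(frame, labels):
    """Fit foreground (label 1) and background (label 0) models for one frame."""
    fg = [p for p, l in zip(frame, labels) if l == 1]
    bg = [p for p, l in zip(frame, labels) if l == 0]
    return fit_gaussian(fg), fit_gaussian(bg)

def segment_with_models(frame, fg_model, bg_model):
    """Label each pixel with the class whose model explains it better."""
    return [
        1 if log_likelihood(p, fg_model) > log_likelihood(p, bg_model) else 0
        for p in frame
    ]

def propagate(frames, first_labels):
    """Segment a sequence, reusing the models from the previous frame.

    Assumes the first frame's labels (standing in for user interaction)
    contain both foreground and background pixels.
    """
    fg_model, bg_model = fit_models(frames[0], first_labels)
    results = [list(first_labels)]
    for frame in frames[1:]:
        labels = segment_with_models(frame, fg_model, bg_model)
        # Refit only when both classes are present, so a degenerate
        # frame does not destroy the models carried forward.
        if 0 < sum(labels) < len(labels):
            fg_model, bg_model = fit_models(frame, labels)
        results.append(labels)
    return results

frames = [[10, 12, 200, 205, 11], [11, 198, 202, 13, 9]]
segmentation = propagate(frames, [0, 0, 1, 1, 0])
```

In this toy input the bright pixels move one position to the left between the two frames, and the models fitted on the first frame are enough to label the second frame without any new user input. A real system would use multi-component color GMMs and a spatial smoothness term, which this sketch leaves out.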

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China (Grant No. 61370149), in part by the Fundamental Research Funds for the Central Universities (ZYGX2013J083), and in part by the Scientific Research Foundation for the Returned Overseas Chinese Scholars, State Education Ministry.

Author information

Corresponding author

Correspondence to Le Dong.

Additional information

Le Dong received her PhD degree in electronic engineering and computer science from Queen Mary University of London, UK in 2009. She is now an associate professor at the University of Electronic Science and Technology of China, China. Her research interests include computer vision, big data analysis, and biologically inspired systems.

Ning Feng is a PhD student at the University of Electronic Science and Technology of China, China. His major is computer science and technology, and his research interests are computer vision and image segmentation.

Mengdie Mao is an undergraduate at the University of Electronic Science and Technology of China, China. Her major is computer science and technology, and her research interests focus on image retrieval and deep learning.

Ling He is an undergraduate at the University of Electronic Science and Technology of China, China. Her major is computer science and technology, and her research interest is mainly in image classification.

Jingjing Wang received his master's degree from the University of Electronic Science and Technology of China, China in 2013. He is now working at Beijing Qihu Technology Company, China. His research interests are computer vision and image segmentation.

About this article

Cite this article

Dong, L., Feng, N., Mao, M. et al. E-GrabCut: an economic method of iterative video object extraction. Front. Comput. Sci. 11, 649–660 (2017). https://doi.org/10.1007/s11704-016-5558-7

  • DOI: https://doi.org/10.1007/s11704-016-5558-7
