Efficient frame-sequential label propagation for video object segmentation

Multimedia Tools and Applications

Abstract

In this work, we present an approach for segmenting objects in videos of complex scenes. It propagates an initial object label, usually given by the user, through the entire video in a frame-sequential manner. The proposed method makes several contributions that render the propagation much more robust and accurate than other methods. First, a novel supervised motion estimation algorithm is applied between each pair of neighboring frames; the estimated motion is used to warp a predicted shape model, which helps to segment regions of similar color around the object boundary. Second, unlike previous methods with a fixed modeling range, we design a novel range-adaptive appearance model to handle the difficult problem of occlusion. Last, the paper presents a framework based on the GraphCut algorithm that obtains the final object label by combining cues from both appearance and motion. In the experiments, the proposed approach is compared qualitatively and quantitatively with several recent methods, showing that it achieves state-of-the-art results on multiple videos from benchmark data sets.
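The frame-sequential idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes grayscale frames with values in [0, 1], takes the motion field as a given input rather than estimating it, uses a plain gray-level histogram instead of the range-adaptive appearance model, and replaces the GraphCut optimization with a per-pixel threshold on a weighted sum of the two cues (no smoothness term). All function names and the weight `w_shape` are hypothetical.

```python
import numpy as np


def warp_mask(mask, flow):
    """Backward-warp a float mask with a flow field of shape (H, W, 2) holding (dy, dx)."""
    h, w = mask.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # Each target pixel looks up the source pixel it came from (nearest neighbour).
    src_y = np.clip(np.rint(ys - flow[..., 0]).astype(int), 0, h - 1)
    src_x = np.clip(np.rint(xs - flow[..., 1]).astype(int), 0, w - 1)
    return mask[src_y, src_x]


def appearance_prob(frame, prev_frame, prev_mask, bins=8):
    """Per-pixel foreground probability from histograms built on the previous frame."""
    q_prev = np.minimum((prev_frame * bins).astype(int), bins - 1)
    q_cur = np.minimum((frame * bins).astype(int), bins - 1)
    # Add-one smoothing keeps empty bins from producing divisions by zero.
    fg = np.bincount(q_prev[prev_mask].ravel(), minlength=bins) + 1.0
    bg = np.bincount(q_prev[~prev_mask].ravel(), minlength=bins) + 1.0
    fg /= fg.sum()
    bg /= bg.sum()
    return (fg / (fg + bg))[q_cur]


def propagate(frames, flows, init_mask, w_shape=0.5):
    """Propagate init_mask through frames; flows[t-1] maps frame t-1 to frame t."""
    masks = [init_mask.astype(bool)]
    for t in range(1, len(frames)):
        shape = warp_mask(masks[-1].astype(float), flows[t - 1])  # motion cue
        appear = appearance_prob(frames[t], frames[t - 1], masks[-1])  # color cue
        score = w_shape * shape + (1.0 - w_shape) * appear
        masks.append(score > 0.5)  # simplified stand-in for the graph-cut labeling
    return masks
```

In this simplified form each frame's label depends only on the previous frame's label, which is what makes the propagation frame-sequential. A graph-cut formulation would additionally add a pairwise term that encourages neighboring pixels to share a label.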

Acknowledgments

We thank the anonymous reviewers for their valuable comments. This work was supported by the National Natural Science Foundation of China (No. 61602252), the Natural Science Foundation of Jiangsu Province of China (Nos. BK20160964, BK20160902, BK20160967), the Priority Academic Program Development (PAPD) of Jiangsu Higher Education Institutions, and the Startup Foundation for Introducing Talent of Nanjing University of Information Science and Technology (NUIST) (No. 2243141601013).

Corresponding author

Correspondence to Yadang Chen.

Electronic supplementary material

The online version of this article includes electronic supplementary material (WMV, 18.2 MB).

Cite this article

Chen, Y., Hao, C., Wu, W. et al. Efficient frame-sequential label propagation for video object segmentation. Multimed Tools Appl 77, 6117–6133 (2018). https://doi.org/10.1007/s11042-017-4520-5

  • DOI: https://doi.org/10.1007/s11042-017-4520-5
