Interactive stereo image segmentation via adaptive prior selection

Multimedia Tools and Applications

Abstract

Interactive stereo image segmentation, i.e., cutting out objects from stereo pairs with limited user assistance, is an important research topic in computer vision. Given a pair of images, users mark a few foreground and background pixels, from which prior models are built for labeling the unknown pixels. Color priors alone may not help when the marked foreground and background have similar colors. However, integrating multiple types of priors, e.g., color and disparity for stereo pairs, is not trivial, because different image pairs, and even different pixels in the same image, may require different proportions of the priors. Moreover, disparities of natural images are too noisy to be used directly. This paper presents a method that adaptively determines the proportion of the color and disparity priors for each pixel. Specifically, the segmentation problem is formulated as a Markov random field (MRF). The MRF energy function combines cues from the two types of priors with neighborhood smoothness and stereo correspondence constraints. The weights of the color and disparity priors at each pixel are treated as variables and optimized together with the pixel's label (foreground or background). To cope with disparity noise, the weight of the disparity prior is controlled by a confidence value learned from data. The energy function is optimized using multi-label graph cut. Experimental results show that our method performs well.
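
The idea of treating the prior weights as variables alongside the label can be illustrated with a small sketch. This is not the paper's energy function, which is not given in the abstract: it assumes a hypothetical per-pixel data term that blends a color cost and a disparity cost, caps the disparity weight by a confidence map, and picks the best (weight, label) pair for each pixel independently. The smoothness and stereo correspondence terms, and the graph-cut optimization, are deliberately left out. The function name `select_priors`, the candidate weight set, and the blending rule are all assumptions made for illustration.

```python
import numpy as np

def select_priors(color_cost, disp_cost, confidence, candidates=(0.0, 0.5, 1.0)):
    """Pick a label and a disparity weight per pixel (data term only).

    color_cost, disp_cost: (H, W, 2) costs for labels 0 (background), 1 (foreground).
    confidence: (H, W) values in [0, 1]; reliability of the disparity at each pixel.
    candidates: hypothetical discrete set of disparity-weight levels.
    """
    cand = np.asarray(candidates, dtype=float)
    # Effective disparity weight per pixel and candidate, capped by confidence.
    d = confidence[..., None] * cand                      # (H, W, K)
    d4 = d[..., None]                                     # (H, W, K, 1)
    # Blended data cost for every (candidate, label) pair; weights sum to 1.
    cost = (1.0 - d4) * color_cost[:, :, None, :] + d4 * disp_cost[:, :, None, :]
    h, w, k, n_labels = cost.shape
    # Joint minimum over the extended label space (weight level, label).
    best = cost.reshape(h, w, k * n_labels).argmin(axis=2)
    k_idx, labels = np.divmod(best, n_labels)
    disp_weight = np.take_along_axis(d, k_idx[..., None], axis=2)[..., 0]
    return labels, disp_weight
```

Extending the label space to (weight level, label) pairs is one way such a problem can be posed as multi-label optimization. In a full MRF the pairwise terms would couple neighboring pixels, so the per-pixel minimum taken here only shows how the data term behaves.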

Acknowledgments

This research is supported by the National Natural Science Foundation of China (61771026, 61379096, 61671451, 61502490), the Scientific Research Project of Beijing Educational Committee (KM201510005015), the Open Project Program of the National Laboratory of Pattern Recognition (NLPR), and the Beijing Municipal Natural Science Foundation (4152006). We thank Dr. Xing Su and Dr. Tong Li for helping to proofread the paper.

Author information

Corresponding authors

Correspondence to Shibiao Xu or Xiaopeng Zhang.

About this article

Cite this article

Ma, W., Qin, Y., Xu, S. et al. Interactive stereo image segmentation via adaptive prior selection. Multimed Tools Appl 77, 28709–28724 (2018). https://doi.org/10.1007/s11042-018-6067-5

  • DOI: https://doi.org/10.1007/s11042-018-6067-5
