Learning a convolutional neural network for propagation-based stereo image segmentation

  • Original Article
The Visual Computer

Abstract

Stereo image segmentation is a key technology in stereo image editing, and its importance has grown with the popularity of stereoscopic 3D media. Most previous methods segment both views by relying primarily on per-pixel disparities, so segmentation quality is closely tied to the accuracy of those disparities. A mechanism for removing disparity errors is therefore in high demand, yet to date no method can produce accurate disparity maps. In this paper, we propose a novel convolutional neural network (CNN)-based framework that automatically propagates the segmentation result from one view to the other. The key obstacle to accurate stereo image segmentation is the missing information in occluded regions, and the proposed CNN architecture is designed to address this problem and improve stereo segmentation performance. To cope with the inevitable inaccuracies in disparities computed from a stereo image pair, we use coherent disparity propagation, which propagates the segmentation result through those pixels with coherent disparities. The pixels obtained by coherent disparity propagation, together with the high-confidence pixels of the object probability map produced by the CNN, form the initial set of reliable pixels for a segmentation based on energy minimization. Comprehensive evaluations and comparisons on the Middlebury and Adobe benchmark datasets demonstrate the effectiveness of the proposed method in terms of result quality and robustness to various types of input.
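As an illustration of the propagation idea described in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes that "coherent" can be approximated by a left-right disparity consistency check on a rectified pair, and it uses hypothetical names (`propagate_labels`, `build_seeds`) and hypothetical thresholds (`tol`, `hi`, `lo`) that do not appear in the paper.

```python
import numpy as np

def propagate_labels(left_mask, disp_left, disp_right, tol=1.0):
    """Warp a left-view binary mask to the right view, keeping only pixels
    whose left and right disparities agree (assumed coherence test)."""
    h, w = left_mask.shape
    right_labels = np.full((h, w), -1, dtype=np.int8)  # -1 marks unknown pixels
    ys, xs = np.mgrid[0:h, 0:w]
    # In a rectified pair, left pixel (y, x) matches right pixel (y, x - d).
    xr = np.rint(xs - disp_left).astype(int)
    inside = (xr >= 0) & (xr < w)
    xr_safe = np.clip(xr, 0, w - 1)
    # Coherent: the right-view disparity at the target agrees with the left one.
    coherent = inside & (np.abs(disp_right[ys, xr_safe] - disp_left) <= tol)
    right_labels[ys[coherent], xr[coherent]] = left_mask[coherent]
    return right_labels

def build_seeds(propagated, prob, hi=0.9, lo=0.1):
    """Fill unknown pixels from a high-confidence object probability map.
    The result (1 = object, 0 = background, -1 = undecided) could seed an
    energy-minimization segmentation such as a graph cut."""
    seeds = propagated.copy()
    unknown = seeds == -1
    seeds[unknown & (prob >= hi)] = 1
    seeds[unknown & (prob <= lo)] = 0
    return seeds

# Toy example: constant disparity of one pixel, object in columns 2 and 3.
mask = np.zeros((4, 6), dtype=np.uint8)
mask[:, 2:4] = 1
disp = np.ones((4, 6))
propagated = propagate_labels(mask, disp, disp)
seeds = build_seeds(propagated, np.full((4, 6), 0.95))
```

Pixels left at -1 after both steps are the ones a later optimization stage would have to decide, which in this sketch includes occluded regions that no coherent left-view pixel maps to.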

Acknowledgements

This work was supported by the Zhejiang Provincial Natural Science Foundation of China (Grant Nos. LY18F020022, LY14F020032 and LQ17F020002), the Open Project Program of the State Key Lab of CAD&CG (Grant No. A1805), Zhejiang University and the National Natural Science Foundation of China (Grant No. 61702376). We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Quadro K5200 GPU used for this research.

Author information

Corresponding author

Correspondence to Xujie Li.

About this article

Cite this article

Li, X., Huang, H., Zhao, H. et al. Learning a convolutional neural network for propagation-based stereo image segmentation. Vis Comput 36, 39–52 (2020). https://doi.org/10.1007/s00371-018-1582-y

  • DOI: https://doi.org/10.1007/s00371-018-1582-y
