Abstract
Stereo image segmentation is a key technology in stereo image editing, whose importance has grown with the popularity of stereoscopic 3D media. Most previous methods segment both views by relying primarily on per-pixel disparities, so segmentation quality is closely tied to the accuracy of those disparities. A mechanism for removing disparity errors is therefore in high demand, yet to date no method can produce accurate disparity maps. In this paper, we propose a novel convolutional neural network (CNN)-based framework that automatically propagates the segmentation result from one view to the other. A key problem in accurate stereo image segmentation is that occluded regions are missing; the CNN architecture is proposed to address this problem and improve stereo segmentation performance. To cope with the inevitable inaccuracies in disparities computed from a stereo image pair, we use coherent disparity propagation, which propagates the segmentation result through those pixels with coherent disparities. The pixels obtained by coherent disparity propagation and the high-confidence pixels of the object probability map produced by the CNN are then combined into an initial set of reliable pixels, which is used to perform segmentation based on an energy minimization framework. Comprehensive evaluations and comparisons on the Middlebury and Adobe benchmark datasets demonstrate that the proposed method produces high-quality results and is robust to various types of input.
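The seed-generation step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define "coherent", so the code assumes a left-right consistency check between the two disparity maps, assumes that a left pixel (y, x) with disparity d matches the right pixel (y, x - d), and lets propagated labels override CNN seeds. The function name `propagate_seeds`, the thresholds `tol`, `hi` and `lo`, and the label encoding are all hypothetical.

```python
import numpy as np

def propagate_seeds(mask_left, disp_left, disp_right, prob_right,
                    tol=1.0, hi=0.9, lo=0.1):
    """Build a seed map for the right view.

    Labels: 1 = object, 0 = background, -1 = unknown.
    All thresholds are illustrative assumptions, not values from the paper.
    """
    h, w = mask_left.shape
    seeds = np.full((h, w), -1, dtype=np.int8)

    # High-confidence pixels of the (assumed given) object probability map.
    seeds[prob_right >= hi] = 1
    seeds[prob_right <= lo] = 0

    # Assumed correspondence: left (y, x) maps to right (y, x - d).
    ys, xs = np.mgrid[0:h, 0:w]
    xr = np.rint(xs - disp_left).astype(int)
    inside = (xr >= 0) & (xr < w)
    xr_safe = np.clip(xr, 0, w - 1)

    # Assumed coherence test: the two disparity maps agree within tol.
    coherent = inside & (np.abs(disp_left - disp_right[ys, xr_safe]) <= tol)

    # Copy left labels to the right view through coherent pixels only.
    # If several left pixels hit the same right pixel, the last write wins.
    seeds[ys[coherent], xr[coherent]] = mask_left[ys[coherent], xs[coherent]].astype(np.int8)
    return seeds
```

Pixels left at -1 are the ones an energy minimization step would have to label; in this sketch they include occluded pixels and pixels whose disparities fail the consistency check.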
Acknowledgements
This work was supported by the Zhejiang Provincial Natural Science Foundation of China (Grant Nos. LY18F020022, LY14F020032 and LQ17F020002), the Open Project Program of the State Key Lab of CAD&CG, Zhejiang University (Grant No. A1805), and the National Natural Science Foundation of China (Grant No. 61702376). We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Quadro K5200 GPU used for this research.
Cite this article
Li, X., Huang, H., Zhao, H. et al. Learning a convolutional neural network for propagation-based stereo image segmentation. Vis Comput 36, 39–52 (2020). https://doi.org/10.1007/s00371-018-1582-y
DOI: https://doi.org/10.1007/s00371-018-1582-y