Abstract
Deep learning methods have shown very promising results for regressing dense disparity maps directly from stereo image pairs. However, apart from a few public datasets such as KITTI, the ground-truth disparity needed for supervised training is rarely available. In this paper, we propose an unsupervised stereo matching approach with a novel occlusion-aware reconstruction loss. Combined with a smoothness loss and a left-right consistency loss that enforce the smoothness and correctness of the disparity, the deep neural network can be trained well without any ground-truth disparity data. To verify the effectiveness of the proposed method, we train and test our approach without ground-truth disparity data. Competitive results are achieved on the public datasets (KITTI Stereo 2012, KITTI Stereo 2015, Cityscapes) and on our self-collected driving dataset, which contains more diverse driving scenarios than the public datasets.
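The abstract names three loss terms but does not give their formulas. The following is a minimal NumPy sketch of how such terms are commonly combined in unsupervised stereo training; it is not the paper's formulation. The function names, the occlusion test based on a left-right disparity check, the threshold `occ_thresh`, and the weights `w_smooth` and `w_lr` are all assumptions made for illustration. Images are assumed to be rectified, single-channel float arrays.

```python
import numpy as np

def warp_horizontal(src, disp):
    """Sample src at x - disp along each row, with linear interpolation."""
    h, w = disp.shape
    xs = np.arange(w, dtype=np.float64)[None, :] - disp  # source x-coordinates
    xs = np.clip(xs, 0.0, w - 1.0)                       # clamp to the image
    x0 = np.floor(xs).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    frac = xs - x0
    rows = np.arange(h)[:, None]
    return (1.0 - frac) * src[rows, x0] + frac * src[rows, x1]

def stereo_losses(left, right, disp_left, disp_right,
                  occ_thresh=1.0, w_smooth=0.1, w_lr=1.0):
    """Illustrative loss terms; thresholds and weights are hypothetical."""
    w = left.shape[1]

    # Reconstruct the left view by sampling the right view at x - d_L(x).
    recon_left = warp_horizontal(right, disp_left)

    # Left-right check: d_L(x) should agree with d_R(x - d_L(x)).
    disp_right_warped = warp_horizontal(disp_right, disp_left)
    lr_diff = np.abs(disp_left - disp_right_warped)

    # Assumed occlusion test: a pixel counts as visible if it passes the
    # left-right check and its match falls inside the right image.
    in_view = (np.arange(w)[None, :] - disp_left) >= 0
    visible = (lr_diff < occ_thresh) & in_view

    # Occlusion-aware reconstruction: photometric L1 over visible pixels only.
    photo = np.abs(left - recon_left)
    recon = (photo * visible).sum() / max(int(visible.sum()), 1)

    # Edge-aware smoothness: penalise disparity gradients less at image edges.
    disp_grad = np.abs(np.diff(disp_left, axis=1))
    img_grad = np.abs(np.diff(left, axis=1))
    smooth = float(np.mean(disp_grad * np.exp(-img_grad)))

    lr = float(lr_diff.mean())
    total = recon + w_smooth * smooth + w_lr * lr
    return {"reconstruction": float(recon), "smoothness": smooth,
            "lr_consistency": lr, "total": float(total)}
```

In a real training setup these terms would be written in a differentiable framework so that gradients flow back to the network that predicts `disp_left` and `disp_right`; the NumPy version only illustrates what each term measures.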
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Luo, N., Yang, C., Sun, W., Song, B. (2018). Unsupervised Stereo Matching with Occlusion-Aware Loss. In: Geng, X., Kang, BH. (eds) PRICAI 2018: Trends in Artificial Intelligence. PRICAI 2018. Lecture Notes in Computer Science, vol 11012. Springer, Cham. https://doi.org/10.1007/978-3-319-97304-3_57
DOI: https://doi.org/10.1007/978-3-319-97304-3_57
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-97303-6
Online ISBN: 978-3-319-97304-3
eBook Packages: Computer Science, Computer Science (R0)