Abstract
The stereo matching task has been greatly improved by convolutional neural networks, especially fully-convolutional networks. However, existing deep learning methods tend to overfit to specific domains. In this paper, focusing on the domain adaptation problem of disparity estimation, we present a novel training strategy for synthetic-realistic collaborative learning. First, we design a compact model consisting of a shallow feature extractor, a correlation feature aggregator, and a disparity encoder-decoder. The model enables end-to-end disparity regression with fast speed and high accuracy. To perform collaborative learning, we then propose two distinct training schemes, guided label distillation and semi-supervised regularization, both of which alleviate the lack of disparity labels in realistic datasets. Finally, we evaluate the trained models on datasets from various domains. Comparative results demonstrate the capability of the designed model and the effectiveness of the collaborative training strategy.
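The abstract does not detail the correlation feature aggregator, but in FlowNet/DispNet-style disparity networks this component is commonly realized as a 1-D correlation (cost) volume over candidate disparities. The sketch below is a minimal NumPy illustration of that idea, not the authors' implementation; the function name, shapes, and channel normalization are assumptions.

```python
import numpy as np

def correlation_volume(feat_l, feat_r, max_disp):
    """Build a correlation cost volume between left/right feature maps.

    feat_l, feat_r: feature maps of shape (C, H, W).
    Returns an array of shape (max_disp, H, W): slice d holds, for each
    left-image pixel, the channel-wise dot product with the right-image
    feature d pixels to the left (zero where the shift runs off-image).
    """
    C, H, W = feat_l.shape
    vol = np.zeros((max_disp, H, W), dtype=feat_l.dtype)
    for d in range(max_disp):
        if d == 0:
            vol[d] = (feat_l * feat_r).sum(axis=0)
        else:
            # left pixel x matches right pixel x - d
            vol[d, :, d:] = (feat_l[:, :, d:] * feat_r[:, :, :-d]).sum(axis=0)
    return vol / C  # normalize by channel count, as in FlowNet-style correlation
```

A subsequent encoder-decoder would then regress a disparity map from this volume (concatenated with image features), which is consistent with the end-to-end regression pipeline the abstract describes.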
Acknowledgment
This work was supported in part by the National Key R&D Program of China under Grant No. 2017YFB1302200 and by Joint Fund of NORINCO Group of China for Advanced Research under Grant No. 6141B010318.
Cite this paper
Yang, G., Deng, Z., Lu, H., Li, Z. (2019). SRC-Disp: Synthetic-Realistic Collaborative Disparity Learning for Stereo Matching. In: Jawahar, C., Li, H., Mori, G., Schindler, K. (eds) Computer Vision – ACCV 2018. ACCV 2018. Lecture Notes in Computer Science(), vol 11365. Springer, Cham. https://doi.org/10.1007/978-3-030-20873-8_45
Print ISBN: 978-3-030-20872-1
Online ISBN: 978-3-030-20873-8