
SRC-Disp: Synthetic-Realistic Collaborative Disparity Learning for Stereo Matching

  • Conference paper
  • In: Computer Vision – ACCV 2018 (ACCV 2018)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 11365)

Abstract

The stereo matching task has been greatly improved by convolutional neural networks, especially fully-convolutional networks. However, existing deep learning methods tend to overfit to specific domains. In this paper, focusing on the domain adaptation problem of disparity estimation, we present a novel training strategy for synthetic-realistic collaborative learning. We first design a compact model that consists of a shallow feature extractor, a correlation feature aggregator, and a disparity encoder-decoder; it enables end-to-end disparity regression with fast speed and high accuracy. To perform collaborative learning, we then propose two distinct training schemes, guided label distillation and semi-supervised regularization, both of which alleviate the lack of disparity labels in realistic datasets. Finally, we evaluate the trained models on datasets from various domains. Comparative results demonstrate the capability of the designed model and the effectiveness of the collaborative training strategy.
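The abstract only outlines the model and the two training schemes at a high level. As a rough illustration of the semi-supervised regularization idea, the minimal sketch below (assuming a PyTorch-style implementation; names such as collaborative_step and warp_right_to_left are hypothetical and not the authors' code) combines a supervised regression loss on labeled synthetic pairs with an unsupervised photometric reconstruction term on unlabeled realistic pairs. Guided label distillation would instead replace the unsupervised term with a loss against proxy disparities produced by a conventional stereo method. The weighting and loss choices are illustrative assumptions, not the paper's exact formulation.

```python
# Hypothetical sketch of one synthetic-realistic collaborative training step.
# All names and shapes are illustrative assumptions; the paper's actual
# implementation may differ (e.g. it may be written in Caffe).
import torch
import torch.nn.functional as F


def warp_right_to_left(right, disp):
    """Backward-warp the right image to the left view using the predicted
    left-view disparity (a standard differentiable warping, in the spirit of
    spatial transformer networks). `right` is (B, 3, H, W), `disp` is (B, 1, H, W)."""
    b, _, h, w = right.shape
    xs = torch.linspace(0, w - 1, w, device=right.device).view(1, 1, w).expand(b, h, w)
    ys = torch.linspace(0, h - 1, h, device=right.device).view(1, h, 1).expand(b, h, w)
    xs = xs - disp.squeeze(1)  # shift sampling positions by disparity
    # Normalize coordinates to [-1, 1] for grid_sample.
    grid = torch.stack((2 * xs / (w - 1) - 1, 2 * ys / (h - 1) - 1), dim=-1)
    return F.grid_sample(right, grid, align_corners=True)


def collaborative_step(model, syn_batch, real_batch, lam=0.1):
    """Semi-supervised regularization: supervised regression on a labeled
    synthetic batch plus an unsupervised photometric term on an unlabeled
    realistic batch. Returns the combined loss to backpropagate."""
    # Supervised term on synthetic data with ground-truth disparity.
    left_s, right_s, gt_disp = syn_batch
    pred_s = model(left_s, right_s)
    loss_sup = F.smooth_l1_loss(pred_s, gt_disp)

    # Unsupervised term on realistic data: warp the right image with the
    # predicted disparity and penalize the reconstruction error.
    left_r, right_r = real_batch
    pred_r = model(left_r, right_r)
    recon = warp_right_to_left(right_r, pred_r)
    loss_unsup = (recon - left_r).abs().mean()

    return loss_sup + lam * loss_unsup
```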



Acknowledgment

This work was supported in part by the National Key R&D Program of China under Grant No. 2017YFB1302200 and by the Joint Fund of NORINCO Group of China for Advanced Research under Grant No. 6141B010318.

Author information

Correspondence to Zhidong Deng.


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper


Cite this paper

Yang, G., Deng, Z., Lu, H., Li, Z. (2019). SRC-Disp: Synthetic-Realistic Collaborative Disparity Learning for Stereo Matching. In: Jawahar, C., Li, H., Mori, G., Schindler, K. (eds.) Computer Vision – ACCV 2018. ACCV 2018. Lecture Notes in Computer Science, vol. 11365. Springer, Cham. https://doi.org/10.1007/978-3-030-20873-8_45


  • DOI: https://doi.org/10.1007/978-3-030-20873-8_45

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-20872-1

  • Online ISBN: 978-3-030-20873-8

  • eBook Packages: Computer Science (R0)
