HPA-Net: Hierarchical and Parallel Aggregation Network for Context Learning in Stereo Matching

Computer Analysis of Images and Patterns (CAIP 2021)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 13052)

Abstract

Accurate disparity estimation from rectified stereo image pairs is essential for many computer vision tasks. Current deep learning-based stereo networks generally construct a single-scale cost volume, which they regularize to regress the disparity. However, these methods do not exploit multi-scale context information, which limits disparity prediction performance in ill-posed regions. In this paper, we propose a novel stereo network named HPA-Net, which provides an efficient representation of context information and achieves lower error rates in ill-posed regions. First, we propose a hierarchical aggregation module that fuses context information from multi-scale cost volumes into an integrated cost volume. Then, we feed the integrated cost volume to the proposed parallel aggregation module, which applies several 3D dilated convolutions simultaneously to capture global and local context cues for disparity regression. Experimental results show that the proposed HPA-Net achieves state-of-the-art stereo matching performance on the KITTI datasets.
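
The idea of running several dilated convolutions in parallel and then regressing a disparity can be illustrated with a small sketch. This is not the authors' implementation: it works in 1D along the disparity axis of a single pixel rather than with 3D convolutions over a full cost volume, the function names are hypothetical, the branches share fixed uniform weights and are fused by summation (an assumption), and the soft-argmin regression is assumed because it is common in regression-based stereo networks, while the abstract only says "disparity regression".

```python
import math

def effective_kernel(kernel_size, dilation):
    """Extent covered along one axis by a kernel with the given dilation."""
    return dilation * (kernel_size - 1) + 1

def dilated_conv1d(signal, weights, dilation):
    """Zero-padded, same-length 1D dilated cross-correlation."""
    half = (len(weights) // 2) * dilation
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(weights):
            idx = i - half + j * dilation  # taps are spaced `dilation` apart
            if 0 <= idx < len(signal):
                acc += w * signal[idx]
        out.append(acc)
    return out

def parallel_aggregate(cost, weights, dilations):
    """Apply every dilated branch to the same input and sum the results.

    Summation is an assumed fusion rule chosen for illustration only.
    """
    branches = [dilated_conv1d(cost, weights, d) for d in dilations]
    return [sum(values) for values in zip(*branches)]

def soft_argmin(cost):
    """Expected disparity under softmax(-cost); lower cost means a better match."""
    lowest = min(cost)  # shift for numerical stability
    exps = [math.exp(-(c - lowest)) for c in cost]
    total = sum(exps)
    return sum(d * e / total for d, e in enumerate(exps))

if __name__ == "__main__":
    # Small dilations see local detail, large ones see wider context.
    for d in (1, 2, 4):
        print("dilation", d, "covers", effective_kernel(3, d), "disparity bins")
    cost = [5.0, 4.0, 1.0, 4.0, 5.0]
    fused = parallel_aggregate(cost, [1.0, 1.0, 1.0], [1, 2])
    print("regressed disparity:", soft_argmin(fused))
```

With a kernel size of 3, dilations 1, 2 and 4 cover 3, 5 and 9 bins respectively, which is why parallel branches with different dilation rates can capture local and wider context at the same time without adding weights per branch.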

Author information

Corresponding author

Correspondence to Yong Zhao.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Chen, W., Peng, J., Zhu, Z., Zhao, Y. (2021). HPA-Net: Hierarchical and Parallel Aggregation Network for Context Learning in Stereo Matching. In: Tsapatsoulis, N., Panayides, A., Theocharides, T., Lanitis, A., Pattichis, C., Vento, M. (eds) Computer Analysis of Images and Patterns. CAIP 2021. Lecture Notes in Computer Science, vol 13052. Springer, Cham. https://doi.org/10.1007/978-3-030-89128-2_10

  • DOI: https://doi.org/10.1007/978-3-030-89128-2_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-89127-5

  • Online ISBN: 978-3-030-89128-2

  • eBook Packages: Computer Science, Computer Science (R0)
