Abstract
Accurate disparity estimation from rectified stereo image pairs is essential for many computer vision tasks. Current deep learning-based stereo networks generally construct a single-scale cost volume, which is then regularized and used to regress the disparity. However, these methods do not exploit multi-scale context information, which limits disparity prediction in ill-posed regions. In this paper, we propose a novel stereo network named HPA-Net, which represents context information efficiently and lowers error rates in ill-posed regions. First, we propose a hierarchical aggregation module that fuses context information from multi-scale cost volumes into an integrated cost volume. Then, we feed the integrated cost volume to the proposed parallel aggregation module, which applies several 3D dilated convolutions in parallel to capture global and local context cues for disparity regression. Experimental results show that HPA-Net achieves state-of-the-art stereo matching performance on the KITTI datasets.
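The two ideas named in the abstract, parallel dilated filtering of a cost volume and disparity regression, can be illustrated with a toy one-pixel analogue. This is only a sketch under stated assumptions, not the HPA-Net implementation: the cost volume is reduced to a single 1D cost vector over candidate disparities, the learned 3D dilated convolutions are replaced by fixed 3-tap averaging filters, the dilation rates (1, 2, 4) are hypothetical, and regression uses the soft-argmin formulation common in deep stereo networks, which the abstract does not confirm for HPA-Net. All function names are illustrative.

```python
import math


def dilated_average_1d(costs, dilation):
    """Fixed 3-tap averaging filter with the given dilation and zero padding.

    Stand-in (assumption) for one learned dilated convolution branch.
    """
    n = len(costs)
    out = []
    for i in range(n):
        acc = 0.0
        for k in (-1, 0, 1):
            j = i + k * dilation
            if 0 <= j < n:  # taps outside the vector contribute zero
                acc += costs[j]
        out.append(acc / 3.0)
    return out


def parallel_aggregate(costs, dilations=(1, 2, 4)):
    """Run all dilated branches on the same input and average their outputs.

    Small dilations see local context, large ones see wider context.
    The dilation rates here are hypothetical.
    """
    branches = [dilated_average_1d(costs, d) for d in dilations]
    return [sum(vals) / len(branches) for vals in zip(*branches)]


def soft_argmin(costs):
    """Expected disparity under softmax(-cost) over disparities 0..D-1."""
    m = min(costs)  # subtract the minimum for numerical stability
    weights = [math.exp(-(c - m)) for c in costs]
    total = sum(weights)
    return sum(d * w for d, w in enumerate(weights)) / total


if __name__ == "__main__":
    # Hypothetical matching costs for one pixel over 9 candidate disparities.
    raw_costs = [9.0, 8.0, 6.0, 2.0, 0.5, 2.5, 6.0, 8.5, 9.0]
    fused = parallel_aggregate(raw_costs)
    print("regressed disparity:", soft_argmin(fused))
```

Running the branches on the same input and merging their outputs is what distinguishes a parallel arrangement from stacking the dilated filters one after another; in a real network the merge and the filter weights would be learned rather than fixed averages.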
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Chen, W., Peng, J., Zhu, Z., Zhao, Y. (2021). HPA-Net: Hierarchical and Parallel Aggregation Network for Context Learning in Stereo Matching. In: Tsapatsoulis, N., Panayides, A., Theocharides, T., Lanitis, A., Pattichis, C., Vento, M. (eds) Computer Analysis of Images and Patterns. CAIP 2021. Lecture Notes in Computer Science, vol 13052. Springer, Cham. https://doi.org/10.1007/978-3-030-89128-2_10
DOI: https://doi.org/10.1007/978-3-030-89128-2_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-89127-5
Online ISBN: 978-3-030-89128-2
eBook Packages: Computer Science, Computer Science (R0)