An improved binocular stereo matching algorithm based on AANet

Multimedia Tools and Applications

Abstract

Stereo matching is a key step in establishing stereo vision: the parallax (disparity) information it produces directly determines the three-dimensional information recovered for an object. End-to-end stereo matching algorithms can derive parallax maps directly from the designed network. However, such networks are structurally complex, and their large number of parameters consumes considerable memory. This burdens the device and lengthens the time required to obtain the parallax map, lowering the network's running efficiency. This paper therefore proposes AEDNet (adaptive end-to-end depth network for stereo matching), an improved stereo matching algorithm based on AANet (adaptive aggregation network for efficient stereo matching). In the feature extraction module, the network structure is simplified by limiting the convolution kernel size, yielding features with a low level of abstraction. In cost aggregation, the intra-scale aggregation module achieves adaptive cost aggregation through deformable convolution, and the inter-scale aggregation module uses the traditional cross-scale aggregation method to compensate, to a certain extent, for the missing global information. The network's performance is verified on the KITTI dataset. The results show that, even with the simplified network, the algorithm still completes stereo matching efficiently and accurately and obtains a better disparity map. These results provide preconditions for accurate three-dimensional reconstruction.
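To make the link between parallax and three-dimensional information concrete, the following is a minimal sketch and not the AEDNet implementation. It assumes a rectified stereo pair, a simple absolute-difference matching cost with winner-takes-all selection in place of the learned cost aggregation described above, and the standard pinhole relation Z = f · B / d. The function names, the scanline values, and the focal length and baseline are illustrative assumptions, not values from the paper.

```python
from typing import List, Optional


def match_scanline(left: List[float], right: List[float], max_disp: int) -> List[int]:
    """Winner-takes-all disparity for one rectified scanline.

    Uses a plain absolute-difference cost (an assumption for illustration;
    the paper instead aggregates learned matching costs).
    """
    disparities = []
    for x in range(len(left)):
        best_d, best_cost = 0, float("inf")
        # A left pixel at column x is compared with the right pixel at x - d.
        for d in range(min(max_disp, x) + 1):
            cost = abs(left[x] - right[x - d])
            if cost < best_cost:
                best_d, best_cost = d, cost
        disparities.append(best_d)
    return disparities


def disparity_to_depth(disparity: float, focal_px: float, baseline_m: float) -> Optional[float]:
    """Depth in metres from disparity in pixels: Z = f * B / d."""
    if disparity <= 0:
        return None  # zero disparity corresponds to a point at infinity
    return focal_px * baseline_m / disparity


# Hypothetical scanlines: the left row is the right row shifted by 2 pixels.
right_row = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
left_row = [0.0, 0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

disp = match_scanline(left_row, right_row, max_disp=3)
# Illustrative camera parameters (assumed, not from the paper).
depth = disparity_to_depth(disp[4], focal_px=720.0, baseline_m=0.54)
```

Because depth is inversely proportional to disparity, a small disparity error at distant points produces a large depth error, which is why the accuracy of the disparity map matters for the later three-dimensional reconstruction.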

Data availability

The data that support the findings of this study are available from the corresponding author, upon reasonable request.

Acknowledgements

This research was financially supported by the Major Scientific Research Project for Universities of Guangdong Province (2020ZDZX3058), the Science and Technology Projects of Zhuhai in the Field of Social Development (2220004000066), and the Key Laboratory of Intelligent Multimedia Technology (201762005).

Author information

Corresponding author

Correspondence to Ge Yang.

Ethics declarations

Conflicts of interest

We declare that we have no financial or personal relationships with other people or organizations that may have inappropriately influenced our work. There is no professional or other personal interest of any nature or kind in any product, service and/or company that could be construed as influencing the position presented in, or the review of, this manuscript.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Yang, G., Liao, Y. An improved binocular stereo matching algorithm based on AANet. Multimed Tools Appl 82, 40987–41003 (2023). https://doi.org/10.1007/s11042-023-15183-6

  • DOI: https://doi.org/10.1007/s11042-023-15183-6
