SAda-Net: A Self-supervised Adaptive Stereo Estimation CNN For Remote Sensing Image Data

  • Conference paper
Pattern Recognition (ICPR 2024)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 15310))

Abstract

Stereo estimation has advanced considerably in recent years with the introduction of deep learning. However, the traditional supervised approach to deep learning requires accurate and plentiful ground-truth data, which is expensive to create and unavailable in many situations. This is especially true for remote sensing applications, where there is an abundance of available data without proper ground truth. To tackle this problem, we propose a self-supervised CNN with self-improving adaptive abilities. In the first iteration, the created disparity map is inaccurate and noisy. Leveraging the left-right consistency check, we obtain a sparse but more accurate disparity map, which is used as an initial pseudo ground truth. This pseudo ground truth is then adapted and updated after every epoch in the training step of the network. We use the sum of inconsistent points to track the network convergence. The code for our method will be made available after acceptance at https://github.com/thedodo/SAda-Net
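The left-right consistency check and the count of inconsistent points mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `left_right_consistency`, the 1-pixel `threshold`, the use of NaN to mark rejected pixels, and the convention that a left pixel at column x with disparity d matches the right pixel at column x - d are all assumptions made for illustration.

```python
import numpy as np


def left_right_consistency(disp_left, disp_right, threshold=1.0):
    """Hypothetical helper: filter a left disparity map with a left-right check.

    Assumed convention: disp_left[y, x] = d means that left pixel (y, x)
    matches right pixel (y, x - d). Returns the sparse disparity map
    (NaN where the check fails) and the number of inconsistent pixels.
    """
    h, w = disp_left.shape
    xs = np.tile(np.arange(w), (h, 1))           # column index of every pixel
    ys = np.tile(np.arange(h)[:, None], (1, w))  # row index of every pixel

    # Column in the right image that each left pixel points to.
    x_right = np.rint(xs - disp_left).astype(int)
    in_bounds = (x_right >= 0) & (x_right < w)
    x_clipped = np.clip(x_right, 0, w - 1)

    # Disparity stored in the right map at the matched location.
    disp_right_at_match = disp_right[ys, x_clipped]

    # A pixel is consistent if both maps agree within the threshold.
    agree = np.abs(disp_left - disp_right_at_match) <= threshold
    consistent = in_bounds & agree

    sparse = np.where(consistent, disp_left, np.nan)
    inconsistent_count = int((~consistent).sum())
    return sparse, inconsistent_count


# Toy example: constant disparity of 2 px with one corrupted right-map entry.
disp_l = np.full((4, 8), 2.0)
disp_r = np.full((4, 8), 2.0)
disp_r[1, 3] = 5.0  # left pixel (1, 5) maps here and now disagrees
sparse_disp, n_bad = left_right_consistency(disp_l, disp_r)
```

In a training loop of the kind the abstract describes, a map like `sparse_disp` would play the role of the pseudo ground truth and a value like `n_bad` could be logged after each epoch as a convergence indicator. How the paper adapts and updates the pseudo ground truth between epochs is not specified in this excerpt, so that step is deliberately left out.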

Author information

Corresponding author

Correspondence to Dominik Hirner.

Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG

Cite this paper

Hirner, D., Fraundorfer, F. (2025). SAda-Net: A Self-supervised Adaptive Stereo Estimation CNN For Remote Sensing Image Data. In: Antonacopoulos, A., Chaudhuri, S., Chellappa, R., Liu, CL., Bhattacharya, S., Pal, U. (eds) Pattern Recognition. ICPR 2024. Lecture Notes in Computer Science, vol 15310. Springer, Cham. https://doi.org/10.1007/978-3-031-78192-6_11

  • DOI: https://doi.org/10.1007/978-3-031-78192-6_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-78191-9

  • Online ISBN: 978-3-031-78192-6

  • eBook Packages: Computer Science, Computer Science (R0)
