SAda-Net: A Self-supervised Adaptive Stereo Estimation CNN For Remote Sensing Image Data

Hirner, Dominik; Fraundorfer, Friedrich

doi:10.1007/978-3-031-78192-6_11

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 15310)

Included in the following conference series: International Conference on Pattern Recognition

Abstract

Stereo estimation has advanced considerably in recent years with the introduction of deep learning. However, the traditional supervised approach to deep learning requires accurate and plentiful ground-truth data, which is expensive to create and unavailable in many situations. This is especially true for remote sensing applications, where there is an abundance of data without proper ground truth. To tackle this problem, we propose a self-supervised CNN with self-improving adaptive abilities. In the first iteration, the created disparity map is inaccurate and noisy. Leveraging the left-right consistency check, we obtain a sparse but more accurate disparity map, which is used as an initial pseudo ground truth. This pseudo ground truth is then adapted and updated after every epoch in the training step of the network. We use the sum of inconsistent points to track the network convergence. The code for our method will be made available after acceptance at https://github.com/thedodo/SAda-Net
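
The left-right consistency check and the count of inconsistent points mentioned in the abstract can be illustrated with a short sketch. This is not the authors' implementation. The function name `left_right_consistency`, the 1-pixel threshold, the sign convention (a left pixel at column x with disparity d matches column x - d in the right image), and the use of `None` to mark rejected pixels are all assumptions made for illustration. Pure Python is used for clarity; a real pipeline would likely operate on tensors.

```python
from typing import List, Optional, Tuple

Disparity = List[List[float]]
SparseDisparity = List[List[Optional[float]]]


def left_right_consistency(
    disp_left: Disparity,
    disp_right: Disparity,
    threshold: float = 1.0,  # assumed tolerance in pixels
) -> Tuple[SparseDisparity, int]:
    """Keep left disparities confirmed by the right map.

    Returns the sparse map (None marks rejected pixels) and the
    number of inconsistent points.
    """
    height, width = len(disp_left), len(disp_left[0])
    sparse: SparseDisparity = [[None] * width for _ in range(height)]
    inconsistent = 0
    for y in range(height):
        for x in range(width):
            d = disp_left[y][x]
            # Assumed convention: the match in the right image is at x - d.
            xr = int(round(x - d))
            if 0 <= xr < width and abs(d - disp_right[y][xr]) <= threshold:
                sparse[y][x] = d  # consistent: usable as pseudo ground truth
            else:
                inconsistent += 1  # out of bounds or disagreeing disparities
    return sparse, inconsistent


# Toy 2x4 disparity maps (hypothetical values).
left = [[1.0, 1.0, 1.0, 1.0], [0.0, 0.0, 2.0, 3.0]]
right = [[1.0, 1.0, 1.0, 1.0], [0.0, 0.0, 0.0, 0.0]]
pseudo_gt, n_inconsistent = left_right_consistency(left, right)
print(pseudo_gt)       # sparse map with None at rejected pixels
print(n_inconsistent)  # count that could be tracked across epochs
```

In a training loop of the kind the abstract describes, the sparse map would serve as the pseudo ground truth for the next epoch, and the inconsistent-point count would be logged after each epoch as a convergence signal.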

Author information

Authors and Affiliations

Institute of Computer Graphics and Vision, Graz University of Technology, Graz, Austria
Dominik Hirner & Friedrich Fraundorfer
Remote Sensing Technology Institute (IMF), German Aerospace Center (DLR), Cologne, Germany
Friedrich Fraundorfer

Corresponding author

Correspondence to Dominik Hirner.

Editor information

Editors and Affiliations

University of Salford, Salford, Lancashire, UK
Apostolos Antonacopoulos
IIT Bombay, Powai, Mumbai, Maharashtra, India
Subhasis Chaudhuri
Johns Hopkins University, Baltimore, MD, USA
Rama Chellappa
Chinese Academy of Sciences, Beijing, China
Cheng-Lin Liu
IIT Kharagpur, Kharagpur, West Bengal, India
Saumik Bhattacharya
Indian Statistical Institute, Kolkata, West Bengal, India
Umapada Pal

About this paper

Cite this paper

Hirner, D., Fraundorfer, F. (2025). SAda-Net: A Self-supervised Adaptive Stereo Estimation CNN For Remote Sensing Image Data. In: Antonacopoulos, A., Chaudhuri, S., Chellappa, R., Liu, CL., Bhattacharya, S., Pal, U. (eds) Pattern Recognition. ICPR 2024. Lecture Notes in Computer Science, vol 15310. Springer, Cham. https://doi.org/10.1007/978-3-031-78192-6_11

DOI: https://doi.org/10.1007/978-3-031-78192-6_11
Published: 04 December 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-78191-9
Online ISBN: 978-3-031-78192-6
eBook Packages: Computer Science, Computer Science (R0)