DOI: 10.1145/3577530.3577546

Monocular depth estimation using synthetic data with domain-separated feature alignment

Published: 30 March 2023

Abstract

Depth estimation is a research hotspot in computer vision, and the resulting depth information is of great value to applications such as autonomous driving, 3D reconstruction, and object tracking. Although supervised depth estimation based on deep learning achieves high prediction accuracy, large amounts of labeled depth data are difficult to obtain. Self-supervised monocular depth estimation methods, which require no labeled data, have therefore become the mainstream of research. Self-supervised learning exploits the geometric constraints between views of a scene, reducing the need for labels, but dynamic objects, changing shooting angles, and limited visibility weaken the self-supervisory signal. Using the accurate depth labels of synthetic datasets to assist self-supervised training on real datasets can improve accuracy, yet most methods ignore the distribution difference between synthetic and real data, which degrades the estimation results. To address this problem, this paper proposes a Domain-separated Monocular Depth Estimation (DsMDE) algorithm based on the domain separation network: an orthogonality loss separates the shared (public) and private features of each domain, and the maximum mean discrepancy then aligns the shared features to reduce the gap between the synthetic and real domains. Experimental results show that the proposed DsMDE method improves depth estimation quality and achieves higher accuracy than mainstream algorithms.
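
To make the two alignment objectives concrete, the sketch below shows, in PyTorch, how an orthogonality loss between shared and private features and a maximum mean discrepancy (MMD) term between the shared features of the two domains can be computed. This is a minimal illustration assuming per-image feature vectors; the tensor shapes, Gaussian-kernel bandwidth, and loss weights are placeholder assumptions, not the authors' implementation.

import torch


def orthogonality_loss(shared: torch.Tensor, private: torch.Tensor) -> torch.Tensor:
    # Soft orthogonality constraint for one domain: both inputs are
    # (batch, dim) features; the squared Frobenius norm of their
    # cross-correlation is driven toward zero so the private encoder
    # keeps only domain-specific content.
    shared = shared - shared.mean(dim=0, keepdim=True)
    private = private - private.mean(dim=0, keepdim=True)
    correlation = shared.t() @ private          # (dim, dim)
    return (correlation ** 2).sum()


def mmd_loss(x: torch.Tensor, y: torch.Tensor, sigma: float = 1.0) -> torch.Tensor:
    # Maximum mean discrepancy with a Gaussian (RBF) kernel between the
    # shared features of the synthetic domain (x) and the real domain (y).
    def rbf(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
        sq_dist = torch.cdist(a, b) ** 2        # pairwise squared distances
        return torch.exp(-sq_dist / (2.0 * sigma ** 2))

    return rbf(x, x).mean() + rbf(y, y).mean() - 2.0 * rbf(x, y).mean()


if __name__ == "__main__":
    # Random features stand in for encoder outputs of a batch of 8 images.
    syn_shared, syn_private = torch.randn(8, 128), torch.randn(8, 128)
    real_shared, real_private = torch.randn(8, 128), torch.randn(8, 128)

    l_orth = (orthogonality_loss(syn_shared, syn_private)
              + orthogonality_loss(real_shared, real_private))
    l_align = mmd_loss(syn_shared, real_shared)
    total = 0.1 * l_orth + 1.0 * l_align        # weights are placeholders
    print(float(l_orth), float(l_align), float(total))

In a training loop these terms would typically be added to the self-supervised photometric loss on the real data; note that only the MMD term couples the two domains, while the orthogonality term is applied within each domain separately.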

Cited By

  • (2023) Attention Mechanism Used in Monocular Depth Estimation: An Overview. Applied Sciences 13(17), 9940. https://doi.org/10.3390/app13179940. Online publication date: 2-Sep-2023

    Published In

    CSAI '22: Proceedings of the 2022 6th International Conference on Computer Science and Artificial Intelligence
    December 2022
    341 pages
    ISBN:9781450397773
    DOI:10.1145/3577530

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. domain adaptation
    2. transfer learning
    3. monocular depth estimation
    4. synthetic data

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Funding Sources

    • National Natural Science Foundation of China

    Conference

    CSAI 2022

    Bibliometrics & Citations

    Article Metrics

    • Downloads (last 12 months): 7
    • Downloads (last 6 weeks): 1
    Reflects downloads up to 27 Feb 2025
