DOI: 10.1145/3577530.3577546

Monocular depth estimation using synthetic data with domain-separated feature alignment

Published: 30 March 2023

Abstract

Depth estimation is a research hotspot in computer vision, and the resulting depth information is of great value to applications such as autonomous driving, 3D reconstruction, and object tracking. Although supervised depth estimation based on deep learning achieves high prediction accuracy, large amounts of labeled depth data are difficult to obtain. Self-supervised monocular depth estimation methods, which require no labeled data, have therefore become the mainstream of research. Self-supervised learning exploits the geometric constraints between views of a scene, reducing the need for labels, but dynamic objects, changing shooting angles, and limited visibility weaken the self-supervisory signal. Using the accurate depth labels of synthetic datasets to assist self-supervised training on real datasets can improve accuracy, yet most methods ignore the distribution difference between synthetic and real data, which degrades the estimation results. To address this problem, this paper proposes a Domain-separated Monocular Depth Estimation (DsMDE) algorithm based on the domain separation network: an orthogonality loss separates the shared (public) and private features of each domain, and the maximum mean discrepancy then aligns the shared features to reduce the gap between the synthetic and real domains. Experimental results show that the proposed DsMDE method improves depth estimation quality and achieves higher accuracy than mainstream algorithms.
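
To make the two alignment objectives concrete, the sketch below shows, in PyTorch, how an orthogonality loss between shared and private features and a maximum mean discrepancy (MMD) term between the shared features of the two domains can be computed. This is a minimal illustration assuming per-image feature vectors; the tensor shapes, Gaussian-kernel bandwidth, and loss weights are placeholder assumptions, not the authors' implementation.

import torch


def orthogonality_loss(shared: torch.Tensor, private: torch.Tensor) -> torch.Tensor:
    # Soft orthogonality constraint for one domain: both inputs are
    # (batch, dim) features; the squared Frobenius norm of their
    # cross-correlation is driven toward zero so the private encoder
    # keeps only domain-specific content.
    shared = shared - shared.mean(dim=0, keepdim=True)
    private = private - private.mean(dim=0, keepdim=True)
    correlation = shared.t() @ private          # (dim, dim)
    return (correlation ** 2).sum()


def mmd_loss(x: torch.Tensor, y: torch.Tensor, sigma: float = 1.0) -> torch.Tensor:
    # Maximum mean discrepancy with a Gaussian (RBF) kernel between the
    # shared features of the synthetic domain (x) and the real domain (y).
    def rbf(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
        sq_dist = torch.cdist(a, b) ** 2        # pairwise squared distances
        return torch.exp(-sq_dist / (2.0 * sigma ** 2))

    return rbf(x, x).mean() + rbf(y, y).mean() - 2.0 * rbf(x, y).mean()


if __name__ == "__main__":
    # Random features stand in for encoder outputs of a batch of 8 images.
    syn_shared, syn_private = torch.randn(8, 128), torch.randn(8, 128)
    real_shared, real_private = torch.randn(8, 128), torch.randn(8, 128)

    l_orth = (orthogonality_loss(syn_shared, syn_private)
              + orthogonality_loss(real_shared, real_private))
    l_align = mmd_loss(syn_shared, real_shared)
    total = 0.1 * l_orth + 1.0 * l_align        # weights are placeholders
    print(float(l_orth), float(l_align), float(total))

In a training loop these terms would typically be added to the self-supervised photometric loss on the real data; note that only the MMD term couples the two domains, while the orthogonality term is applied within each domain separately.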

Cited By

  • (2023) Attention Mechanism Used in Monocular Depth Estimation: An Overview. Applied Sciences 13(17), 9940. https://doi.org/10.3390/app13179940. Online publication date: 2-Sep-2023

    Published In

    CSAI '22: Proceedings of the 2022 6th International Conference on Computer Science and Artificial Intelligence
    December 2022
    341 pages
    ISBN:9781450397773
    DOI:10.1145/3577530

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. domain adaptation
    2. transfer learning
    3. monocular depth estimation
    4. synthetic data

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Funding Sources

    • National Natural Science Foundation of China

    Conference

    CSAI 2022

    Bibliometrics & Citations

    Article Metrics

    • Downloads (last 12 months): 7
    • Downloads (last 6 weeks): 1
    Reflects downloads up to 27 Feb 2025
