Abstract
Estimating accurate depth from an RGB image in any environment is a challenging task in computer vision. Recent learning-based methods using deep Convolutional Neural Networks (CNNs) have produced plausible results, but these conventional methods struggle with scenes involving a pure camera rotation, such as in-plane rolling. This movement perturbs learning-based methods because the gravity direction acts as a strong prior for CNN depth estimation (i.e., the top region of an image tends to have a relatively large depth, whereas the bottom region tends to have a small depth). To overcome this crucial weakness of CNN-based depth estimation, we propose a simple but effective refinement method that incorporates in-plane roll alignment using camera poses obtained from monocular Simultaneous Localization and Mapping (SLAM). For the experiments, we used public datasets and also created our own dataset composed mostly of in-plane roll camera movements. Evaluation results on these datasets show the effectiveness of our approach.
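As a concrete illustration (not the authors' code), the sketch below shows how such a roll-aligned refinement could look: the in-plane roll is recovered from the relative rotation between a roll-free reference pose and the current SLAM pose, the image is counter-rotated before the depth CNN runs, and the predicted depth map is rotated back. The function names, the reference-frame assumption, and the `depth_net` callable are all hypothetical.

```python
# A minimal sketch of roll-aligned depth refinement, assuming
# world-from-camera rotation matrices from monocular SLAM and an
# arbitrary single-view depth CNN. Not the authors' implementation.
import numpy as np
import cv2


def in_plane_roll(R_ref, R_cur):
    """Best-fit in-plane rotation (degrees) between two camera poses.

    R_ref, R_cur: 3x3 world-from-camera rotation matrices from SLAM,
    where R_ref is a frame assumed to be roll-free (e.g., the first
    tracked frame). The formula projects the relative rotation's
    top-left 2x2 block onto a pure rotation about the optical (z) axis.
    """
    R_rel = R_ref.T @ R_cur
    return np.degrees(np.arctan2(R_rel[1, 0] - R_rel[0, 1],
                                 R_rel[0, 0] + R_rel[1, 1]))


def refine_depth(image, R_ref, R_cur, depth_net):
    """Counter-rotate the image before CNN inference, rotate depth back.

    The sign convention depends on the SLAM coordinate frame; flip the
    angle if results look mirrored on your data.
    """
    roll = in_plane_roll(R_ref, R_cur)
    h, w = image.shape[:2]
    center = (w / 2.0, h / 2.0)
    # Undo the camera roll so gravity points "down" in the image, as the
    # upright training data of the depth CNN implicitly assumes.
    M_fwd = cv2.getRotationMatrix2D(center, roll, 1.0)
    upright = cv2.warpAffine(image, M_fwd, (w, h))
    depth = depth_net(upright)  # hypothetical single-view depth CNN
    # Rotate the predicted depth map back to the original orientation.
    M_bwd = cv2.getRotationMatrix2D(center, -roll, 1.0)
    return cv2.warpAffine(depth, M_bwd, (w, h))
```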
Notes
- 1. A rotary motion around the optical axis in the camera coordinate system.
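For concreteness, assuming the standard pinhole convention in which the optical axis is the camera z-axis, a pure in-plane roll by an angle $\theta$ corresponds to the rotation matrix

$$
R_z(\theta) =
\begin{pmatrix}
\cos\theta & -\sin\theta & 0 \\
\sin\theta & \cos\theta & 0 \\
0 & 0 & 1
\end{pmatrix}.
$$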
Acknowledgement
This work was partially supported by the Japan Science and Technology Agency (JST) under grants JPMJMI19B2 and JPMJCR1683.
Copyright information
© 2020 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Saito, Y., Hachiuma, R., Yamaguchi, M., Saito, H. (2020). In-Plane Rotation-Aware Monocular Depth Estimation Using SLAM. In: Ohyama, W., Jung, S. (eds) Frontiers of Computer Vision. IW-FCV 2020. Communications in Computer and Information Science, vol 1212. Springer, Singapore. https://doi.org/10.1007/978-981-15-4818-5_23
DOI: https://doi.org/10.1007/978-981-15-4818-5_23
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-4817-8
Online ISBN: 978-981-15-4818-5
eBook Packages: Computer Science, Computer Science (R0)