Depth Estimation Using Sparse Depth and Transformer

Malik, Roopak; Hambarde, Praful; Murala, Subrahmanyam

doi:10.1007/978-3-031-11349-9_29


Conference paper
First Online: 24 July 2022


Part of the book series: Communications in Computer and Information Science (CCIS, volume 1568)

Abstract

Depth prediction from a single image is a challenging task because of intra-scale ambiguity and the lack of prior information. Predicting unambiguous depth from a single RGB image is important for many computer vision applications. In this paper, an end-to-end sparse-to-dense network using transformers is proposed for depth estimation. The proposed network processes a single image together with additional sparse depth samples. These sparse depth samples are acquired either with a low-resolution depth sensor or computed by visual simultaneous localization and mapping (SLAM). The proposed model combines sparse samples and transformers in an encoder-decoder structure and produces depth estimates that are comparable to other state-of-the-art results.
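
The abstract does not say how the sparse depth samples are combined with the image. As a minimal sketch, assuming the protocol that is common in the sparse-to-dense literature (random sampling of valid depth pixels, then stacking the sparse map as a fourth input channel), the input could be prepared as follows. The function names `sample_sparse_depth` and `make_rgbd_input`, the image size, and the sample count are all hypothetical and are not taken from the paper.

```python
import numpy as np

def sample_sparse_depth(dense_depth, num_samples, rng):
    """Keep `num_samples` randomly chosen valid depth pixels; zero elsewhere.

    Assumption: a value of 0 marks a pixel without a depth reading.
    """
    valid = np.flatnonzero(dense_depth > 0)  # flat indices of pixels with depth
    keep = rng.choice(valid, size=min(num_samples, valid.size), replace=False)
    sparse = np.zeros_like(dense_depth)
    sparse.flat[keep] = dense_depth.flat[keep]
    return sparse

def make_rgbd_input(rgb, sparse_depth):
    """Stack an H x W x 3 image and an H x W sparse depth map into H x W x 4."""
    return np.concatenate([rgb, sparse_depth[..., None]], axis=-1)

# Synthetic stand-ins for an RGB frame and a dense depth map (metres).
rng = np.random.default_rng(0)
rgb = rng.random((48, 64, 3), dtype=np.float32)
dense = rng.uniform(0.5, 10.0, size=(48, 64)).astype(np.float32)

sparse = sample_sparse_depth(dense, num_samples=200, rng=rng)
rgbd = make_rgbd_input(rgb, sparse)  # candidate input to an encoder-decoder network
```

With a real sensor or a visual SLAM front end, `sparse` would come from measured points rather than from sampling a dense map; the stacking step would stay the same under this assumption.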



Author information

Authors and Affiliations

CVPR Lab, IIT Ropar, Ropar, India
Roopak Malik, Praful Hambarde & Subrahmanyam Murala

Corresponding author

Correspondence to Praful Hambarde.

Editor information

Editors and Affiliations

Indian Institute of Technology Roorkee, Roorkee, India
Balasubramanian Raman
Indian Institute of Technology Ropar, Ropar, India
Subrahmanyam Murala
Jadavpur University, Kolkata, India
Ananda Chowdhury
Indian Institute of Technology Ropar, Ropar, India
Abhinav Dhall
Indian Institute of Technology Ropar, Ropar, India
Puneet Goyal

About this paper

Cite this paper

Malik, R., Hambarde, P., Murala, S. (2022). Depth Estimation Using Sparse Depth and Transformer. In: Raman, B., Murala, S., Chowdhury, A., Dhall, A., Goyal, P. (eds) Computer Vision and Image Processing. CVIP 2021. Communications in Computer and Information Science, vol 1568. Springer, Cham. https://doi.org/10.1007/978-3-031-11349-9_29

DOI: https://doi.org/10.1007/978-3-031-11349-9_29
Published: 24 July 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-11348-2
Online ISBN: 978-3-031-11349-9
eBook Packages: Computer Science, Computer Science (R0)
