An accurate volume estimation on single view object images by deep learning based depth map analysis and 3D reconstruction

Dalai, Radhamadhab; Dalai, Nibedita; Senapati, Kishore Kumar

doi:10.1007/s11042-023-14615-7

Published: 21 February 2023

Multimedia Tools and Applications, Volume 82, pages 28235–28258 (2023)

Radhamadhab Dalai¹,
Nibedita Dalai² &
Kishore Kumar Senapati¹

Abstract

Estimating the volume of a rigid object from a single-view image is an important requirement in many automated vision-based systems. Volume is comparatively simple to estimate from multiple-view images, but estimating it from a single-view image is difficult. This work presents an effective volume estimation method for both regular and irregular objects in single-view images. First, the single-view input images are pre-processed with mean-median filtering. Edge features are then extracted with the Gaussian edge-based Laplacian operator, and key points are extracted with the scale-invariant feature transform (SIFT). The extracted features are used for shape analysis of the objects. Next, a VGG-ResNet framework performs depth analysis based on the extracted features, and point clouds for volume estimation are generated from those features. Finally, the volume of the single-view object is estimated with a hybrid three-dimensional U-Net and graph neural network (Hybrid 3DU-GNet), which provides the 3D geometric reconstruction needed for accurate volume estimation. The approach is implemented in MATLAB. The experimental results are compared with those of existing approaches and show improved performance metrics: accuracy (98.59%), precision (98.21%), recall (97.09%), computational time (3.2 seconds), R-squared (98.2%), mean absolute percentage error (MAPE, 6.1%), and root mean squared error (RMSE, 0.93).
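The regression metrics named in the abstract (MAPE, RMSE and R-squared) can be illustrated with a minimal Python sketch that uses their conventional textbook definitions. The paper's exact formulas are not visible in this preview, so the definitions below are an assumption. The function names and the sample volumes are hypothetical and are not data from the article.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes no entry of `actual` is zero.
    """
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def rmse(actual, predicted):
    """Root mean squared error, in the same unit as the inputs."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical ground-truth and estimated volumes (illustrative values only).
true_volumes = [100.0, 200.0, 400.0]
estimated_volumes = [110.0, 190.0, 400.0]

print(f"MAPE: {mape(true_volumes, estimated_volumes):.2f}%")
print(f"RMSE: {rmse(true_volumes, estimated_volumes):.3f}")
print(f"R^2:  {r_squared(true_volumes, estimated_volumes):.4f}")
```

Under these definitions, MAPE is scale-free and RMSE carries the unit of the measured volume. An RMSE figure such as 0.93 is therefore only interpretable together with the unit and range of the volumes being estimated.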

Data availability

Data sharing not applicable to this article.

Funding

No funding.

Author information

Authors and Affiliations

Computer Science & Engineering, BIT Mesra, Ranchi, Jharkhand, 835215, India
Radhamadhab Dalai & Kishore Kumar Senapati
Civil Engineering, PMEC College, Bhubaneswar, Odisha, 761003, India
Nibedita Dalai

Corresponding author

Correspondence to Radhamadhab Dalai.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Dalai, R., Dalai, N. & Senapati, K.K. An accurate volume estimation on single view object images by deep learning based depth map analysis and 3D reconstruction. Multimed Tools Appl 82, 28235–28258 (2023). https://doi.org/10.1007/s11042-023-14615-7

Received: 03 December 2021
Revised: 09 June 2022
Accepted: 03 February 2023
Published: 21 February 2023
Issue Date: July 2023
DOI: https://doi.org/10.1007/s11042-023-14615-7
