
An accurate volume estimation on single view object images by deep learning based depth map analysis and 3D reconstruction


Abstract

Estimating the volume of a rigid object from a single-view image is an important requirement in many automated vision-based systems. Volume estimation from multiple views is comparatively straightforward, but estimation from a single-view image is challenging and therefore of considerable practical importance. This work presents an effective method for object volume estimation on both regular and irregular single-view object images. Initially, the single-view input images are pre-processed with mean-median filtering. Edge features are then extracted using a Gaussian-based Laplacian edge operator, and key points are extracted using the Scale-Invariant Feature Transform (SIFT). The extracted features are used for shape analysis of the objects. Subsequently, a VGG-ResNet framework performs depth analysis based on the extracted features, and point clouds for volume estimation are generated from them. Finally, volume estimation on the single-view object is carried out by a hybrid 3-dimensional U-Net and graph neural network (Hybrid 3DU-GNet), which constructs the 3D geometry required for accurate volume estimation and yields a significant improvement in estimation quality. The proposed methodology estimates volume effectively on both regular and irregular single-view object images and is implemented in MATLAB. The experimental results are compared with several existing approaches and show significant improvements in the performance metrics: accuracy (98.59%), precision (98.21%), recall (97.09%), computational time (3.2 seconds), R-squared (98.2%), mean absolute percentage error (MAPE, 6.1%), and root mean squared error (RMSE, 0.93).



Data availability

Data sharing not applicable to this article.


Funding

No funding.

Author information

Corresponding author

Correspondence to Radhamadhab Dalai.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Dalai, R., Dalai, N. & Senapati, K.K. An accurate volume estimation on single view object images by deep learning based depth map analysis and 3D reconstruction. Multimed Tools Appl 82, 28235–28258 (2023). https://doi.org/10.1007/s11042-023-14615-7

