
Three-Dimensional Plant Reconstruction with Enhanced Cascade-MVSNet

  • Conference paper
  • First Online:
Pattern Recognition and Computer Vision (PRCV 2023)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 14426))


Abstract

Three-dimensional reconstruction is an important method for recovering the morphological structure of plants: a complete and accurate 3D point cloud can better reflect phenotypic parameters such as plant height, plant volume, and leaf area. To obtain more complete and accurate plant point clouds, this paper proposes a series of enhancements to the Cascade-MVSNet network. To improve the reconstruction of weakly textured plant regions, a lightweight attention mechanism is incorporated into the feature extraction stage. To retain valid points in the plant point clouds and suppress invalid ones, focal loss is employed as the loss function, treating depth estimation as a classification task to obtain more accurate depth information. Additionally, to improve model efficiency, standard convolutions are replaced with depth-wise separable convolutions, reducing the number of parameters and the computational complexity while maintaining performance. Applied to our plant dataset, these enhancements yield reconstructions with significantly fewer invalid points and clearer point clouds. We also evaluate the approach on the publicly available DTU dataset, where experimental results demonstrate a noticeable increase in reconstruction completeness and competitive overall performance.
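The parameter savings from the depth-wise separable substitution described above can be sketched with a simple count. This is illustrative only: the paper's exact kernel sizes and channel widths are not given here, so the `3, 32, 32` configuration below is hypothetical.

```python
# Parameter-count comparison: standard vs. depth-wise separable convolution.

def conv2d_params(k: int, c_in: int, c_out: int) -> int:
    """Weights of a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out

def dsconv2d_params(k: int, c_in: int, c_out: int) -> int:
    """Depth-wise separable convolution: a k x k depth-wise filter per
    input channel, then a 1 x 1 point-wise projection (bias omitted)."""
    return k * k * c_in + c_in * c_out

if __name__ == "__main__":
    k, c_in, c_out = 3, 32, 32              # hypothetical feature-extraction layer
    std = conv2d_params(k, c_in, c_out)     # 9216
    sep = dsconv2d_params(k, c_in, c_out)   # 1312
    print(f"standard: {std}, separable: {sep}, ratio: {std / sep:.1f}x")
```

For a 3x3 layer with 32 input and output channels this gives roughly a 7x reduction, which is where the efficiency gain in the abstract comes from.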

This work was supported by the National Key R&D Program of China (2022YFD2002304).
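The "depth estimation as classification" idea from the abstract pairs a per-pixel probability volume over discrete depth hypotheses with the focal loss of Lin et al. A minimal NumPy sketch follows; the function name, array shapes, and the `alpha`/`gamma` defaults are assumptions for illustration, not the paper's reported configuration.

```python
import numpy as np

def depth_focal_loss(prob_volume: np.ndarray, gt_index: np.ndarray,
                     alpha: float = 0.25, gamma: float = 2.0) -> float:
    """Focal loss over a per-pixel depth classification.

    prob_volume: (D, H, W) softmax probabilities over D depth hypotheses.
    gt_index:    (H, W) integer index of the ground-truth depth bin.
    """
    # Probability the network assigns to the correct depth bin at each pixel.
    p_t = np.take_along_axis(prob_volume, gt_index[None], axis=0)[0]
    # FL(p_t) = -alpha * (1 - p_t)^gamma * log(p_t), averaged over pixels.
    return float(np.mean(-alpha * (1.0 - p_t) ** gamma * np.log(p_t + 1e-12)))

if __name__ == "__main__":
    D, H, W = 4, 2, 2
    uniform = np.full((D, H, W), 1.0 / D)   # maximally uncertain prediction
    gt = np.zeros((H, W), dtype=np.int64)
    print(depth_focal_loss(uniform, gt))    # ~0.195
```

The `(1 - p_t)^gamma` factor down-weights pixels the network already classifies confidently, concentrating the gradient on hard pixels, which matches the abstract's goal of suppressing invalid points.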



Author information


Corresponding author

Correspondence to Ruifang Zhai.


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Ren, H., Zhu, J., Chen, L., Jiang, X., Xie, K., Zhai, R. (2024). Three-Dimensional Plant Reconstruction with Enhanced Cascade-MVSNet. In: Liu, Q., et al. Pattern Recognition and Computer Vision. PRCV 2023. Lecture Notes in Computer Science, vol 14426. Springer, Singapore. https://doi.org/10.1007/978-981-99-8432-9_23


  • DOI: https://doi.org/10.1007/978-981-99-8432-9_23


  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-8431-2

  • Online ISBN: 978-981-99-8432-9

  • eBook Packages: Computer Science, Computer Science (R0)
