Abstract
Three-dimensional reconstruction is an important method for recovering the morphological structure of plants: a complete and accurate 3D point cloud better reflects phenotypic parameters such as plant height, plant volume, and leaf area. To obtain more complete and accurate plant point clouds, this paper proposes a series of enhancements to the Cascade-MVSNet network. To improve reconstruction in weakly textured regions of plants, a lightweight attention mechanism is incorporated into the feature extraction stage. To increase the number of valid points in the plant point clouds and suppress invalid points, focal loss is employed as the loss function, treating depth estimation as a classification task to obtain more accurate depth information. Additionally, to improve model efficiency, standard convolutions are replaced with depth-wise separable convolutions, reducing the number of parameters and the computational cost while maintaining performance. Applied to the plant dataset, these changes yield reconstructions with significantly fewer invalid points and clearer point clouds. Beyond the plant dataset, we also evaluate our approach on the publicly available DTU dataset, where experimental results show a noticeable increase in reconstruction completeness and competitive performance overall.
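Two of the abstract's claims can be illustrated with simple arithmetic. First, a depth-wise separable convolution factors a standard k x k convolution into a per-channel (depth-wise) k x k convolution followed by a 1 x 1 point-wise convolution, which shrinks the parameter count. Second, focal loss down-weights well-classified samples relative to hard ones via a (1 - p_t)^gamma modulating factor. The sketch below checks both; the layer sizes (32 input channels, 64 output channels, 3 x 3 kernel) and the focal-loss hyperparameters are illustrative assumptions, not values taken from the paper.

```python
import math

def conv_params(c_in, c_out, k):
    # Parameters of a standard k x k convolution (bias omitted):
    # every output channel has a k x k filter over all input channels.
    return c_in * c_out * k * k

def separable_conv_params(c_in, c_out, k):
    # Depth-wise separable convolution = depth-wise k x k conv
    # (one k x k filter per input channel) + 1 x 1 point-wise conv.
    return c_in * k * k + c_in * c_out

def focal_loss(p_t, alpha=0.25, gamma=2.0):
    # Focal loss on the probability p_t assigned to the true class:
    # FL(p_t) = -alpha * (1 - p_t)^gamma * log(p_t).
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)

# A 3x3 layer with 32 input and 64 output channels:
print(conv_params(32, 64, 3))            # 18432
print(separable_conv_params(32, 64, 3))  # 2336 (roughly an 8x reduction)

# Confidently classified points (high p_t) contribute far less loss
# than hard points, which is why invalid points get suppressed:
print(focal_loss(0.9) < focal_loss(0.6))  # True
```

The reduction factor is 1/c_out + 1/k^2, so for common channel widths a 3 x 3 layer shrinks to roughly one-eighth to one-ninth of its original size, which is the efficiency gain the abstract refers to.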
This work was supported by the National Key R&D Program of China (2022YFD2002304).
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this paper
Ren, H., Zhu, J., Chen, L., Jiang, X., Xie, K., Zhai, R. (2024). Three-Dimensional Plant Reconstruction with Enhanced Cascade-MVSNet. In: Liu, Q., et al. Pattern Recognition and Computer Vision. PRCV 2023. Lecture Notes in Computer Science, vol 14426. Springer, Singapore. https://doi.org/10.1007/978-981-99-8432-9_23
DOI: https://doi.org/10.1007/978-981-99-8432-9_23
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-8431-2
Online ISBN: 978-981-99-8432-9