Abstract
Wheat is one of the most important crops in China, and its yield directly affects the country's food security. Because wheat spikes are densely distributed, frequently overlapping, and relatively indistinct, they are prone to being missed in practical detection. Existing object detection models also suffer from large model sizes, high computational complexity, and long inference times. This study therefore proposes YOLO-LF, a lightweight real-time wheat spike detection model. First, a lightweight backbone network is designed to reduce the model size and the number of parameters, thereby improving runtime speed. Second, the neck is redesigned for the wheat spike dataset to strengthen the network's feature extraction for wheat spikes while keeping it lightweight. Finally, a lightweight detection head is designed that substantially reduces the model's FLOPs. Experimental results on the test set show that our model occupies 1.7 MB, has 0.76 M parameters, and requires 2.9 GFLOPs, reductions of 73%, 74%, and 64%, respectively, compared with YOLOv8n. Our model achieves a latency of 8.6 ms (115 FPS) on a Titan X, versus 10.2 ms (97 FPS) for YOLOv8n on the same hardware. Our model is thus lighter and faster, while its mAP@0.5 decreases by only 0.9%, giving it better overall performance than YOLOv8 and other mainstream detection networks. It can therefore be deployed on mobile devices to provide effective assistance in the real-time detection of wheat spikes.
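The reported throughput follows from the latency: FPS ≈ 1000 / latency in ms, so 8.6 ms corresponds to roughly 116 FPS, in line with the 115 FPS reported. As a minimal sketch of how such figures are typically measured (not the authors' benchmarking code), the following times single-image inference for a generic PyTorch detection model; the 640×640 input resolution, warm-up count, and run count are illustrative assumptions.

```python
import time
import torch

def benchmark(model: torch.nn.Module, imgsz: int = 640, runs: int = 100):
    """Return (mean latency in ms, FPS) for single-image inference."""
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = model.to(device).eval()
    x = torch.randn(1, 3, imgsz, imgsz, device=device)  # dummy input image

    with torch.no_grad():
        for _ in range(10):           # warm-up so one-time CUDA init is excluded
            model(x)
        if device == "cuda":
            torch.cuda.synchronize()  # wait for queued kernels before timing
        start = time.perf_counter()
        for _ in range(runs):
            model(x)
        if device == "cuda":
            torch.cuda.synchronize()
        latency_ms = (time.perf_counter() - start) / runs * 1000.0

    return latency_ms, 1000.0 / latency_ms  # e.g. 8.6 ms -> ~116 FPS
```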
Data availability
No datasets were generated or analysed during the current study.
Acknowledgements
This work was supported in part by the Humanities and Social Sciences Planning Fund Project of the Ministry of Education of China under Grant 23YJAZH226, "Research on the Development Path of Artificial Intelligence Based on ChatGPT-like Generated Content" (2023-09 to 2026-08), and in part by the Hunan Provincial Natural Science Foundation of China under Grants 2024JJ5042 and 2023JJ30050.
Author information
Contributions
Sr.Z provided financial support, conceived the experimental ideas, proposed the main methods of this paper, and reviewed and revised the first draft. Sz.L prepared the dataset, conducted the experiments, wrote the initial draft of the paper, and created the figures and experimental visualizations. All authors have read and agreed to the published version of the manuscript.
Ethics declarations
Conflict of interest
The authors declare that they have no competing interests and no financial or personal relationships with other people or organizations that could inappropriately influence this work.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Zhou, S., Long, S. YOLO-LF: a lightweight multi-scale feature fusion algorithm for wheat spike detection. J Real-Time Image Proc 21, 148 (2024). https://doi.org/10.1007/s11554-024-01529-2