LightUAV-YOLO: a lightweight object detection model for unmanned aerial vehicle image

The Journal of Supercomputing

Abstract

Object detection in unmanned aerial vehicle (UAV) images is challenging because of high flight altitudes, small object sizes, and complex backgrounds. In addition, many deep learning object detection algorithms require substantial computational resources, which makes them difficult to deploy on embedded devices with limited memory and processing power and reduces the effectiveness of drones during task execution. To address these issues, we propose LightUAV-YOLO, a lightweight object detection algorithm for UAVs based on YOLOv8n. We modify the neck structure of YOLOv8 to enhance the network’s capability to detect small objects. To further optimize feature fusion across scales, we design the orthogonal feature enhancement module (OFEM), which replaces simple concatenation to obtain a better feature representation. We also design the local attention module (LAM) to filter out irrelevant interference; it enables the model to focus on important regions and further improves its robustness. Results show that LightUAV-YOLO improves mAP50 by 6.4% and mAP50:95 by 3.9% on the VisDrone test dataset compared with YOLOv8n, while maintaining a low parameter count and computational complexity. Extensive experiments on the UAVDT dataset also yield consistently favorable results. The model therefore meets accuracy requirements while remaining lightweight.
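The two metrics quoted in the abstract differ only in the overlap thresholds they use. Under the standard COCO-style convention (assumed here; the abstract does not restate it), mAP50 counts a prediction as correct when its intersection over union (IoU) with a ground-truth box is at least 0.5, while mAP50:95 averages over ten IoU thresholds from 0.50 to 0.95 in steps of 0.05. The sketch below is not taken from the paper; the function names and the example boxes are hypothetical and serve only to show why small objects are penalized heavily by the stricter metric.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp at zero so disjoint boxes yield an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# COCO-style thresholds used by mAP50:95: 0.50, 0.55, ..., 0.95.
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def matched_thresholds(pred_box, gt_box):
    """Return the IoU thresholds at which the prediction counts as a true positive."""
    overlap = iou(pred_box, gt_box)
    return [t for t in IOU_THRESHOLDS if overlap >= t]


# Hypothetical 10x10-pixel object: a 1-pixel shift still passes the 0.5 threshold,
# but a 2-pixel shift already fails every threshold.
gt = (0, 0, 10, 10)
print(matched_thresholds((1, 1, 11, 11), gt))  # IoU = 81/119
print(matched_thresholds((2, 2, 12, 12), gt))  # IoU = 64/136
```

For a 10x10-pixel object, a localization error of two pixels is enough to turn a detection into a miss at every threshold, which is one reason small-object datasets such as VisDrone tend to report low absolute mAP50:95 values.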

Data availability

The datasets used in this study are publicly available for download at the following links: VisDrone2021 dataset (http://aiskyeye.com/visdrone-2020/), UAVDT dataset (https://sites.google.com/view/grli-uavdt).

Author information

Contributions

Conceptualization, Y.L. and X.L.; methodology, Y.L.; software, X.L. and T.Z.; validation, Y.L. and T.Z.; formal analysis, Y.L. and T.Z.; investigation, T.Z.; resources, G.S.; writing (original draft preparation), Y.L.; writing (review and editing), Y.L. and T.Z.; visualization, Y.L. and T.Z.; supervision, G.S.; project administration, G.S.; funding acquisition, G.S. All authors have read and agreed to the published version of the manuscript.

Corresponding author

Correspondence to Gang Shi.

Ethics declarations

Conflict of interest

No potential conflict of interest was reported by the authors.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Lyu, Y., Zhang, T., Li, X. et al. LightUAV-YOLO: a lightweight object detection model for unmanned aerial vehicle image. J Supercomput 81, 105 (2025). https://doi.org/10.1007/s11227-024-06611-x

  • DOI: https://doi.org/10.1007/s11227-024-06611-x
