Abstract
Object detection in unmanned aerial vehicle (UAV) images is challenging because of high flight altitudes, small object sizes, and complex backgrounds. In addition, many deep learning object detection algorithms require substantial computational resources, which makes them difficult to deploy on embedded devices with limited memory and processing power and reduces the effectiveness of drones in task execution. To tackle these issues, we propose LightUAV-YOLO, a lightweight object detection algorithm for UAVs based on YOLOv8n. We modify the neck structure of YOLOv8 to enhance the network’s capability to detect small objects. To further optimize feature fusion across scales, we design the orthogonal feature enhancement module (OFEM), which replaces simple concatenation to obtain a better feature representation. We also design the local attention module (LAM) to filter out irrelevant interference; this module enables the model to focus on important areas and further enhances its robustness. Results demonstrate that LightUAV-YOLO improves mAP50 and mAP50:95 by 6.4% and 3.9%, respectively, on the VisDrone test dataset compared with YOLOv8n, while maintaining a low parameter count and computational complexity. Furthermore, extensive experiments on the UAVDT dataset show consistently favorable results. The model therefore meets accuracy requirements while remaining lightweight.
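The abstract does not describe the internals of OFEM, so the following is only an illustrative sketch of what fusing features "orthogonally" instead of by plain concatenation can mean. It assumes one common formulation: remove from one feature vector the component parallel to a second vector, then concatenate the remaining orthogonal part with that second vector. The function names, the use of flat Python lists, and the choice of which vector is projected are all assumptions made for illustration and are not taken from the paper.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def orthogonal_component(local_feat, global_feat):
    """Return the part of local_feat that is orthogonal to global_feat.

    Hypothetical helper: subtracts the projection of local_feat onto
    global_feat. If global_feat is all zeros there is nothing to project
    onto, so local_feat is returned unchanged.
    """
    denom = dot(global_feat, global_feat)
    if denom == 0.0:
        return list(local_feat)
    scale = dot(local_feat, global_feat) / denom
    return [l - scale * g for l, g in zip(local_feat, global_feat)]


def orthogonal_fuse(local_feat, global_feat):
    """Concatenate the orthogonal part of local_feat with global_feat.

    Plain concatenation would be local_feat + global_feat; here the
    information in local_feat that is already carried by global_feat
    is removed before concatenating (an assumed fusion rule).
    """
    return orthogonal_component(local_feat, global_feat) + list(global_feat)


local_feat = [3.0, 4.0]
global_feat = [1.0, 0.0]
fused = orthogonal_fuse(local_feat, global_feat)
print(fused)  # [0.0, 4.0, 1.0, 0.0]
```

In this toy case the first coordinate of `local_feat` is fully explained by `global_feat`, so only the second coordinate survives the projection step. A real detector would apply such an operation per spatial location on multi-channel feature maps, with learned layers around it.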
Data availability
The datasets used in this study are publicly available for download at the following links: VisDrone2021 dataset (http://aiskyeye.com/visdrone-2020/), UAVDT dataset (https://sites.google.com/view/grli-uavdt).
Author information
Contributions
Conceptualization, Y.L. and X.L.; methodology, Y.L.; software, X.L. and T.Z.; validation, Y.L. and T.Z.; formal analysis, Y.L. and T.Z.; investigation, T.Z.; resources, G.S.; writing (original draft preparation), Y.L.; writing (review and editing), Y.L. and T.Z.; visualization, Y.L. and T.Z.; supervision, G.S.; project administration, G.S.; funding acquisition, G.S. All authors have read and agreed to the published version of the manuscript.
Ethics declarations
Conflict of interest
No potential conflict of interest was reported by the authors.
About this article
Cite this article
Lyu, Y., Zhang, T., Li, X. et al. LightUAV-YOLO: a lightweight object detection model for unmanned aerial vehicle image. J Supercomput 81, 105 (2025). https://doi.org/10.1007/s11227-024-06611-x
DOI: https://doi.org/10.1007/s11227-024-06611-x