Abstract
In UAV aerial photography, small targets, dense object distributions, and occlusion frequently cause missed and false detections, which significantly degrades detection accuracy. To address this problem, this paper proposes BF-YOLOv9s, an improved YOLOv9s model. First, the BiFormer attention mechanism is applied to strengthen the model’s focus on small targets and help retain finer-grained features. Second, to meet the lightweight requirements of UAV aerial photography, the RepNCSPELAN4_Ghost module is proposed; it integrates GhostConv into the RepNCSPELAN4 module of the backbone network, significantly reducing the computational load and making better use of computing and memory resources. Third, the BiFPN feature pyramid network is introduced to promote cross-layer feature fusion and information exchange, improving detection performance. Finally, the Focal WIoU loss function is adopted to accelerate model convergence, reduce the loss, and improve training efficiency. Experimental results show that BF-YOLOv9s achieves a mAP50 of 41.3% on the VisDrone2019 dataset, outperforming the original YOLOv9s by 5.6%, while also reducing the parameter count by 8.3%.
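To give a rough sense of why a Ghost-style convolution lowers the parameter count, the sketch below compares the weight count of a standard convolution with that of a Ghost-style one for a single layer. It follows the general idea of the GhostNet paper (a primary convolution produces part of the output channels and a cheap depthwise convolution derives the rest). The function names, the `ratio=2` split, the 5 x 5 cheap kernel, and the channel sizes are illustrative assumptions, not the configuration used in BF-YOLOv9s, and the result is unrelated to the 8.3% whole-model figure reported above.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias and normalization ignored)."""
    return c_in * c_out * k * k


def ghost_conv_params(c_in, c_out, k, ratio=2, cheap_k=5):
    """Weights in a Ghost-style convolution (hypothetical helper).

    A primary k x k convolution produces c_out // ratio "intrinsic" channels.
    A depthwise cheap_k x cheap_k convolution then derives the remaining
    channels from them. ratio and cheap_k are assumed values for illustration.
    """
    intrinsic = c_out // ratio
    primary = c_in * intrinsic * k * k
    # Depthwise step: one cheap_k x cheap_k filter per generated channel.
    cheap = (c_out - intrinsic) * cheap_k * cheap_k
    return primary + cheap


if __name__ == "__main__":
    # Assumed layer shape: 128 input channels, 256 output channels, 3 x 3 kernel.
    standard = conv_params(128, 256, 3)
    ghost = ghost_conv_params(128, 256, 3)
    print(f"standard: {standard} weights")
    print(f"ghost:    {ghost} weights")
    print(f"saving:   {1 - ghost / standard:.1%}")
```

Under these assumptions the single layer needs roughly half as many weights. The saving for a full network is smaller because only some layers are replaced.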












Data availability
No datasets were generated or analysed during the current study.
Funding
This work was supported in part by Shanghai Science and Technology Program, China, under Grant 23010501000; in part by Humanities and Social Sciences of Ministry of Education Planning Fund, China, under Grant 22YJAZHA145; in part by the National Natural Science Foundation of China under Grant 61963017; in part by Shanghai Educational Science Research Project, China, under Grant C2022056.
Author information
Authors and Affiliations
Contributions
Z.H. performed conceptualization, software, validation, and writing-original draft; P.Y. provided funding acquisition, writing-review and editing, and project administration; L.YL. prepared writing-review and editing, project administration, and supervision.
Corresponding author
Ethics declarations
Conflict of interest
The authors declare no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Zhang, H., Peng, Y. & Liu, Y.l. UAV aerial photography target detection based on improved YOLOv9. J Supercomput 81, 492 (2025). https://doi.org/10.1007/s11227-025-06991-8
Accepted:
Published:
DOI: https://doi.org/10.1007/s11227-025-06991-8