Abstract
Small-target detection is of significant value in unmanned aerial vehicle (UAV) vision, yet it faces several challenges: targets appear very small in the image, they are difficult to distinguish from the background, and they are often densely packed. This paper presents a novel YOLO-based method for detecting small targets, specifically tailored to UAV photography. Firstly, a detection head dedicated to small targets is added to provide higher-resolution feature maps. Secondly, a three-scale feature fusion module is proposed to fuse the network features with the underlying features. This improves the fusion of deep semantic features with shallow texture features, provides rich spatial information to the different detection heads, and addresses the issue of feature loss. Furthermore, a Feature Selection Guidance Module is proposed, which enhances the ability to discriminate small targets by combining a CNN with a nonlinear learning operator. Finally, Soft_NMS is combined with DIOU, and the resulting DIOU_Soft_NMS algorithm is proposed as a replacement for the original non-maximum suppression method. This new algorithm effectively handles target crowding and overlap. Experimental results show that the proposed method exhibits superior detection performance in UAV aerial photography scenarios, achieving remarkable results on the VisDrone2019 dataset. On the test set, mAP0.5 reached 45%, a 12.1% improvement over YOLOv8, while mAP0.5–0.95 reached 34.1%, an 11.4% improvement over YOLOv8. This suggests that the method has potential for practical UAV tasks. Furthermore, the results provide a solid foundation for future related research.
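The idea of combining Soft_NMS with DIOU can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the Gaussian score decay of Soft-NMS with the IoU term replaced by DIoU (clamped at zero), and the names `diou` and `diou_soft_nms`, the `sigma` value, and the score threshold are all hypothetical choices made for illustration.

```python
import math


def diou(a, b):
    """DIoU of two boxes (x1, y1, x2, y2): IoU minus normalized center distance."""
    # Intersection area
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    iou = inter / union if union > 0 else 0.0
    # Squared distance between the two box centers
    d2 = ((a[0] + a[2]) / 2 - (b[0] + b[2]) / 2) ** 2 \
        + ((a[1] + a[3]) / 2 - (b[1] + b[3]) / 2) ** 2
    # Squared diagonal of the smallest box enclosing both
    ex1, ey1 = min(a[0], b[0]), min(a[1], b[1])
    ex2, ey2 = max(a[2], b[2]), max(a[3], b[3])
    c2 = (ex2 - ex1) ** 2 + (ey2 - ey1) ** 2
    return iou - d2 / c2 if c2 > 0 else iou


def diou_soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Return (index, score) pairs in selection order.

    Instead of discarding overlapping boxes outright, their scores are
    decayed by exp(-DIoU^2 / sigma); sigma and score_thresh are assumed values.
    """
    scores = list(scores)  # copy so the caller's list is not modified
    remaining = list(range(len(boxes)))
    keep = []
    while remaining:
        best = max(remaining, key=lambda i: scores[i])
        remaining.remove(best)
        keep.append((best, scores[best]))
        survivors = []
        for i in remaining:
            # Negative DIoU (distant boxes) is clamped so it causes no decay
            overlap = max(0.0, diou(boxes[best], boxes[i]))
            scores[i] *= math.exp(-(overlap ** 2) / sigma)
            if scores[i] >= score_thresh:
                survivors.append(i)
        remaining = survivors
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
    scores = [0.9, 0.8, 0.7]
    for idx, score in diou_soft_nms(boxes, scores):
        print(idx, round(score, 3))
```

In this toy input the second box overlaps the first heavily, so its score is reduced rather than set to zero, while the distant third box keeps its score unchanged. This is the behavior that makes soft suppression attractive for crowded scenes, where a hard threshold would delete true neighbouring targets.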









Data availability
The data used to support the findings of this study are available from the corresponding author upon request.
Funding
This project was supported by the Liaoning Provincial Department of Education (project LJKFZ20220206) and the Dalian Science and Technology Bureau (project 2019J13SN102).
Author information
Contributions
YhS contributed to the conception of the study and wrote the manuscript. YpG and XxL performed the data analyses. ZpL, YgS, YrW and YwM revised and polished the manuscript. All authors reviewed the manuscript.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Sun, Y., Lan, Z., Sun, Y. et al. Htfd-yolo: Small target detection in drone aerial photography based on YOLOv8s. J Supercomput 81, 545 (2025). https://doi.org/10.1007/s11227-025-07067-3
DOI: https://doi.org/10.1007/s11227-025-07067-3