Htfd-yolo: Small target detection in drone aerial photography based on YOLOv8s

The Journal of Supercomputing

Abstract

Small-target detection is valuable in unmanned aerial vehicle (UAV) vision, but it faces several challenges: images in which the targets are too small, difficulty in distinguishing targets from the background, and densely packed targets. This paper presents a novel YOLO-based method for detecting small targets, tailored specifically to UAV photography. First, a detection head for small targets is added to provide higher-resolution feature maps. Second, a three-scale feature fusion module is proposed to fuse the network features with the underlying features. It is intended to improve the fusion of deep semantic features and shallow texture features, provide rich spatial information for the different detection heads, and address the issue of feature loss. Third, a Feature Selection Guidance Module is proposed, which enhances the ability to discriminate small targets by combining the CNN with a nonlinear learning operator. Finally, Soft_NMS is combined with DIOU, and the resulting DIOU_Soft_NMS algorithm is proposed as a replacement for the original non-maximum suppression (NMS) method. The new algorithm effectively handles target crowding and overlap. Experimental results show that the proposed method exhibits superior detection performance in UAV aerial photography scenarios on the VisDrone2019 dataset. On the test set, mAP0.5 reached 45%, a 12.1% improvement over YOLOv8, while mAP0.5:0.95 reached 34.1%, an 11.4% improvement over YOLOv8. This suggests that the method has potential for practical UAV tasks. Furthermore, the results provide a solid foundation for future related research.
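The abstract does not give the exact formulation of DIOU_Soft_NMS, so the following is only a minimal sketch of one plausible reading: the Gaussian score decay of Soft-NMS, with the plain IoU overlap replaced by DIoU (IoU minus the squared centre distance normalised by the squared diagonal of the smallest enclosing box). The function names, the `[x1, y1, x2, y2]` box layout, and the `sigma` and `score_thresh` defaults are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def diou(box, boxes):
    """DIoU between one box and an (N, 4) array of boxes in [x1, y1, x2, y2] form."""
    # Intersection rectangle and area
    ix1 = np.maximum(box[0], boxes[:, 0])
    iy1 = np.maximum(box[1], boxes[:, 1])
    ix2 = np.minimum(box[2], boxes[:, 2])
    iy2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(ix2 - ix1, 0, None) * np.clip(iy2 - iy1, 0, None)

    area_a = (box[2] - box[0]) * (box[3] - box[1])
    area_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    iou = inter / (area_a + area_b - inter + 1e-9)

    # Squared distance between box centres
    dx = (box[0] + box[2]) - (boxes[:, 0] + boxes[:, 2])
    dy = (box[1] + box[3]) - (boxes[:, 1] + boxes[:, 3])
    centre_dist2 = (dx ** 2 + dy ** 2) / 4.0

    # Squared diagonal of the smallest box enclosing both boxes
    ex1 = np.minimum(box[0], boxes[:, 0])
    ey1 = np.minimum(box[1], boxes[:, 1])
    ex2 = np.maximum(box[2], boxes[:, 2])
    ey2 = np.maximum(box[3], boxes[:, 3])
    diag2 = (ex2 - ex1) ** 2 + (ey2 - ey1) ** 2

    return iou - centre_dist2 / (diag2 + 1e-9)


def diou_soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Soft-NMS with Gaussian decay driven by DIoU (illustrative sketch only).

    Returns the indices of retained boxes in selection order, together with
    the (possibly decayed) score each box had when it was selected.
    """
    boxes = np.asarray(boxes, dtype=float)
    scores = np.asarray(scores, dtype=float).copy()
    remaining = np.arange(len(scores))
    keep, kept_scores = [], []

    while remaining.size > 0:
        # Select the highest-scoring box still under consideration
        best = int(np.argmax(scores[remaining]))
        i = remaining[best]
        keep.append(int(i))
        kept_scores.append(float(scores[i]))
        remaining = np.delete(remaining, best)
        if remaining.size == 0:
            break

        # DIoU can be negative for distant boxes; clip so they are not decayed
        overlap = np.clip(diou(boxes[i], boxes[remaining]), 0.0, None)
        # Decay neighbouring scores instead of discarding the boxes outright
        scores[remaining] *= np.exp(-(overlap ** 2) / sigma)
        # Drop boxes whose score has fallen below the threshold
        remaining = remaining[scores[remaining] > score_thresh]

    return keep, kept_scores
```

Calling `diou_soft_nms([[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]], [0.9, 0.8, 0.7])` keeps all three boxes: the heavily overlapping second box is retained with a reduced score rather than removed, which is the behaviour that helps in crowded scenes where hard NMS would suppress a genuine neighbouring target.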

Data availability

The data used to support the findings of this study is available from the corresponding author upon request.

Funding

This project was supported by a Liaoning Provincial Department of Education project (LJKFZ20220206) and a Dalian Science and Technology Bureau project (2019J13SN102).

Author information

Contributions

YhS contributed to the conception of the study and wrote the manuscript. YpG and XxL performed the data analyses. ZpL, YgS, YrW and YwM revised and polished the manuscript. All authors reviewed the manuscript.

Corresponding authors

Correspondence to Zhenping Lan or Yanguo Sun.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Sun, Y., Lan, Z., Sun, Y. et al. Htfd-yolo: Small target detection in drone aerial photography based on YOLOv8s. J Supercomput 81, 545 (2025). https://doi.org/10.1007/s11227-025-07067-3

  • DOI: https://doi.org/10.1007/s11227-025-07067-3
