Abstract
UAV images combine high imaging resolution, multiple targets in close proximity, complex backgrounds and severe overlaps, all of which make small-target detection challenging, and achieving faster and more accurate detection remains a significant concern. To address these issues, this article proposes a Low-Altitude Drone Aerial Small Target Detector (LDSTD) algorithm based on YOLOv8. A bidirectional growth fusion network (BGFN) is proposed to address the network's difficulty in discriminating targets in complex backgrounds. BGFN improves classification and localisation in complex backgrounds by using both deep and shallow features to enhance target suppression and background estimation. On this basis, a high-resolution detection head is added and a low-resolution detection head is removed, which improves the detection of small targets while reducing the number of parameters. Furthermore, a Spatial-Channel Enhancement Module (SCEM) is designed to enhance the target's feature information during feature extraction, filter out superfluous interference and mitigate the loss of small-target information during sampling. A novel lightweight multi-scale feature extraction module (LMSC) is also proposed and integrated with YOLOv8's C2f to form a new structure, C2f-LMSC. This structure enhances feature extraction from scalable receptive fields at higher levels of the network while reducing the computational burden through a lightweight convolutional module. Experiments show that LDSTD achieves substantial improvements on two publicly available datasets, VisDrone2019 and NWPU VHR-10.
On the VisDrone2019 test set, the algorithm attains a mAP50 of 38.1%, an increase of 4.1% over YOLOv8s, and a mAP0.5–0.95 of 21.8%, an increase of 2.8% over YOLOv8s.
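The effect of swapping detection heads can be illustrated with a minimal sketch. It assumes the standard YOLOv8 layout with heads at strides 8, 16 and 32, a 640 x 640 input, and that the added high-resolution head sits at stride 4 while the removed head is the stride-32 one. None of these values is stated in the abstract, and the function name `grid_cells` is hypothetical.

```python
from typing import Dict, Sequence


def grid_cells(input_size: int, strides: Sequence[int]) -> Dict[int, int]:
    """Return the number of prediction cells per head for a square input.

    A head at stride s predicts on an (input_size // s) x (input_size // s) grid.
    """
    return {s: (input_size // s) ** 2 for s in strides}


# Assumed baseline: standard YOLOv8 heads at strides 8, 16 and 32.
baseline = grid_cells(640, [8, 16, 32])

# Assumed modification: add a stride-4 head, drop the stride-32 head.
modified = grid_cells(640, [4, 8, 16])

for name, cells in (("baseline", baseline), ("modified", modified)):
    finest = min(cells)  # smallest stride, i.e. the finest grid
    print(f"{name}: finest stride {finest}, {cells[finest]} cells on that grid")
```

Under these assumptions the finest grid becomes four times denser, so a small target is covered by more prediction cells, while the coarsest grid, which mainly serves large objects, is dropped.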
Data availability
The data used to support the findings of this study are available from the corresponding author upon request.
Funding
This work was supported by the Liaoning Provincial Department of Education project (LJKFZ20220206) and the Dalian Science and Technology Bureau project (2019J13SN102).
Author information
Contributions
YhS contributed to the conception of the study and wrote the manuscript. YpG and XxL performed the data analyses. ZpL, YgS, YrW and BL completed the revision and touch-up of the manuscript. All authors reviewed the manuscript.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Sun, Y., Lan, Z., Sun, Y. et al. Ldstd: low-altitude drone aerial small target detector. J Supercomput 81, 414 (2025). https://doi.org/10.1007/s11227-025-06950-3
DOI: https://doi.org/10.1007/s11227-025-06950-3