Abstract
In unmanned aerial vehicle (UAV) target detection, significant fluctuations in flight altitude cause the apparent size of the main subjects to vary widely, which poses a considerable challenge for image detection, particularly for small targets. To overcome this challenge, we propose MSD-YOLO, a novel architecture based on YOLO-v8. First, we design a feature extraction and integration network (MUBIFPN) to replace the original Neck, enabling the model to fuse features more effectively. Second, we design a feature pyramid pooling structure (SPPFCSPC-SM) to replace the original SPPF and enlarge the receptive field of this part. Finally, we introduce an advanced multi-dimensional perception detection head (DyHead), which significantly enhances the expressive ability of the detection head. Experiments on the VisDrone2019 dataset show that, compared with the original YOLO-v8n model, the proposed method improves recall by 4.7%, mAP50 by 5.8%, and mAP50-90 by 4.0%. Compared with the larger YOLO-v8s, it achieves slightly higher recall and accuracy while reducing the number of parameters by 58.6% and GFLOPs by 54.7%.
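The recall and mAP50 figures above both rest on matching predicted boxes to ground-truth boxes at an intersection-over-union (IoU) threshold of 0.5. The following is a minimal sketch of that matching step, not the evaluation code used in the paper: the function names `iou` and `recall_at_iou` are hypothetical, boxes are assumed to be `(x1, y1, x2, y2)` tuples, and the greedy matching ignores confidence ordering, which a full mAP evaluation would take into account.

```python
from typing import List, Tuple

# Assumed box format for this sketch: (x1, y1, x2, y2) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def recall_at_iou(ground_truth: List[Box], predictions: List[Box],
                  threshold: float = 0.5) -> float:
    """Fraction of ground-truth boxes matched by a prediction.

    Simplified greedy matching (hypothetical helper): each prediction
    can be matched to at most one ground-truth box.
    """
    used = set()
    matched = 0
    for gt in ground_truth:
        best_j, best_iou = -1, threshold
        for j, pred in enumerate(predictions):
            if j in used:
                continue
            value = iou(gt, pred)
            if value >= best_iou:
                best_j, best_iou = j, value
        if best_j >= 0:
            used.add(best_j)
            matched += 1
    return matched / len(ground_truth) if ground_truth else 0.0


# The same 3-pixel localization error on a small and a large target.
small = iou((0, 0, 10, 10), (3, 3, 13, 13))
large = iou((0, 0, 100, 100), (3, 3, 103, 103))
```

With these inputs, `small` is about 0.32 and `large` is about 0.89, so the same 3-pixel offset fails the 0.5 threshold on the 10-pixel box but passes easily on the 100-pixel box. This illustrates why small targets are the harder case for IoU-based metrics.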
Acknowledgement
This research was partially supported by the Inner Mongolia Autonomous Region Natural Science Foundation Key Project (No. 2024ZD27) and the Inner Mongolia Autonomous Region Social Science Foundation Commissioned Key Project (No. 2024WTZD02).
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Liu, D., Zhu, Y., Liu, R., Xing, Z., Geng, W., Wang, Y. (2025). MSD-YOLO: An Efficient Algorithm for Small Target Detection. In: Ide, I., et al. MultiMedia Modeling. MMM 2025. Lecture Notes in Computer Science, vol 15522. Springer, Singapore. https://doi.org/10.1007/978-981-96-2064-7_5
DOI: https://doi.org/10.1007/978-981-96-2064-7_5
Publisher Name: Springer, Singapore
Print ISBN: 978-981-96-2063-0
Online ISBN: 978-981-96-2064-7
eBook Packages: Computer Science, Computer Science (R0)