MSD-YOLO: An Efficient Algorithm for Small Target Detection

  • Conference paper
MultiMedia Modeling (MMM 2025)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 15522)

Abstract

In unmanned aerial vehicle (UAV) target detection, significant fluctuations in flight altitude cause large variation in the apparent size of the main subjects, which makes detection challenging, particularly for small targets. To address this, we propose MSD-YOLO, a novel architecture based on YOLO-v8. First, we design a new network for feature extraction and integration (MUBIFPN) that replaces the original Neck, enabling the model to fuse features more effectively. Second, we design a Feature Pyramid Pooling structure (SPPFCSPC-SM) that replaces the original SPPF and enlarges the receptive field of this stage. Finally, we adopt an advanced multi-dimensional perception detection head (DyHead) as the detection head of the network, significantly enhancing its expressive ability. Experiments on the VisDrone2019 dataset show that, compared to the original YOLO-v8n model, the proposed method improves recall by 4.7%, mAP50 by 5.8%, and mAP50-90 by 4.0%. Compared to the larger YOLO-v8s, it achieves slightly higher recall and accuracy while reducing the number of parameters by 58.6% and GFLOPs by 54.7%.
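The recall and mAP50 figures above both rest on matching predicted boxes to ground-truth boxes at an intersection-over-union (IoU) threshold of 0.5. The following is a minimal sketch of that matching step, not the paper's evaluation code: the names `Box`, `iou`, and `recall_at_iou` are hypothetical, and the sketch assumes a single class and ignores confidence ordering, which a full mAP evaluation would take into account.

```python
from typing import List, Tuple

# Hypothetical box format for this sketch: (x1, y1, x2, y2) with x1 < x2, y1 < y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Width and height of the overlap are clamped at zero for disjoint boxes.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def recall_at_iou(preds: List[Box], truths: List[Box], thr: float = 0.5) -> float:
    """Fraction of ground-truth boxes matched by a prediction with IoU >= thr.

    Simplification: greedy matching in ground-truth order, one prediction
    per ground-truth box, no class labels or confidence scores.
    """
    if not truths:
        return 0.0
    used = set()  # indices of predictions already assigned to a ground truth
    matched = 0
    for t in truths:
        best_j, best_iou = -1, thr
        for j, p in enumerate(preds):
            if j in used:
                continue
            v = iou(p, t)
            if v >= best_iou:
                best_j, best_iou = j, v
        if best_j >= 0:
            used.add(best_j)
            matched += 1
    return matched / len(truths)


if __name__ == "__main__":
    truths = [(0, 0, 10, 10), (20, 20, 30, 30)]
    preds = [(1, 1, 10, 10)]
    print(recall_at_iou(preds, truths))
```

In this toy case one of the two ground-truth boxes is covered by a prediction above the threshold and the other is missed entirely. Small targets are especially sensitive to this criterion, since a shift of a few pixels can push the IoU of a tiny box below 0.5.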

Acknowledgement

This research was partially supported by the Inner Mongolia Autonomous Region Natural Science Foundation Key Project (No. 2024ZD27) and the Inner Mongolia Autonomous Region Social Science Foundation Commissioned Key Project (No. 2024WTZD02).

Author information

Corresponding author

Correspondence to Dongyu Liu.

Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Liu, D., Zhu, Y., Liu, R., Xing, Z., Geng, W., Wang, Y. (2025). MSD-YOLO: An Efficient Algorithm for Small Target Detection. In: Ide, I., et al. MultiMedia Modeling. MMM 2025. Lecture Notes in Computer Science, vol 15522. Springer, Singapore. https://doi.org/10.1007/978-981-96-2064-7_5

  • DOI: https://doi.org/10.1007/978-981-96-2064-7_5

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-96-2063-0

  • Online ISBN: 978-981-96-2064-7

  • eBook Packages: Computer Science (R0)
