ARBFPN-YOLOv8: auxiliary reversible bidirectional feature pyramid network for UAV small target detection

  • Original Paper
Signal, Image and Video Processing

Abstract

Drones are widely used in fields such as agriculture, environmental protection, and public safety. In these applications, the ability to detect small targets often directly determines the effectiveness of drone image analysis. Because small targets occupy only a few pixels in the image, extracting their features is difficult, and traditional algorithms struggle to capture their details. Multi-scale feature fusion can improve detection capability, but feature loss and interference still occur after repeated sampling. To address this challenge, this paper proposes an architecture called the Auxiliary Reversible Bidirectional Feature Pyramid Network (ARBFPN). Its core design idea is to enhance the integrity of feature information by introducing auxiliary structures and to prevent feature loss during transmission by using residual connections, thereby preserving more of the detailed information that is crucial for small object detection at the feature extraction stage. In addition, the detection head is optimized with a detail enhancement mechanism and a gating mechanism, yielding a Lightweight Detail Enhanced Gated Head (LDEGH) that improves overall detection accuracy. To verify the effectiveness of the proposed architecture, experiments were conducted on the VisDrone2019 dataset. The experimental results show that the proposed method significantly outperforms existing state-of-the-art (SOTA) methods for small object detection in drone images.
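
The general idea of bidirectional multi-scale fusion with residual connections can be illustrated with a toy sketch. This is not the ARBFPN implementation, whose layer details are not given in the abstract. It assumes 1-D feature maps, pair averaging as a stand-in for strided convolution, and nearest-neighbour upsampling. The function names (`downsample`, `upsample`, `bidirectional_fuse`) are hypothetical and chosen only for illustration.

```python
def downsample(x):
    """Halve resolution by averaging adjacent pairs (assumed stand-in for a strided conv)."""
    return [(x[i] + x[i + 1]) / 2 for i in range(0, len(x) - 1, 2)]


def upsample(x):
    """Double resolution by nearest-neighbour repetition."""
    return [v for v in x for _ in range(2)]


def add(a, b):
    """Element-wise sum of two equally sized feature maps."""
    return [u + v for u, v in zip(a, b)]


def bidirectional_fuse(levels, residual=True):
    """Fuse feature maps ordered finest first, each half the length of the previous one."""
    # Top-down pass: coarse context flows to the finer levels.
    td = list(levels)
    for i in range(len(levels) - 2, -1, -1):
        td[i] = add(levels[i], upsample(td[i + 1]))
    # Bottom-up pass: fine detail flows back to the coarser levels.
    out = list(td)
    for i in range(1, len(levels)):
        out[i] = add(td[i], downsample(out[i - 1]))
    # Residual (skip) connection: re-add the untouched input at every level,
    # so the original fine detail is not diluted by repeated resampling.
    if residual:
        out = [add(o, x) for o, x in zip(out, levels)]
    return out


# A single strong response at index 2 of the finest level mimics a small target.
levels = [[0, 0, 8, 0, 0, 0, 0, 0], [1, 1, 1, 1], [2, 2]]
fused = bidirectional_fuse(levels)
plain = bidirectional_fuse(levels, residual=False)
print(fused[0][2], plain[0][2])
```

In this toy setting the skip connection keeps the small-target response at the finest level more prominent relative to its neighbours than fusion without it, which is the intuition behind using residual connections to limit feature loss.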

Data availability

No datasets were generated or analysed during the current study.

Acknowledgements

This work is partially supported by the NSFC (No. 61976006) and the NSF_AH (No. 2108085MF206).

Author information

Contributions

W. Bian made a significant intellectual contribution to the theoretical development and experimental design. F. Luo participated in the design of the prototype and the experiments and wrote the original manuscript. B. Jie assisted with the theoretical development, data analysis, and manuscript preparation. In addition, W. Bian and B. Jie reviewed the manuscript and carefully revised it for intellectual content. H. Dong and L. Fu participated in the analysis and interpretation of the data associated with the work. All authors have read and approved the final version of the article as accepted for publication, including the references.

Corresponding author

Correspondence to Weixin Bian.

Ethics declarations

Competing interest

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Luo, F., Bian, W., Jie, B. et al. ARBFPN-YOLOv8: auxiliary reversible bidirectional feature pyramid network for UAV small target detection. SIViP 19, 63 (2025). https://doi.org/10.1007/s11760-024-03661-9

  • DOI: https://doi.org/10.1007/s11760-024-03661-9