Abstract
Drones are widely used in fields such as agriculture, environmental protection, and public safety. In these applications, the ability to detect small targets often directly determines the effectiveness of drone image analysis. Because small targets occupy only a few pixels in an image, extracting their features is difficult, and traditional algorithms struggle to capture their details. Although multi-scale feature fusion can improve detection capability, feature loss and interference still occur after repeated sampling. To address this challenge, this paper proposes an architecture called the Auxiliary Reversible Bidirectional Feature Pyramid Network (ARBFPN). Its core design idea is to enhance the integrity of feature information by introducing auxiliary structures, and to prevent feature loss during transmission by using residual connections, thereby preserving more of the detail that is crucial for small object detection during feature extraction. In addition, a Lightweight Detail Enhanced Gated Head (LDEGH) is proposed, which optimizes the detection head with a detail enhancement mechanism and a gating mechanism to improve overall detection accuracy. To verify the effectiveness of the proposed architecture, experiments were conducted on the VisDrone2019 dataset. The experimental results show that the proposed method significantly outperforms existing state-of-the-art (SOTA) methods, advancing small object detection in drone images.
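The abstract names two generic ideas, a residual (identity) path that keeps fine detail from being lost during fusion and a gate that controls how much extra information is mixed in, but it does not give the equations of ARBFPN or LDEGH. The following is therefore only a toy one-dimensional sketch of those two ideas, not the authors' implementation; the function names, the nearest-neighbour upsampling, and the sigmoid gate are all illustrative assumptions.

```python
import math


def sigmoid(z):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def upsample_nearest(coarse, factor):
    """Repeat each coarse value `factor` times (1-D nearest-neighbour upsampling)."""
    return [v for v in coarse for _ in range(factor)]


def gated_residual_fuse(fine, coarse, gate_logits):
    """Hypothetical fusion of a fine feature row with a coarser one.

    out[i] = fine[i] + sigmoid(gate_logits[i]) * up[i]

    The `fine[i]` term is the residual (identity) path: whatever the gate
    does, the fine-scale values are passed through unchanged, so detail from
    the high-resolution level cannot be erased by the fusion step.
    """
    if not coarse or len(fine) % len(coarse) != 0:
        raise ValueError("fine length must be a multiple of coarse length")
    up = upsample_nearest(coarse, len(fine) // len(coarse))
    if len(gate_logits) != len(fine):
        raise ValueError("one gate logit is needed per fine position")
    return [f + sigmoid(g) * u for f, u, g in zip(fine, up, gate_logits)]


# A single "small target" response at position 1 of the fine level.
fine = [0.0, 5.0, 0.0, 0.0]
coarse = [1.0, 2.0]

# Gate almost closed: the output is essentially the fine row itself.
closed = gated_residual_fuse(fine, coarse, [-50.0] * 4)

# Gate half open: half of the upsampled coarse context is added on top.
half = gated_residual_fuse(fine, coarse, [0.0] * 4)
```

In a real detector the gate logits would be predicted by learned layers and the features would be multi-channel 2-D maps; the sketch only shows why an identity path keeps the small-target response intact regardless of the gate value.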
Data availability
No datasets were generated or analysed during the current study.
Acknowledgements
This work is partially supported by the NSFC (No. 61976006) and NSF_AH (No. 2108085MF206).
Author information
Contributions
W. Bian made a significant intellectual contribution to the theoretical development and experimental design. F. Luo participated in the prototype development and the design of the experiments, and wrote the original manuscript. B. Jie assisted with the theoretical development, data analysis, and manuscript preparation. W. Bian and B. Jie also reviewed the manuscript and carefully revised it for intellectual content. H. Dong and L. Fu participated in the analysis and interpretation of the data associated with the work in this article. All authors have read and approved the final version of the article as accepted for publication, including the references.
Ethics declarations
Competing interest
The authors declare no competing interests.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Luo, F., Bian, W., Jie, B. et al. ARBFPN-YOLOv8: auxiliary reversible bidirectional feature pyramid network for UAV small target detection. SIViP 19, 63 (2025). https://doi.org/10.1007/s11760-024-03661-9
DOI: https://doi.org/10.1007/s11760-024-03661-9