An efficient feature aggregation network for small object detection in UAV aerial images

The Journal of Supercomputing

Abstract

Unmanned aerial vehicles (UAVs) are highly mobile and have a wide field of view, so aerial images pose several challenges: a high proportion of small objects, significant variation in object size, object aggregation, and complex backgrounds. Existing object detection methods often overlook the texture information in high-level features, which is crucial for detecting small objects against complex backgrounds. To improve small object detection in complex scenes, we propose an efficient feature aggregation network (EFA-Net) based on YOLOv7. The backbone integrates a lightweight hybrid feature extraction module (LHFE), which replaces traditional convolutions with depthwise convolutions and employs a hybrid channel attention mechanism to capture local and global information concurrently. This design can reduce the number of parameters without sacrificing detection accuracy and enhances the network’s representational capacity. In the neck, we design an adaptive multi-scale feature fusion module (AMSFM) that improves the model’s adaptability to small objects and complex backgrounds by fusing multi-scale features with high-level semantic information and capturing the texture information in high-level features. Additionally, we incorporate a residual spatial pyramid pooling (RSPP) module to strengthen information fusion across receptive fields and reduce the interference of complex backgrounds with small object detection. To further improve the model’s robustness and generalization ability, we propose an enhanced complete intersection over union (ECIoU) loss function that balances the influence of large and small objects during training. Experimental results demonstrate the effectiveness of the proposed method, which achieves \(mAP_{50}\) scores of 51.6% and 48.5% and mAP scores of 29.6% and 29.5% on the VisDrone 2019 and UAVDT datasets, respectively.
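
The parameter saving that the abstract attributes to depthwise convolutions can be illustrated with simple arithmetic. The sketch below is not the LHFE module itself, whose exact layout is not given here. It assumes the common depthwise separable pattern (a k x k depthwise convolution followed by a 1 x 1 pointwise convolution), ignores bias terms, and uses hypothetical layer sizes and function names chosen only for illustration.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weight count of a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_conv_params(c_in: int, k: int) -> int:
    """Weight count of a k x k depthwise convolution: one filter per input channel."""
    return k * k * c_in


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Depthwise convolution plus a 1 x 1 pointwise convolution that mixes channels.

    The pointwise step is an assumption of this sketch, not a detail from the paper.
    """
    return depthwise_conv_params(c_in, k) + c_in * c_out


if __name__ == "__main__":
    # Hypothetical layer sizes, not taken from the paper.
    c_in, c_out, k = 128, 128, 3
    standard = standard_conv_params(c_in, c_out, k)
    separable = depthwise_separable_params(c_in, c_out, k)
    print(f"standard:  {standard} weights")
    print(f"separable: {separable} weights")
    print(f"ratio:     {separable / standard:.3f}")
```

Under these assumptions the ratio equals 1/c_out + 1/k^2, so for 3 x 3 kernels and wide layers the separable form needs roughly one ninth of the weights of a standard convolution.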

Author information

Contributions

X.L. wrote the main manuscript text. G.Z. suggested revisions to the manuscript, provided funding, and supported the experimental equipment. B.Z. suggested the structure of the manuscript and helped with typesetting. All authors reviewed the manuscript.

Corresponding author

Correspondence to Guangwei Zhang.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Consent to participate

Not applicable.

Consent for publication

Not applicable.

Code availability

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Liu, X., Zhang, G. & Zhou, B. An efficient feature aggregation network for small object detection in UAV aerial images. J Supercomput 81, 548 (2025). https://doi.org/10.1007/s11227-025-06987-4

  • DOI: https://doi.org/10.1007/s11227-025-06987-4
