IRE-YOLO: Infrared weak target detection algorithm based on the fusion of multi-scale receptive fields and efficient convolution

The Journal of Supercomputing

A Correction to this article was published on 01 April 2025

This article has been updated

Abstract

In recent years, the detection of small, dim infrared targets has played a crucial role in both military and civilian security applications, and deep learning-based methods have made remarkable progress in this area. However, detection performance is still limited by small target size, low signal-to-noise ratio, and complex backgrounds. This paper therefore proposes IRE-YOLO, an improved model based on You Only Look Once (YOLO), to enhance the detection accuracy of small targets. First, to improve the model’s feature extraction ability, a receptive field enhancement module based on dilated convolution and shared weights is proposed; by expanding the receptive field of the feature map, it extracts the detailed features and local information of multi-scale targets. Second, to address small target size and low image resolution, a space-to-depth convolution is added to the backbone network; by converting the spatial dimension into the depth dimension, it effectively captures the context information of small targets. In addition, to improve the model’s accuracy with respect to the true detection boxes, this paper proposes an SNS algorithm that effectively removes redundant detection boxes. IRE-YOLO is compared with other models on two public datasets, IDTA and SIRST. The experimental results show that, compared with the baseline YOLOv5s, IRE-YOLO increases the mean average precision (mAP) by 2% and 2.1%, respectively, significantly improving the detection accuracy of small, dim infrared targets.
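
To make two of the building blocks named in the abstract more concrete, the following is a minimal NumPy sketch, not the authors' implementation. It shows the generic space-to-depth rearrangement (spatial blocks folded into channels, so resolution drops without discarding pixels) and the standard formula for the span of a dilated kernel. The function names, the `scale` parameter, and the channel ordering are assumptions chosen for illustration; the paper's actual layer configuration is not given in this preview.

```python
import numpy as np


def space_to_depth(x: np.ndarray, scale: int = 2) -> np.ndarray:
    """Rearrange a (C, H, W) map into (C * scale**2, H // scale, W // scale).

    Hypothetical helper: every scale x scale spatial block is moved into the
    channel axis, so no pixel value is lost when the resolution is reduced.
    """
    c, h, w = x.shape
    if h % scale or w % scale:
        raise ValueError("height and width must be divisible by scale")
    # Split each spatial axis into (coarse position, offset inside the block).
    x = x.reshape(c, h // scale, scale, w // scale, scale)
    # Move the two in-block offsets in front of the channel axis.
    x = x.transpose(2, 4, 0, 1, 3)
    # Fold the offsets into the channel dimension.
    return x.reshape(c * scale * scale, h // scale, w // scale)


def effective_kernel_size(kernel: int, dilation: int) -> int:
    """Span covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel + (kernel - 1) * (dilation - 1)


if __name__ == "__main__":
    feature_map = np.arange(16).reshape(1, 4, 4)
    out = space_to_depth(feature_map)
    print(out.shape)                    # (4, 2, 2)
    print(effective_kernel_size(3, 3))  # 7
```

In a space-to-depth convolution, a rearrangement of this kind is typically followed by a non-strided convolution over the enlarged channel axis. Likewise, sharing one set of 3×3 weights across several dilation rates lets the same parameters cover spans of 3, 5, 7, and so on, which is the general idea behind a multi-scale receptive field module.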


Data Availability

No datasets were generated or analysed during the current study.

Change history

  • 27 March 2025

    The original online version of this article was revised: the Funding information section was missing from this article and should have read "1. National Natural Science Foundation of China (Grant No. 62261004) 2. Guangxi Distinguished Young Scholars Research Fund (Grant No. 2025GXNSFFA069014) 3. Guangxi Distinguished Young Scholars Research Fund (Grant No. 2023GXNSFFA026002)".

  • 01 April 2025

    A Correction to this paper has been published: https://doi.org/10.1007/s11227-025-07223-9


Funding

National Natural Science Foundation of China (Grant No. 62261004), Guangxi Distinguished Young Scholars Research Fund (Grant No. 2025GXNSFFA069014), Guangxi Distinguished Young Scholars Research Fund (Grant No. 2023GXNSFFA026002).

Author information

Contributions

Ma: Wrote the first draft of the paper, integrated the research results from the various sections and presented them clearly and coherently, and organized the structure of the paper (abstract, introduction, methods, results, discussion, and conclusion) to ensure that it is complete and well organized. Fan: Led the planning of the entire study and formulated the research questions and general ideas; based on the current state of the field and potential research gaps, defined the research directions and objectives and set up a framework for the subsequent work. Chen: After the first draft was completed, checked and polished its language, logical clarity, and formatting; improved the readability of the paper in terms of grammar, vocabulary, and sentence structure; and made many constructive suggestions for revision, which made the paper more accurate and concise. Li: Was mainly responsible for drawing the tables and figures throughout the paper.

Corresponding author

Correspondence to Xiangsuo Fan.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ma, Q., Fan, X., Chen, H. et al. IRE-YOLO: Infrared weak target detection algorithm based on the fusion of multi-scale receptive fields and efficient convolution. J Supercomput 81, 558 (2025). https://doi.org/10.1007/s11227-025-07057-5

  • DOI: https://doi.org/10.1007/s11227-025-07057-5
