IRE-YOLO: Infrared weak target detection algorithm based on the fusion of multi-scale receptive fields and efficient convolution

Ma, Qingxiao; Fan, Xiangsuo; Chen, Huajin; Li, Tingting

doi:10.1007/s11227-025-07057-5

Published: 28 February 2025

The Journal of Supercomputing, Volume 81, article number 558 (2025)

A Correction to this article was published on 01 April 2025

This article has been updated

Abstract

In recent years, the detection of small, dim infrared targets has played a crucial role in both military and civilian security applications, and deep learning-based methods have made remarkable progress in this area. However, detection is still limited by small target size, low signal-to-noise ratio, and complex backgrounds. This paper therefore proposes IRE-YOLO, an improved model based on You Only Look Once (YOLO), to enhance the detection accuracy for small targets. First, to improve the model's feature extraction ability, a receptive field enhancement module based on dilated convolution and shared weights is proposed; by expanding the receptive field of the feature map, it extracts the detailed features and local information of multi-scale targets. Second, to address small target size and low image resolution, a space-to-depth convolution is added to the backbone network; by converting the spatial dimension into the depth dimension, it effectively captures the context information of small targets. In addition, to improve the model's accuracy on true detection boxes, this paper proposes an SNS algorithm that removes redundant detection boxes. IRE-YOLO is evaluated against other models on two public datasets, IDTA and SIRST. The experimental results show that, compared with the baseline YOLOv5s, IRE-YOLO increases the mean average precision (mAP) by 2% and 2.1%, respectively, improving the detection accuracy for small and dim infrared targets.
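The two building blocks named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the channel ordering of the rearrangement, and the toy 4x4 input are assumptions made for illustration only. The first function shows how a space-to-depth step moves every pixel of a block into the channel axis, so that resolution is reduced without discarding information; the second gives the standard formula for the spatial extent of a dilated kernel, which is how dilation enlarges the receptive field without adding weights.

```python
import numpy as np


def space_to_depth(x: np.ndarray, block: int = 2) -> np.ndarray:
    """Rearrange each block x block spatial patch into channels.

    x has shape (C, H, W); the result has shape
    (C * block**2, H // block, W // block). Every input value is kept.
    The channel ordering used here is one possible convention (assumption).
    """
    c, h, w = x.shape
    if h % block or w % block:
        raise ValueError("H and W must be divisible by the block size")
    # Split H and W into (coarse index, offset inside the block).
    x = x.reshape(c, h // block, block, w // block, block)
    # Bring the two in-block offsets to the front, ahead of the channel axis.
    x = x.transpose(2, 4, 0, 1, 3)
    return x.reshape(c * block * block, h // block, w // block)


def effective_kernel_size(kernel: int, dilation: int) -> int:
    """Spatial extent covered by a single dilated convolution kernel."""
    return kernel + (kernel - 1) * (dilation - 1)


if __name__ == "__main__":
    feature = np.arange(16, dtype=np.float32).reshape(1, 4, 4)  # toy input
    packed = space_to_depth(feature, block=2)
    print(packed.shape)
    print(packed[:, 0, 0])  # the four pixels of the top-left 2x2 patch
    for d in (1, 2, 3):
        print("3x3 kernel, dilation", d, "covers", effective_kernel_size(3, d))
```

In a space-to-depth convolution, a rearrangement of this kind is typically followed by a non-strided convolution that mixes the enlarged channel dimension; that layer is omitted here to keep the sketch self-contained.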

Data Availability

No datasets were generated or analysed during the current study.

Change history

27 March 2025
The original online version of this article was revised: the Funding information section was missing from this article and should have read "1. National Natural Science Foundation of China (Grant No. 62261004) 2. Guangxi Distinguished Young Scholars Research Fund (Grant No. 2025GXNSFFA069014) 3. Guangxi Distinguished Young Scholars Research Fund (Grant No. 2023GXNSFFA026002)".
01 April 2025
A Correction to this paper has been published: https://doi.org/10.1007/s11227-025-07223-9

Funding

National Natural Science Foundation of China (Grant No. 62261004), Guangxi Distinguished Young Scholars Research Fund (Grant No. 2025GXNSFFA069014), Guangxi Distinguished Young Scholars Research Fund (Grant No. 2023GXNSFFA026002).

Author information

Authors and Affiliations

School of Automation, Guangxi University of Science and Technology, No. 19, Guantang Avenue, Yufeng District, Liuzhou, 545006, Guangxi, China
Qingxiao Ma, Xiangsuo Fan & Tingting Li
School of Electronic Engineering, Guangxi University of Science and Technology, No. 268, Donghuan Avenue, Chengzhong District, Liuzhou, 545006, Guangxi, China
Huajin Chen

Contributions

Ma: Wrote the first draft of the paper, integrated the research results from the various sections into a clear and logically coherent presentation, and organized the structure of the paper (abstract, introduction, methods, results, discussion, and conclusion) so that it is complete and well organized.
Fan: Led the planning of the entire study and proposed the research questions and general ideas. Based on the current status of the research area and its potential gaps, defined the research directions and objectives and set up the framework for the subsequent work.
Chen: After the first draft was completed, checked and polished the language, logical clarity, and formatting of the paper; improved its readability in terms of grammar, vocabulary, and sentence structure; and made many constructive suggestions for revision that made the paper more accurate and concise.
Li: Mainly responsible for drawing the tables and figures throughout the paper.

Corresponding author

Correspondence to Xiangsuo Fan.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ma, Q., Fan, X., Chen, H. et al. IRE-YOLO: Infrared weak target detection algorithm based on the fusion of multi-scale receptive fields and efficient convolution. J Supercomput 81, 558 (2025). https://doi.org/10.1007/s11227-025-07057-5

Accepted: 12 February 2025
Published: 28 February 2025
DOI: https://doi.org/10.1007/s11227-025-07057-5
