Abstract
Infrared and visible image fusion aims to obtain a more informative fused image by merging an infrared image with a visible image. However, existing methods have shortcomings such as loss of detail, unclear boundaries, and a lack of end-to-end training. In this paper, we propose an end-to-end network architecture for the infrared and visible image fusion task. Our network contains three essential parts: encoders, a residual fusion module, and a decoder. First, the infrared and visible images are fed to two separate encoders to extract shallow features. The two sets of features are then concatenated and passed to the residual fusion module, which extracts multi-scale features and fuses them thoroughly. Finally, the fused image is reconstructed by the decoder. We conduct objective and subjective experiments on two public datasets. Comparisons with state-of-the-art methods show that the fusion results of the proposed method achieve better objective metrics while preserving more detail and clearer boundaries.
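For illustration, the following is a minimal PyTorch-style sketch of the pipeline described above: two encoders, a residual fusion module applied to the concatenated features, and a decoder. The class name FusionNet, the channel width ch, the layer counts, and the activation choices are assumptions made only to give a runnable example; they do not reproduce the authors' exact configuration or their multi-scale design.

import torch
import torch.nn as nn

class FusionNet(nn.Module):
    """Sketch of the described pipeline: two shallow encoders,
    a residual fusion module, and a decoder (illustrative settings)."""
    def __init__(self, ch=16):
        super().__init__()
        # Separate shallow encoders for the infrared and visible inputs.
        self.enc_ir = nn.Sequential(
            nn.Conv2d(1, ch, 3, padding=1), nn.ReLU(inplace=True))
        self.enc_vis = nn.Sequential(
            nn.Conv2d(1, ch, 3, padding=1), nn.ReLU(inplace=True))
        # Fusion module with a residual (skip) connection over the
        # concatenated features; a stand-in for the multi-scale block.
        self.fuse = nn.Sequential(
            nn.Conv2d(2 * ch, 2 * ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(2 * ch, 2 * ch, 3, padding=1))
        # Decoder maps the fused features back to a single-channel image.
        self.dec = nn.Sequential(
            nn.Conv2d(2 * ch, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, 1, 3, padding=1), nn.Sigmoid())

    def forward(self, ir, vis):
        f = torch.cat([self.enc_ir(ir), self.enc_vis(vis)], dim=1)  # concatenate shallow features
        f = f + self.fuse(f)                                        # residual fusion
        return self.dec(f)                                          # reconstruct fused image

# Usage: fuse a pair of 256x256 single-channel images.
if __name__ == "__main__":
    net = FusionNet()
    ir = torch.rand(1, 1, 256, 256)
    vis = torch.rand(1, 1, 256, 256)
    fused = net(ir, vis)
    print(fused.shape)  # torch.Size([1, 1, 256, 256])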





