Abstract
Infrared and visible image fusion aims to obtain a more informative fused image by merging an infrared image with a visible image. However, existing methods have shortcomings such as loss of detail, unclear boundaries, and a lack of end-to-end training. In this paper, we propose an end-to-end network architecture for the infrared and visible image fusion task. Our network contains three essential parts: encoders, a residual fusion module, and a decoder. First, the infrared and visible images are fed to two separate encoders to extract shallow features. The two sets of features are then concatenated and passed to the residual fusion module, which extracts multi-scale features and fuses them thoroughly. Finally, the fused image is reconstructed by the decoder. We conduct objective and subjective experiments on two public datasets. Comparisons with state-of-the-art methods show that the fusion results of the proposed method achieve better objective metrics while preserving more detail and clearer boundaries.
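For illustration, the following is a minimal PyTorch-style sketch of the pipeline described above: two encoders, a residual fusion module applied to the concatenated features, and a decoder. The class name FusionNet, the channel width ch, the layer counts, and the activation choices are assumptions made only to give a runnable example; they do not reproduce the authors' exact configuration or their multi-scale design.

import torch
import torch.nn as nn

class FusionNet(nn.Module):
    """Sketch of the described pipeline: two shallow encoders,
    a residual fusion module, and a decoder (illustrative settings)."""
    def __init__(self, ch=16):
        super().__init__()
        # Separate shallow encoders for the infrared and visible inputs.
        self.enc_ir = nn.Sequential(
            nn.Conv2d(1, ch, 3, padding=1), nn.ReLU(inplace=True))
        self.enc_vis = nn.Sequential(
            nn.Conv2d(1, ch, 3, padding=1), nn.ReLU(inplace=True))
        # Fusion module with a residual (skip) connection over the
        # concatenated features; a stand-in for the multi-scale block.
        self.fuse = nn.Sequential(
            nn.Conv2d(2 * ch, 2 * ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(2 * ch, 2 * ch, 3, padding=1))
        # Decoder maps the fused features back to a single-channel image.
        self.dec = nn.Sequential(
            nn.Conv2d(2 * ch, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, 1, 3, padding=1), nn.Sigmoid())

    def forward(self, ir, vis):
        f = torch.cat([self.enc_ir(ir), self.enc_vis(vis)], dim=1)  # concatenate shallow features
        f = f + self.fuse(f)                                        # residual fusion
        return self.dec(f)                                          # reconstruct fused image

# Usage: fuse a pair of 256x256 single-channel images.
if __name__ == "__main__":
    net = FusionNet()
    ir = torch.rand(1, 1, 256, 256)
    vis = torch.rand(1, 1, 256, 256)
    fused = net(ir, vis)
    print(fused.shape)  # torch.Size([1, 1, 256, 256])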





