Infrared and visible image fusion using a feature attention guided perceptual generative adversarial network

  • Original Research
  • Journal of Ambient Intelligence and Humanized Computing

Abstract

In recent years, the performance of infrared and visible image fusion has improved dramatically with the adoption of deep learning techniques. However, the results are still not fully satisfactory, as fused images frequently suffer from blurred details, unenhanced vital regions, and artifacts. To address these problems, we develop a novel feature attention guided perceptual generative adversarial network (FAPGAN) for fusing infrared and visible images. In FAPGAN, a feature attention module is incorporated into the generator so that the fused image retains fine detail while highlighting the vital regions of the source images. The module consists of a spatial attention part and a pixel attention part: spatial attention enhances the vital regions, while pixel attention directs the network toward high-frequency information so that detail is preserved. Furthermore, we introduce a perceptual loss, combined with the adversarial and content losses, to optimize the generator. The perceptual loss drives the fused image to resemble the source infrared image at the semantic level, which not only preserves the vital targets and detailed information of the infrared image but also suppresses halo artifacts by reducing this semantic discrepancy. Experimental results on public datasets demonstrate that FAPGAN outperforms state-of-the-art approaches in both subjective visual quality and objective assessment.
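To make the abstract's two key components concrete, here is a minimal PyTorch sketch of a feature attention module with spatial and pixel attention branches. The paper does not publish its implementation here, so the layer widths, kernel sizes, and the ordering of the two branches below are illustrative assumptions rather than the authors' exact design.

```python
# Hedged sketch of a feature attention module (spatial + pixel attention).
# All architectural specifics here are assumptions for illustration.
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    """Weights spatial locations; intended to emphasize vital regions."""
    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Channel-wise average and max maps, in the style of CBAM.
        avg_map = x.mean(dim=1, keepdim=True)
        max_map = x.max(dim=1, keepdim=True).values
        attn = torch.sigmoid(self.conv(torch.cat([avg_map, max_map], dim=1)))
        return x * attn

class PixelAttention(nn.Module):
    """Per-pixel, per-channel weighting; intended to stress high-frequency detail."""
    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels // 2, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // 2, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * self.body(x)

class FeatureAttention(nn.Module):
    """Spatial attention followed by pixel attention (ordering assumed)."""
    def __init__(self, channels: int):
        super().__init__()
        self.spatial = SpatialAttention()
        self.pixel = PixelAttention(channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.pixel(self.spatial(x))
```

The combined generator objective (adversarial + content + perceptual) might likewise be sketched as below. The least-squares adversarial form, the intensity/gradient split of the content term, the loss weights, and the choice of VGG-16 relu3_3 features for the perceptual term are common conventions in GAN-based fusion, assumed here rather than taken from the paper.

```python
# Hedged sketch of the generator loss: adversarial + content + perceptual.
import torch
import torch.nn.functional as F
from torchvision.models import vgg16

# Frozen VGG-16 trunk up to relu3_3 as a perceptual feature extractor
# (ImageNet input normalization omitted for brevity).
_vgg = vgg16(weights="IMAGENET1K_V1").features[:16].eval()
for p in _vgg.parameters():
    p.requires_grad_(False)

def _gradients(x: torch.Tensor) -> torch.Tensor:
    # Finite-difference proxy for image gradients (high-frequency content).
    dx = x[..., :, 1:] - x[..., :, :-1]
    dy = x[..., 1:, :] - x[..., :-1, :]
    return torch.cat([dx.flatten(1), dy.flatten(1)], dim=1)

def generator_loss(fused, infrared, visible, d_fake,
                   lambda_content: float = 100.0,
                   lambda_perceptual: float = 1.0) -> torch.Tensor:
    # Adversarial term: push discriminator scores on fused images toward "real".
    adv = torch.mean((d_fake - 1.0) ** 2)
    # Content term: infrared intensity plus visible-image gradients.
    content = F.mse_loss(fused, infrared) + F.l1_loss(_gradients(fused),
                                                      _gradients(visible))
    # Perceptual term: semantic similarity to the infrared source,
    # measured in VGG feature space (single-channel inputs tiled to RGB).
    perc = F.mse_loss(_vgg(fused.repeat(1, 3, 1, 1)),
                      _vgg(infrared.repeat(1, 3, 1, 1)))
    return adv + lambda_content * content + lambda_perceptual * perc
```

In training, `generator_loss` would be minimized on each batch alongside a standard discriminator update; the two lambda weights are the natural knobs for trading detail retention against target saliency.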

Acknowledgements

This material is based on work supported by the Ministry of Trade, Industry & Energy (MOTIE, Korea) under the Industrial Technology Innovation Program (10080619).

Author information

Corresponding author

Correspondence to Hyunchul Shin.

Ethics declarations

Conflict of interest

The authors declare no conflicts of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Chen, Y., Zheng, W. & Shin, H. Infrared and visible image fusion using a feature attention guided perceptual generative adversarial network. J Ambient Intell Human Comput 14, 9099–9112 (2023). https://doi.org/10.1007/s12652-022-04414-7
