Infrared and visible image fusion based on NSCT and stacked sparse autoencoders

Multimedia Tools and Applications

Abstract

To integrate the infrared object into the fused image effectively, a novel infrared (IR) and visible (VI) image fusion method using the nonsubsampled contourlet transform (NSCT) and stacked sparse autoencoders (SSAE) is proposed. First, the IR and VI images are decomposed into low-frequency and high-frequency subbands using NSCT. Second, SSAE is applied to the low-frequency subband of the IR image to calculate the object reliabilities (OR) of the low-frequency subband coefficients. Subsequently, an adaptive multi-strategy fusion rule based on OR is designed for the fusion of the low-frequency subbands, and a choose-max fusion rule on the absolute values of the high-frequency subband coefficients is employed for the fusion of the high-frequency subbands. Experimental results show that the proposed method is superior to conventional methods in highlighting the infrared objects while preserving the background information in the VI image.
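
The two fusion stages named in the abstract can be illustrated with a minimal NumPy sketch. It assumes the NSCT subbands and the OR map have already been computed elsewhere and are passed in as arrays of equal shape. The function names are hypothetical. The choose-max rule follows the abstract directly, whereas `fuse_low_frequency` is only an illustrative OR-weighted average standing in for the adaptive multi-strategy rule, whose details are not given in the abstract.

```python
import numpy as np


def choose_max_abs(hf_ir: np.ndarray, hf_vi: np.ndarray) -> np.ndarray:
    """Choose-max rule for one pair of high-frequency subbands.

    At each position, keep the coefficient with the larger absolute value.
    Ties are resolved in favour of the IR coefficient (an arbitrary choice).
    """
    if hf_ir.shape != hf_vi.shape:
        raise ValueError("subbands must have the same shape")
    return np.where(np.abs(hf_ir) >= np.abs(hf_vi), hf_ir, hf_vi)


def fuse_low_frequency(lf_ir: np.ndarray, lf_vi: np.ndarray,
                       object_reliability: np.ndarray) -> np.ndarray:
    """Illustrative stand-in, NOT the paper's adaptive multi-strategy rule.

    Assumes the object reliability (OR) lies in [0, 1] and uses it as the
    per-coefficient weight of the IR low-frequency subband.
    """
    if not (lf_ir.shape == lf_vi.shape == object_reliability.shape):
        raise ValueError("inputs must have the same shape")
    w = np.clip(object_reliability, 0.0, 1.0)
    return w * lf_ir + (1.0 - w) * lf_vi
```

With this weighting, coefficients judged likely to belong to an infrared object (OR near 1) are taken mostly from the IR subband, and background coefficients (OR near 0) mostly from the VI subband.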

Acknowledgements

This work was supported by the National Natural Science Foundation of P. R. China under grant no. 61772237, the Provincial research grants no. BK20151358 and BK20151202, the Suzhou science and technology project under grant SYG201702, the Fundamental Research Funds for the Central Universities under grant JUSRP51618B, and the Equipment Development and Ministry of Education union fund under grant 6141A02033312.

Author information

Corresponding author

Correspondence to Zhancheng Zhang.

About this article

Cite this article

Luo, X., Li, X., Wang, P. et al. Infrared and visible image fusion based on NSCT and stacked sparse autoencoders. Multimed Tools Appl 77, 22407–22431 (2018). https://doi.org/10.1007/s11042-018-5985-6

  • DOI: https://doi.org/10.1007/s11042-018-5985-6
