Abstract
To integrate infrared objects into the fused image effectively, a novel infrared (IR) and visible (VI) image fusion method based on the nonsubsampled contourlet transform (NSCT) and stacked sparse autoencoders (SSAE) is proposed. First, the IR and VI images are each decomposed into low-frequency and high-frequency subbands using NSCT. Second, an SSAE is applied to the low-frequency subband of the IR image to calculate the object reliabilities (OR) of the low-frequency subband coefficients. Subsequently, an adaptive multi-strategy fusion rule based on OR is designed for fusing the low-frequency subbands, and a choose-max rule on the absolute values of the high-frequency subband coefficients is employed for fusing the high-frequency subbands. Experimental results show that the proposed method is superior to conventional methods in highlighting infrared objects while preserving the background information of the VI image.
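The two fusion rules named in the abstract can be illustrated with a minimal sketch. It assumes the NSCT subbands and the OR map are already available as NumPy arrays; the decomposition and the SSAE are not implemented here. The choose-max rule follows the abstract directly. The low-frequency rule is only a hypothetical stand-in for the adaptive multi-strategy rule, whose details the abstract does not give: the function names and the thresholds `t_low` and `t_high` are assumptions made for illustration.

```python
import numpy as np

def fuse_high(ir_high, vi_high):
    """Choose-max rule: keep the coefficient with the larger absolute value."""
    return np.where(np.abs(ir_high) >= np.abs(vi_high), ir_high, vi_high)

def fuse_low(ir_low, vi_low, reliability, t_low=0.3, t_high=0.7):
    """Hypothetical OR-driven rule (thresholds are assumptions, not from the paper).

    reliability >= t_high : take the IR coefficient (likely object)
    reliability <= t_low  : take the VI coefficient (likely background)
    otherwise             : blend the two, weighted by reliability
    """
    blended = reliability * ir_low + (1.0 - reliability) * vi_low
    fused = np.where(reliability >= t_high, ir_low, blended)
    return np.where(reliability <= t_low, vi_low, fused)

# Toy 2x2 subbands standing in for NSCT output.
ir_high = np.array([[0.5, -2.0], [0.1, 0.0]])
vi_high = np.array([[-1.0, 1.5], [0.1, 0.3]])
fused_high = fuse_high(ir_high, vi_high)

ir_low = np.array([[10.0, 20.0], [30.0, 40.0]])
vi_low = np.zeros((2, 2))
reliability = np.array([[0.9, 0.5], [0.1, 0.7]])  # toy OR map in [0, 1]
fused_low = fuse_low(ir_low, vi_low, reliability)
```

In a full pipeline, the fused subbands would then be passed to the inverse NSCT to reconstruct the fused image.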
Acknowledgements
This work was supported by the National Natural Science Foundation of P. R. China under grant no. 61772237, the Provincial research grants no. BK20151358 and BK20151202, the Suzhou science and technology project under grant SYG201702, the Fundamental Research Funds for the Central Universities under grant JUSRP51618B, and the Equipment Development and Ministry of Education union fund under grant 6141A02033312.
Cite this article
Luo, X., Li, X., Wang, P. et al. Infrared and visible image fusion based on NSCT and stacked sparse autoencoders. Multimed Tools Appl 77, 22407–22431 (2018). https://doi.org/10.1007/s11042-018-5985-6
DOI: https://doi.org/10.1007/s11042-018-5985-6