Multifocus image fusion using convolutional neural networks in the discrete wavelet transform domain

Wang, Zeyu; Li, Xiongfei; Duan, Haoran; Zhang, Xiaoli; Wang, Hancheng

doi:10.1007/s11042-019-08070-6

Multifocus image fusion using convolutional neural networks in the discrete wavelet transform domain

Published: 23 August 2019

Volume 78, pages 34483–34512, (2019)
Cite this article

Multimedia Tools and Applications Aims and scope Submit manuscript

Zeyu Wang^1,2,
Xiongfei Li^1,3,
Haoran Duan⁴,
Xiaoli Zhang ORCID: orcid.org/0000-0001-8412-4956^1,3 &
…
Hancheng Wang²

1026 Accesses
26 Citations
Explore all metrics

Abstract

In this paper, a novel multifocus image fusion algorithm based on the convolutional neural network (CNN) in the discrete wavelet transform (DWT) domain is proposed. The algorithm combines the advantages of spatial domain- and transform domain-based methods. The CNN is used to amplify features and generate different decision maps for different frequency subbands instead of image blocks or source images. In addition, the CNN, which can be seen as an adaptive fusion rule, replaces the traditional fusion rules. The proposed algorithm includes the following steps: first, we decompose each source image into one low frequency subband and several high frequency subbands using the DWT; second, these frequency subbands are used as input to the CNN to generate weight maps. To obtain a more accurate decision map, it is refined by a series of postprocessing operations, including the sum-modified-Laplacian (SML) and guided filter (GF). According to their decision maps, the frequency subbands are fused; finally, the fused image can be obtained using the inverse DWT. The experimental results show that our algorithm is superior to other algorithms.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Image Fusion and Super-Resolution with Convolutional Neural Network

Multifocus image fusion using a convolutional elastic network

Article 02 October 2021

MFIF-DWT-CNN: Multi-focus ımage fusion based on discrete wavelet transform with deep convolutional neural network

Article 26 June 2023

References

Acerbi-Junior FW, Clevers JGPW, Schaepman ME (2006) The assessment of multi-sensor image fusion using wavelet transforms for mapping the Brazilian Savanna. Int J Appl Earth Obs Geoinf 8(4):278–288
Article Google Scholar
Amin-Naji M, Aghagolzadeh A (2018) Multi-focus image fusion in DCT domain using variance and energy of Laplacian and correlation coefficient for visual sensor networks. Journal of AI and Data Mining 6(2):233–250
Google Scholar
Anderson CH (1988) Filter-subtract-decimate hierarchical pyramid signal analyzing and synthesizing technique. US
Bertinetto L, Valmadre J, Henriques JF, Vedaldi A, Torr PHS (2016) Fully-Convolutional Siamese Networks for Object Tracking. Computer Vision - Eccv 2016 Workshops. Pt Ii 9914:850–865
Google Scholar
Burt PJ, Adelson EH, Fischler MA, Firschein O (1987) The Laplacian pyramid as a compact image code, Morgan Kaufmann, San Francisco
Google Scholar
Du CB, Gao SS (2017) Image segmentation-based multi-focus image fusion through multi-scale convolutional neural network. IEEE Access 5:15750–15761
Article Google Scholar
Fan D-P, Gong C, Cao Y, Ren B, Cheng M-M, Borji A Enhanced-alignment measure for binary foreground map evaluation. arXiv:180510421
Fan D-P, Cheng M-M, Liu Y, Borji LT (2017) A Structure-measure: A new way to evaluate foreground maps. In: Proceedings of the IEEE international conference on computer vision. pp 4548–4557
Fan D-P, Cheng M-M, Liu J-J, Gao S-H, Borji HQ (2018) A salient objects in clutter: Bringing salient object detection to the foreground. In: Proceedings of the European conference on computer vision (ECCV). pp 186–202
Chapter Google Scholar
Fan D-P, Zhang S, Wu Y-H, Cheng M-M, Ren B, Ji R, Rosin PL (2018) Face sketch synthesis style similarity: a new structure co-occurrence texture measure. arXiv:180402975
Fan D-P, Wang W, Cheng M-M, Shen J (2019) Shifting more attention to video salient object detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp 8554–8564
Farfade SS, Saberian M, Li LJ (2015) Multi-view face detection using deep convolutional neural networks. Icmr’15: Proceedings of the 2015 ACM international conference on multimedia retrieval: 643–650
Gao Z, Wang D, Xue Y, Xu G, Zhang H, Wang Y (2018) 3D object recognition based on pairwise Multi-view Convolutional Neural Networks. J Vis Commun Image Represent 56:305–315
Article Google Scholar
Gao Z, Xuan H -Z, Zhang H, Wan S, Choo K-KR (2019) Adaptive fusion and category-level dictionary learning model for multi-view human action recognition. IEEE Internet of Things Journal
Guo X P, Nie RC, Cao JD, Zhou DM, Qian WH (2018) Fully Convolutional Network-Based Multifocus Image Fusion. Neural Comput 30(7):1775–1800
Article MathSciNet Google Scholar
Gutman I, Zhou B (2006) Laplacian energy of a graph. Linear Algebra Appl 414(1):29–37
Article MathSciNet MATH Google Scholar
Hareeta M, Mahendra K, Anurag P (2016) image fusion based on the modified curvelet transform. Smart Trends in Information Technology and Computer Communications. Smartcom 2016(628):111–118
Google Scholar
He KM, Sun J, Tang XO (2013) Guided image filtering. IEEE Trans Pattern Anal Mach Intell 35(6):1397–1409
Article Google Scholar
Holzinger A From machine learning to explainable AI. In: 2018 world symposium on digital intelligence for systems and machines (DISA). IEEE, pp 55–66
Hong C, Yu J, Tao D, Wang M (2014) Image-based three-dimensional human pose recovery by multiview locality-sensitive sparse retrieval. IEEE Trans Ind Electron 62(6):3742–3751
Google Scholar
Hong C, Yu J, Wan J, Tao D, Wang M (2015) Multimodal deep autoencoder for human pose recovery. IEEE Trans Image Process 24(12):5659–5670
Article MathSciNet MATH Google Scholar
Hou Q, Cheng M-M, Liu J, Torr PH (2018) Webseg: Learning semantic segmentation from web searches. arXiv:180309859
Hu Y-T, Huang J-B, Schwing AG (2018) Unsupervised video object segmentation using motion saliency-guided spatio-temporal propagation. In: Proceedings of the European conference on computer vision (ECCV). pp 786–802
Chapter Google Scholar
Jia Y, Shelhamer E, Donahue J, Karayev S, Long J, Girshick R, Guadarrama S, Darrell T Caffe: Convolutional architecture for fast feature embedding. Paper presented at the proceedings of the 22nd ACM international conference on Multimedia, Orlando, Florida, USA
Jin X, Hou J Y, Nie RC, Yao SW, Zhou DM, Jiang Q, He KJ (2018) A lightweight scheme for multi-focus image fusion. Multimed Tools Appl 77 (18):23501–23527
Article Google Scholar
Kong J, Zheng K, Zhang J, Feng X (2008) Multi-focus image fusion using spatial frequency and genetic algorithm. International Journal of Computer Science & Network Security 2:220–224
Google Scholar
Krizhevsky A, Sutskever I, Hinton G E (2017) ImageNet classification with deep convolutional neural networks. Commun ACM 60(6):84–90
Article Google Scholar
Lee K, Ji S (2015) Multi-focus image fusion using energy of image gradient and gradual boundary smoothing. Tencon 2015 - 2015 IEEE Region 10 Conference
Lewis JJ, O’Callaghan RJ, Nikolov SG, Bull DR, Canagarajah N (2007) Pixel- and region-based image fusion with complex wavelets. Information Fusion 8(2):119–130
Article Google Scholar
Li K, He F, Yu H, Chen X A parallel and robust object tracking approach synthesizing adaptive Bayesian learning and improved incremental subspace learning. Frontiers of Computer Science:1–20
Li ST, Yang B (2008) Multifocus image fusion using region segmentation and spatial frequency. Image Vis Comput 26(7):971–979
Article Google Scholar
Li ZH, Jing ZL, Liu G, Sun SY, Leung H (2003) Pixel visibility based multifocus image fusion. Proceedings of 2003, International Conference on Neural Networks & Signal Processing, Proceedings, Vols 1 and 2:1050–1053
Li S, Kang XD, Hu JW, Yang B (2013) Image matting for fusion of multi-focus images in dynamic scenes. Information Fusion 14(2):147–162
Article Google Scholar
Li ST, Kang XD, Hu JW (2013) Image fusion with guided filtering. IEEE Trans Image Process 22(7):2864–2875
Article Google Scholar
Li K, He F-z, H-p Y u, Chen X (2017) A correlative classifiers approach based on particle filter and sample set for tracking occluded target. Applied Mathematics-A Journal of Chinese Universities 32(3):294–312
Article MathSciNet Google Scholar
Liu Z, Blasch E, Xue ZY, Zhao JY, Laganiere R, Wu W (2012) Objective assessment of multiresolution image fusion algorithms for context enhancement in night vision: A comparative study. IEEE Trans Pattern Anal Mach Intell 34(1):94–109
Article Google Scholar
Liu Y, Liu SP, Wang ZF (2015) Multi-focus image fusion with dense SIFT. Information Fusion 23:139–155
Article Google Scholar
Liu Y, Chen X, Peng H, Wang ZF (2017) Multi-focus image fusion with a deep convolutional neural network. Information Fusion 36:191–207
Article Google Scholar
Liu Y, Chen X, Wang ZF, Wang ZJ, Ward RK, Wang XS (2018) Deep learning for pixel-level image fusion: Recent advances and future prospects. Information Fusion 42:158–173
Article Google Scholar
Liu Y, Cheng M-M, Bian J, Zhang L, Jiang P-T, Cao Y (2018) Semantic edge detection with diverse deep supervision. arXiv:180402864
Liu Y, Fan DP, Nie GY, Zhang X, Cheng MM (2019) DNA: Deeply-supervised nonlinear aggregation for salient object detection
Long J, Shelhamer E, Darrell T (2015) Fully convolutional networks for semantic segmentation. 2015 IEEE conference on computer vision and pattern recognition (CVPR):3431-3440
Lv X, He F, Yan X, Wu Y, Cheng Y (2019) Integrating selective undo of feature-based modeling operations for real-time collaborative CAD systems. Futur Gener Comput Syst 100:473–497
Article Google Scholar
Margolin R, Zelnik-Manor L, Tal A (2014) How to evaluate foreground maps? In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp 248–255
Nie G-Y, Cheng M-M, Liu Y, Liang Z, Fan D-P, Liu Y, Wang Y (2019) Multi-level context ultra-aggregation for stereo matching. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp 3283–3291
Norouzi M, Fleet DJ, Salakhutdinov R (2018) Hamming distance metric learning. Adv Neural Inf Proces Syst 2:1061–1069
Google Scholar
Pajares G, de la Cruz JM (2004) A wavelet-based image fusion tutorial. Pattern Recogn 37(9):1855–1872
Article Google Scholar
Pan Y, He F, Yu H A correlative denoising autoencoder to model social influence for top-n recommender system. Frontiers of Computer Science
Pan Y, He F, Yu H (2019) A novel enhanced collaborative autoencoder with knowledge distillation for top-N recommender systems. Neurocomputing 332:137–148
Article Google Scholar
Piella G (2008) New quality measures for image fusion. Astronomische Nachrichten 173(16-17):267–268
Google Scholar
Qu XB (2009) Matlab image fusion toolbox for sum-modified-laplacian-based multifocus image fusion method in cycle spinning sharp frequency localized contourlet transform. Opt Precis Eng 17(5):1203–1212
Google Scholar
Sermanet P, Eigen D, Zhang X, Mathieu M, Fergus R, Lecun Y OverFeat: Integrated recognition, localization and detection using convolutional networks. Eprint Arxiv
Tang H, Xiao B, Li W S, Wang G Y (2018) Pixel convolutional neural network for multi-focus image fusion. Inf Sci 433:125–141
Article MathSciNet Google Scholar
Wei H, Jing ZL (2007) Evaluation of focus measures in multi-focus image fusion. Pattern Recogn Lett 28(4):493–500
Article Google Scholar
Wu Y, He F, Zhang D, Li X (2015) Service-oriented feature-based data exchange for cloud-based design and manufacturing. IEEE Trans Serv Comput 11 (2):341–353
Article Google Scholar
Wu Z, Huang Y, Zhang K (2018) Remote sensing image fusion method based on PCA and curvelet transform. J Indian Soc Remote Sens 3:1–9
Google Scholar
Xiao-Bo QU, Yan JW, Yang GD (2009) Multifocus image fusion method of sharp frequency localized Contourlet transform domain based on sum-modified-Laplacian. Opt Precis Eng 17(5):1203–1212
Google Scholar
Xu KP, Qin Z, Wang GL, Zhang HD, Huang K, Ye SX (2018) Multi-focus Image Fusion using Fully Convolutional Two-stream Network for Visual Sensors. Ksii T Internet Inf 12(5):2253–2272
Google Scholar
Yin M, Duan PH, Liu W, Liang XY (2017) A novel infrared and visible image fusion algorithm based on shift-invariant dual-tree complex shearlet transform and sparse representation. Neurocomputing 226:182–191
Article Google Scholar
Yu J, Rui Y, Tao D (2014) Click prediction for web image reranking using multimodal sparse coding. IEEE Trans Image Process 23(5):2019–2032
Article MathSciNet MATH Google Scholar
Yu J, Tao D, Wang M, Rui Y (2014) Learning to rank using user clicks and visual features for image retrieval. IEEE Transactions on Cybernetics 45(4):767–779
Article Google Scholar
Yu J, Yang X, Gao F, Tao D (2016) Deep multimodal distance metric learning using click constraints for image ranking. IEEE Transactions on Cybernetics 47(12):4014–4024
Article Google Scholar
Zhang Q, Guo BL (2009) Multifocus image fusion using the nonsubsampled contourlet transform. Signal Process 89(7):1334–1346
Article MATH Google Scholar
Zhang Q, Wang L, Li H J, Ma ZK (2011) Similarity-based multimodality image fusion with shiftable complex directional pyramid. Pattern Recogn Lett 32 (13):1544–1553
Article Google Scholar
Zhao H, Li Q, Feng HJ (2008) Multi-focus color image fusion in the HSI space using the sum-modified-laplacian and a coarse edge map. Image Vis Comput 26 (9):1285–1295
Article Google Scholar
Zhang J, Wang M, Lin L, Yang X, Gao J, Rui Y (2017) Saliency detection on light field: A multi-cue approach. ACM Trans Multimed Comput Commun Appl (TOMM) 13(3):32
Google Scholar
Zhao J -X, Cao Y, Fan D-P, Cheng M-M, Li X-Y, Zhang L (2019) Contrast prior and fluid pyramid integration for RGBD salient object detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR)
Zhou Z, Li S, Wang B (2014) Multi-scale weighted gradient-based fusion for multi-focus images. Information Fusion 20:60–72
Article Google Scholar

Download references

Acknowledgements

The work was supported by National Science & Technology Pillar Program of China (Grant NO.2012BAH48F02), National Natural Science Foundation of China (Grant No.61801190), Nature Science Foundation of Jilin Province (Grant No.20180101055JC), Outstanding Young Talent Foundation of Jilin Province (Grant No.20180520029JH), China Postdoctoral Science Foundation (Grant No.2017M611323), Industrial Technology Research and Development Funds of Jilin province (2019C054-3), and the Fundamental Research Funds for the Central Universities, JLU.

Author information

Authors and Affiliations

Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, 130012, China
Zeyu Wang, Xiongfei Li & Xiaoli Zhang
College of Software, Jilin University, Changchun, 130012, China
Zeyu Wang & Hancheng Wang
College of Computer Science and Technology, Jilin University, Changchun, 130012, China
Xiongfei Li & Xiaoli Zhang
School of Computing, Newcastle University, Newcastle upon Tyne, NE1 7RU, UK
Haoran Duan

Authors

Zeyu Wang
View author publications
You can also search for this author in PubMed Google Scholar
Xiongfei Li
View author publications
You can also search for this author in PubMed Google Scholar
Haoran Duan
View author publications
You can also search for this author in PubMed Google Scholar
Xiaoli Zhang
View author publications
You can also search for this author in PubMed Google Scholar
Hancheng Wang
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Xiaoli Zhang.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Wang, Z., Li, X., Duan, H. et al. Multifocus image fusion using convolutional neural networks in the discrete wavelet transform domain. Multimed Tools Appl 78, 34483–34512 (2019). https://doi.org/10.1007/s11042-019-08070-6

Download citation

Received: 23 December 2018
Revised: 05 July 2019
Accepted: 02 August 2019
Published: 23 August 2019
Issue Date: December 2019
DOI: https://doi.org/10.1007/s11042-019-08070-6

Keywords

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Multifocus image fusion using convolutional neural networks in the discrete wavelet transform domain

Abstract

Access this article

Similar content being viewed by others

Image Fusion and Super-Resolution with Convolutional Neural Network

Multifocus image fusion using a convolutional elastic network

MFIF-DWT-CNN: Multi-focus ımage fusion based on discrete wavelet transform with deep convolutional neural network

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Additional information

Publisher’s note

Rights and permissions

About this article

Cite this article

Keywords

Navigation

Multifocus image fusion using convolutional neural networks in the discrete wavelet transform domain

Abstract

Access this article

Similar content being viewed by others

Image Fusion and Super-Resolution with Convolutional Neural Network

Multifocus image fusion using a convolutional elastic network

MFIF-DWT-CNN: Multi-focus ımage fusion based on discrete wavelet transform with deep convolutional neural network

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Additional information

Publisher’s note

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation