
Multifocus image fusion using convolutional neural networks in the discrete wavelet transform domain

Multimedia Tools and Applications

Abstract

In this paper, a novel multifocus image fusion algorithm based on a convolutional neural network (CNN) in the discrete wavelet transform (DWT) domain is proposed. The algorithm combines the advantages of spatial-domain and transform-domain methods. The CNN is used to amplify features and to generate a separate decision map for each frequency subband, rather than for image blocks or the source images. In addition, the CNN can be seen as an adaptive fusion rule that replaces the traditional fusion rules. The proposed algorithm consists of four steps. First, each source image is decomposed into one low-frequency subband and several high-frequency subbands using the DWT. Second, these subbands are fed to the CNN to generate weight maps; to obtain more accurate decision maps, the weight maps are refined by a series of postprocessing operations, including the sum-modified-Laplacian (SML) and the guided filter (GF). Third, the frequency subbands are fused according to their decision maps. Finally, the fused image is obtained using the inverse DWT. The experimental results show that the proposed algorithm outperforms the algorithms it is compared with.
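
The four steps in the abstract can be sketched in a few lines of NumPy. This is only an illustrative sketch under several assumptions, not the authors' implementation: it uses a single-level Haar wavelet (the abstract does not name the wavelet or the number of levels), it substitutes a plain SML focus measure for the trained CNN when building each subband's decision map, and it omits the guided-filter refinement. The function names `haar_dwt2`, `haar_idwt2`, `sml`, `fuse_band`, and `fuse` are hypothetical.

```python
import numpy as np

def haar_dwt2(img):
    """Single-level 2-D Haar DWT (assumed wavelet); needs even height and width."""
    a, b = img[0::2, 0::2], img[0::2, 1::2]
    c, d = img[1::2, 0::2], img[1::2, 1::2]
    ll = (a + b + c + d) / 2.0           # low-frequency subband
    lh = (a + b - c - d) / 2.0           # three high-frequency (detail) subbands
    hl = (a - b + c - d) / 2.0
    hh = (a - b - c + d) / 2.0
    return ll, (lh, hl, hh)

def haar_idwt2(ll, highs):
    """Inverse of haar_dwt2."""
    lh, hl, hh = highs
    out = np.empty((2 * ll.shape[0], 2 * ll.shape[1]), dtype=float)
    out[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    out[0::2, 1::2] = (ll + lh - hl - hh) / 2.0
    out[1::2, 0::2] = (ll - lh + hl - hh) / 2.0
    out[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return out

def sml(band, radius=1):
    """Sum-modified-Laplacian over a (2*radius+1)^2 window, edge-padded."""
    p = np.pad(band, 1, mode="edge")
    center = p[1:-1, 1:-1]
    ml = (np.abs(2 * center - p[:-2, 1:-1] - p[2:, 1:-1])
          + np.abs(2 * center - p[1:-1, :-2] - p[1:-1, 2:]))
    q = np.pad(ml, radius, mode="edge")
    h, w = band.shape
    out = np.zeros_like(ml)
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            out += q[dy:dy + h, dx:dx + w]
    return out

def fuse_band(x, y):
    """Binary decision map per subband. ASSUMPTION: SML stands in for the CNN."""
    decision = sml(x) >= sml(y)
    return np.where(decision, x, y)

def fuse(img_a, img_b):
    """Decompose, fuse each subband by its own decision map, then invert."""
    ll_a, hi_a = haar_dwt2(np.asarray(img_a, dtype=float))
    ll_b, hi_b = haar_dwt2(np.asarray(img_b, dtype=float))
    ll = fuse_band(ll_a, ll_b)
    highs = tuple(fuse_band(x, y) for x, y in zip(hi_a, hi_b))
    return haar_idwt2(ll, highs)
```

Calling `fuse(img_a, img_b)` on two registered grayscale arrays of equal, even size returns a fused array of the same shape. The point of the sketch is the structure: one decision map per frequency subband rather than one per image block or per source image.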

Acknowledgements

This work was supported by the National Science & Technology Pillar Program of China (Grant No. 2012BAH48F02), the National Natural Science Foundation of China (Grant No. 61801190), the Natural Science Foundation of Jilin Province (Grant No. 20180101055JC), the Outstanding Young Talent Foundation of Jilin Province (Grant No. 20180520029JH), the China Postdoctoral Science Foundation (Grant No. 2017M611323), the Industrial Technology Research and Development Funds of Jilin Province (Grant No. 2019C054-3), and the Fundamental Research Funds for the Central Universities, JLU.

Author information

Corresponding author

Correspondence to Xiaoli Zhang.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Wang, Z., Li, X., Duan, H. et al. Multifocus image fusion using convolutional neural networks in the discrete wavelet transform domain. Multimed Tools Appl 78, 34483–34512 (2019). https://doi.org/10.1007/s11042-019-08070-6

  • DOI: https://doi.org/10.1007/s11042-019-08070-6
