Efficient fused convolution neural network (EFCNN) for feature level fusion of medical images


Abstract

This paper proposes an Efficient Fused Convolution Neural Network (EFCNN) for feature-level fusion of medical images. The proposed network architecture leverages the strengths of both deep Convolution Neural Networks (CNNs) and fusion techniques to achieve improved efficiency in medical image fusion. Fusion of CT and MRI images can help medical professionals make more informed diagnoses, plan more effective treatments, and ultimately improve patient outcomes. Many researchers are currently working to develop efficient medical image fusion techniques. To contribute to this field, the authors fuse images at the feature level using the Bilateral Activation Mechanism (BAM) for feature extraction and a softmax-based Soft Attention (SA) fusion rule. The EFCNN model uses a two-stream CNN architecture to process the input images, which are then fused at the feature level using an attention mechanism. The proposed approach is evaluated on the Whole Brain Atlas Harvard dataset. The EFCNN model demonstrated superior performance on several performance indices, including ISSIM, MI, and PSNR, with respective values of 0.41, 4.42, and 57.21 when SA was utilized. Furthermore, the proposed model exhibited favourable performance in terms of Spatial Frequency, Average Gradient, and Edge-intensity, with corresponding values of 57.3, 16.83, and 157.72 on a medical dataset when EFCNN was applied without SA fusion; however, subjective evaluation indicated that images were improved with SA fusion. These results indicate that the EFCNN model surpasses state-of-the-art methods. An exhaustive ablation study was conducted to investigate the efficacy of the proposed model, which further confirmed its accuracy. The significance of this work lies in its potential implications for medical diagnosis and treatment planning, where precise and efficient image analysis is crucial.
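
As a rough illustration of the pipeline the abstract describes (two feature-extraction streams followed by softmax-based soft-attention fusion and reconstruction), below is a hypothetical PyTorch-style sketch. The layer sizes, module names, and attention formulation are assumptions for the example, not the authors' exact EFCNN architecture.

```python
# Hypothetical two-stream CNN with softmax-based soft-attention feature fusion.
# All layer sizes and names are illustrative assumptions, not the published EFCNN.
import torch
import torch.nn as nn


class FeatureStream(nn.Module):
    """One CNN branch that extracts feature maps from a single modality."""
    def __init__(self, channels: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.net(x)


class SoftAttentionFusion(nn.Module):
    """Fuse two feature maps with per-pixel softmax attention weights."""
    def __init__(self, channels: int = 32):
        super().__init__()
        self.score = nn.Conv2d(channels, 1, 1)  # one scalar score per location

    def forward(self, f_ct, f_mri):
        scores = torch.cat([self.score(f_ct), self.score(f_mri)], dim=1)
        w = torch.softmax(scores, dim=1)            # weights sum to 1 at each pixel
        return w[:, :1] * f_ct + w[:, 1:] * f_mri   # weighted feature-level fusion


class TwoStreamFusionNet(nn.Module):
    def __init__(self, channels: int = 32):
        super().__init__()
        self.ct_stream = FeatureStream(channels)
        self.mri_stream = FeatureStream(channels)
        self.fuse = SoftAttentionFusion(channels)
        self.reconstruct = nn.Conv2d(channels, 1, 3, padding=1)  # back to an image

    def forward(self, ct, mri):
        return self.reconstruct(self.fuse(self.ct_stream(ct), self.mri_stream(mri)))


# Example: fuse a 256x256 CT slice with the corresponding MRI slice.
if __name__ == "__main__":
    net = TwoStreamFusionNet()
    fused = net(torch.rand(1, 1, 256, 256), torch.rand(1, 1, 256, 256))
    print(fused.shape)  # torch.Size([1, 1, 256, 256])
```

The softmax makes the two modalities compete for weight at every pixel, which is one simple way to realise an attention-style feature-level fusion rule.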


Availability of data and materials

Publicly available dataset, i.e., the Whole Brain Atlas Harvard medical dataset [21].

References

  1. Zhou T, Li L, Bredell G, Li J, Unkelbach J, Konukoglu E (2023) Volumetric memory network for interactive medical image segmentation. Med Image Anal 83:102599

  2. Cheng C, Xu T, Wu X-J (2023) Mufusion: A general unsupervised image fusion network based on memory unit. Inform Fusion 92:80–92

  3. Ding Z, Li H, Guo Y, Zhou D, Liu Y, Xie S (2023) M4fnet: Multimodal medical image fusion network via multi-receptive-field and multi-scale feature integration. Comput Biol Med 159:106923

  4. Zhang G, Nie R, Cao J, Chen L, Zhu Y (2023) Fdgnet: A pair feature difference guided network for multimodal medical image fusion. Biomedical Signal Processing and Control 81:104545

  5. Zhao Z, Bai H, Zhang J, Zhang Y, Xu S, Lin Z, Timofte R, Van Gool L (2023) Cddfuse: Correlation-driven dual-branch feature decomposition for multi-modality image fusion. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 5906–5916

  6. Liu J, Dian R, Li S, Liu H (2023) Sgfusion: A saliency guided deep-learning framework for pixel-level image fusion. Inform Fusion 91:205–214

  7. Goyal S, Singh V, Rani A, Yadav N (2022) Multimodal image fusion and denoising in nsct domain using cnn and fotgv. Biomedical Signal Processing and Control 71:103214

  8. Si Y et al (2021) Lppcnn: A laplacian pyramid-based pulse coupled neural network method for medical image fusion. J Appl Sci Eng 24(3):299–305

  9. Liu Y, Chen X, Cheng J, Peng H (2017) A medical image fusion method based on convolutional neural networks. In: 2017 20th International conference on information fusion (Fusion). IEEE, pp 1–7

  10. Wang C, Yang G, Papanastasiou G, Tsaftaris SA, Newby DE, Gray C, Macnaught G, MacGillivray TJ (2021) Dicyc: Gan-based deformation invariant cross-domain information fusion for medical image synthesis. Inform Fusion 67:147–160

  11. Reddy M, Reddy P, Reddy P (2021) Segmentation of fused mr and ct images using dl-cnn with pgk and nlem filtered aacgk-fcm. Biomedical Signal Processing and Control 68:102618

  12. Rani M, Yadav J, Rathee N, Goyal S (2022) Comparative study of various preprocessing technique for cnn based image fusion. In: 2022 IEEE Delhi Section Conference (DELCON). IEEE, pp 1–4

  13. Ma J, Xu H, Jiang J, Mei X, Zhang X-P (2020) Ddcgan: A dual-discriminator conditional generative adversarial network for multi-resolution image fusion. IEEE Transactions on image processing 29:4980–4995

  14. Lahoud F, Süsstrunk S (2019) Zero-learning fast medical image fusion. In: 2019 22nd International Conference on Information Fusion (FUSION). IEEE, pp 1–8

  15. Li W, Li R, Fu J, Peng X (2022) Msenet: A multi-scale enhanced network based on unique features guidance for medical image fusion. Biomed Signal Process Control 74:103534

  16. Liu Y, Wang L, Li H, Chen X (2022) Multi-focus image fusion with deep residual learning and focus property detection. Inform Fusion 86:1–16

  17. Zhang Y, Liu Y, Sun P, Yan H, Zhao X, Zhang L (2020) Ifcnn: A general image fusion framework based on convolutional neural network. Inform Fusion 54:99–118

  18. Jin Z-R, Deng L-J, Zhang T-J, Jin X-X (2021) Bam: Bilateral activation mechanism for image fusion. In: Proceedings of the 29th ACM International conference on multimedia, pp 4315–4323

  19. Li H, Zhang L, Jiang M, Li Y (2021) Multi-focus image fusion algorithm based on supervised learning for fully convolutional neural network. Pattern Recognition Letters 141:45–53

  20. Zhou T, Wang S, Zhou Y, Yao Y, Li J, Shao L (2020) Motion-attentive transition for zero-shot video object segmentation. Proceedings of the AAAI Conference on artificial intelligence 34:13066–13073

  21. Johnson KA et al (2001) The whole brain atlas

  22. Deng J, Dong W, Socher R, Li L-J, Li K, Fei-Fei L (2009) Imagenet: A large-scale hierarchical image database. In: 2009 IEEE Conference on computer vision and pattern recognition. IEEE, pp 248–255

  23. He K, Zhang X, Ren S, Sun J (2015) Deep residual learning for image recognition. CoRR abs/1512.03385. arXiv:1512.03385

  24. Liu Y, Chen X, Cheng J, Peng H, Wang Z (2018) Infrared and visible image fusion with convolutional neural networks. Int J Wavelets, Multiresolution Inform Process 16(03):1850018

  25. Bavirisetti DP, Dhuli R (2015) Fusion of infrared and visible sensor images based on anisotropic diffusion and karhunen-loeve transform. IEEE Sensors J 16(1):203–209

  26. Zhou Z, Wang B, Li S, Dong M (2016) Perceptual fusion of infrared and visible images through a hybrid multi-scale decomposition with gaussian and bilateral filters. Inform Fusion 30:15–26

  27. Zhang Y, Zhang L, Bai X, Zhang L (2017) Infrared and visual image fusion through infrared feature extraction and visual information preservation. Infrared Physics & Technology 83:227–237

  28. Li X, Zhou F, Tan H, Zhang W, Zhao C (2021) Multimodal medical image fusion based on joint bilateral filter and local gradient energy. Inform Sci 569:302–325

  29. Zhang X, Ye P, Xiao G (2020) Vifb: A visible and infrared image fusion benchmark. In: Proceedings of the IEEE/CVF Conference on computer vision and pattern recognition workshops

  30. Yang C, Zhang J-Q, Wang X-R, Liu X (2008) A novel similarity based quality metric for image fusion. Inform Fusion 9(2):156–160

  31. Qu G, Zhang D, Yan P (2002) Information measure for performance of image fusion. Electron Lett 387:1

  32. Naidu V (2010) Discrete cosine transform-based image fusion. Def Sci J 60(1):48

  33. Li S, Yang B (2008) Multifocus image fusion using region segmentation and spatial frequency. Image and Vis Comput 26(7):971–979

  34. Zhao W, Wang D, Lu H (2018) Multi-focus image fusion with a natural enhancement via a joint multi-level deeply supervised convolutional neural network. IEEE Trans Circ Syst Vid Technol 29(4):1102–1115

  35. Rajalingam B, Priya R (2018) Hybrid multimodality medical image fusion technique for feature enhancement in medical diagnosis. Int J Eng Sci Invent 2(Special issue):5260

  36. Roberts JW, Van Aardt JA, Ahmed FB (2008) Assessment of image fusion procedures using entropy, image quality, and multispectral classification. J Appl Remote Sens 2(1):023522

  37. Shannon C (2001) A mathematical theory of communication. Mob Comput Commun Rev 5:3–55

  38. Wang Z, Bovik AC, Sheikh HR, Simoncelli EP (2004) Image quality assessment: from error visibility to structural similarity. IEEE Trans Image Process 13(4):600–612

  39. Eskicioglu AM, Fisher PS (1995) Image quality measures and their performance. IEEE Trans Commun 43(12):2959–2965

  40. Rajalingam B, Al-Turjman F, Santhoshkumar R, Rajesh M (2020) Intelligent multimodal medical image fusion with deep guided filtering. Multimed Syst 1–15

  41. Shannon C (2001) A mathematical theory of communication. Mobile Comput Commun Rev 5:3–55

Author information

Corresponding author

Correspondence to Mamta Rani.

Ethics declarations

Conflicts of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Jyoti Yadav, Neeru Rathee and Sonal Goyal contributed equally to this work.

Appendices

Appendix

A Performance measure

The evaluation metrics used as performance measures are Mutual Information (MI) [31], Spatial Frequency (SF) [33], Average Gradient (AG) [34], Edge-intensity (EI) [35], and Entropy (EN) [36, 37], together with ISSIM [30] and PSNR [32] described below. In information theory, mutual information (MI) represents the statistical dependency of one variable on another. If X and Y are the two source images and F is the fused image, MI can be calculated according to the formula given by (3) [31].

$$\begin{aligned} M^{XY}_{F}={I_{FX}(f,x)}+{I_{FY}(f,y)}. \end{aligned}$$
(3)

where \(I_{FX}(f,x)\) and \(I_{FY}(f,y)\) represent the amount of information F contains about X and Y, respectively, and can be calculated as given by (4) and (5).

$$\begin{aligned} I_{FX}(f,x)=\sum _{f,x}P_{FX}(f,x)log{\frac{P_{FX}(f,x)}{P_{F}(f)P_{X}(x)}} \end{aligned}$$
(4)
$$\begin{aligned} I_{FY}(f,y)=\sum _{f,y}P_{FY}(f,y)log{\frac{P_{FY}(f,y)}{P_{F}(f)P_{Y}(y)}} \end{aligned}$$
(5)

Here \(P_{F}(f)\), \(P_{X}(x)\) and \(P_{Y}(y)\) represent the marginal probability distributions, and \(P_{FX}(f,x)\) and \(P_{FY}(f,y)\) represent the joint probability distributions of the fused and source images.
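
As an illustration of (3)-(5), the following is a minimal NumPy sketch that estimates mutual information from joint histograms of 8-bit grayscale images; the function names, bin count, and use of base-2 logarithms are assumptions made for the example rather than details given in the paper.

```python
import numpy as np


def mutual_information(fused: np.ndarray, src: np.ndarray, bins: int = 256) -> float:
    """I_FX(f, x): amount of information the fused image F contains about source X."""
    joint, _, _ = np.histogram2d(fused.ravel(), src.ravel(), bins=bins)
    p_fx = joint / joint.sum()                 # joint distribution P_FX(f, x)
    p_f = p_fx.sum(axis=1, keepdims=True)      # marginal P_F(f)
    p_x = p_fx.sum(axis=0, keepdims=True)      # marginal P_X(x)
    nz = p_fx > 0                              # skip empty bins to avoid log(0)
    return float(np.sum(p_fx[nz] * np.log2(p_fx[nz] / (p_f @ p_x)[nz])))


def fusion_mi(fused: np.ndarray, src_x: np.ndarray, src_y: np.ndarray) -> float:
    """M^{XY}_F = I_FX(f, x) + I_FY(f, y), as in (3)."""
    return mutual_information(fused, src_x) + mutual_information(fused, src_y)
```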

Wang et al. [38] first proposed an evaluation metric capturing the structural similarity (SSIM) between images; later, Yang et al. [30] proposed an improved version of this metric, ISSIM, given by (6), which successfully distinguishes conflicting and complementary information from redundant regions.

$$\begin{aligned} ISSIM(x,y,f\mid w)={\left\{ \begin{array}{ll}\lambda (w)\,SSIM(x,f\mid w)+(1-\lambda (w))\,SSIM(y,f\mid w), & \text {for } SSIM(x,y\mid w)\ge 0.75\\ \max \left( SSIM(x,f\mid w),\,SSIM(y,f\mid w)\right) , & \text {for } SSIM(x,y\mid w)< 0.75\end{array}\right. } \end{aligned}$$
(6)

where \(\lambda (w)\) is a local weight computed from \(s(x\mid w)\) and \(s(y\mid w)\), the variances of the local windows \(w_{x}\) and \(w_{y}\), respectively.
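
To make the case analysis in (6) concrete, the sketch below approximates ISSIM using per-window SSIM maps from scikit-image and a local weight \(\lambda (w)\) derived from local variances; the window handling and the final averaging are simplifications assumed for illustration, not the exact implementation of [30].

```python
import numpy as np
from scipy.ndimage import uniform_filter
from skimage.metrics import structural_similarity


def issim(x: np.ndarray, y: np.ndarray, f: np.ndarray, win: int = 7) -> float:
    x, y, f = (im.astype(np.float64) for im in (x, y, f))
    # Per-window SSIM maps: each source vs. the fused image, and source vs. source.
    _, s_xf = structural_similarity(x, f, win_size=win, data_range=255, full=True)
    _, s_yf = structural_similarity(y, f, win_size=win, data_range=255, full=True)
    _, s_xy = structural_similarity(x, y, win_size=win, data_range=255, full=True)
    # Local weight lambda(w) from the local variances s(x|w) and s(y|w).
    var_x = uniform_filter(x ** 2, win) - uniform_filter(x, win) ** 2
    var_y = uniform_filter(y ** 2, win) - uniform_filter(y, win) ** 2
    lam = var_x / np.maximum(var_x + var_y, 1e-12)
    # Equation (6): weighted SSIM where the sources agree (SSIM >= 0.75),
    # otherwise keep whichever source the fused image matches better.
    weighted = lam * s_xf + (1.0 - lam) * s_yf
    per_window = np.where(s_xy >= 0.75, weighted, np.maximum(s_xf, s_yf))
    return float(per_window.mean())
```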

Spatial frequency (SF) [39], given by (9), represents the overall activity level of the image. The metric selects the region with the maximum spatial frequency by segmenting the image with normalized cuts and combining all such regions to reconstruct the final image [33]. Equations (7) and (8) give the row frequency (RF) and column frequency (CF) of the image; an illustrative computation follows (9).

$$\begin{aligned} RF=\sqrt{\frac{1}{XY}\times \sum _{x=0}^{X-1} \sum _{y=1}^{Y-1}[F(x,y)-F(x,y-1)]^2} \end{aligned}$$
(7)
$$\begin{aligned} CF=\sqrt{\frac{1}{XY}\times \sum _{y=0}^{Y-1}\sum _{x=1}^{X-1}[F(x,y)-F(x-1,y)]^2} \end{aligned}$$
(8)
$$\begin{aligned} SF=\sqrt{(RF)^{2}+(CF)^{2}} \end{aligned}$$
(9)
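
A minimal NumPy sketch of (7)-(9) for a 2-D grayscale array F is given below as an illustration (it computes the global SF only, not the normalized-cut region selection of [33]).

```python
import numpy as np


def spatial_frequency(F: np.ndarray) -> float:
    F = F.astype(np.float64)
    X, Y = F.shape
    rf = np.sqrt(np.sum((F[:, 1:] - F[:, :-1]) ** 2) / (X * Y))  # row frequency, (7)
    cf = np.sqrt(np.sum((F[1:, :] - F[:-1, :]) ** 2) / (X * Y))  # column frequency, (8)
    return float(np.sqrt(rf ** 2 + cf ** 2))                     # SF, (9)
```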

Average Gradient (AG) [34] is given by (10); a higher value of AG indicates more accurate edge details.

$$\begin{aligned} AG=\frac{1}{(M-1)(N-1)}\sum _{m=1}^{M-1}\sum _{n=1}^{N-1}\frac{1}{4}\sqrt{\left[ \frac{\partial F(m,n)}{\partial m}\right] ^{2}+\left[ \frac{\partial F(m,n)}{\partial n}\right] ^{2}} \end{aligned}$$
(10)

where (m, n) denotes an image coordinate, and the partial derivatives in (10) are the horizontal and vertical gradients, respectively. The Peak Signal-to-Noise Ratio (PSNR) is given by (11) [32].

$$\begin{aligned} PSNR=20log_{10}{\frac{L^2}{\frac{1}{MN} \sum _{i=1}^{M} \sum _{j=1}^{N} (I_{r}(i,j)-I_{f}(i,j))^2 }} \end{aligned}$$
(11)

A higher value of PSNR indicates that the fused image is closer to the reference image.
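
A minimal sketch of (11) follows, taking \(I_{r}\) as a reference image and assuming 8-bit data so that L = 255; the zero-MSE guard is added only for robustness.

```python
import numpy as np


def psnr(ref: np.ndarray, fused: np.ndarray, L: float = 255.0) -> float:
    mse = np.mean((ref.astype(np.float64) - fused.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")                      # identical images
    return float(20.0 * np.log10(L ** 2 / mse))  # follows (11) as written, per [32]
```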

A higher value of edge intensity, given by (12) [40], indicates a clearer fused image. The edge intensity can be calculated using the Sobel operator, as sketched after (16).

$$\begin{aligned} EI=\sqrt{(S_{x})^2+(S_{y})^2} \end{aligned}$$
(12)

where

$$\begin{aligned} S_{x}=f*h_{x} \end{aligned}$$
(13)
$$\begin{aligned} S_{y}=f*h_{y} \end{aligned}$$
(14)
$$\begin{aligned} h_{x}= \left( \begin{array}{ccc} -1 & 0 & 1\\ -2 & 0 & 2\\ -1 & 0 & 1 \end{array}\right) \end{aligned}$$
(15)
$$\begin{aligned} h_{y}= \left( \begin{array}{ccc} -1 & -2 & -1\\ 0 & 0 & 0\\ 1 & 2 & 1 \end{array}\right) \end{aligned}$$
(16)
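
As an illustration, the sketch below evaluates the edge intensity of (12)-(16) with the Sobel kernels above and the average gradient of (10) using finite differences; SciPy is assumed available, and reducing the per-pixel edge map to a single score by averaging is a common convention rather than something stated in the text.

```python
import numpy as np
from scipy.signal import convolve2d

H_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float64)   # h_x, (15)
H_Y = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]], dtype=np.float64)   # h_y, (16)


def edge_intensity(f: np.ndarray) -> float:
    f = f.astype(np.float64)
    s_x = convolve2d(f, H_X, mode="same", boundary="symm")  # S_x = f * h_x, (13)
    s_y = convolve2d(f, H_Y, mode="same", boundary="symm")  # S_y = f * h_y, (14)
    return float(np.mean(np.sqrt(s_x ** 2 + s_y ** 2)))     # mean of the EI map, (12)


def average_gradient(f: np.ndarray) -> float:
    f = f.astype(np.float64)
    gx = np.diff(f, axis=0)[:, :-1]     # finite-difference vertical gradient
    gy = np.diff(f, axis=1)[:-1, :]     # finite-difference horizontal gradient
    return float(np.mean(0.25 * np.sqrt(gx ** 2 + gy ** 2)))  # AG, (10)
```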

Another performance measure, Entropy (EN), has been widely used in remote sensing applications to quantify the information content of an image [36]. Shannon [41] was the first to propose the use of entropy for estimating information content. The original Shannon equation is given by (17).

$$\begin{aligned} H=-\sum _{i=1}^{n} P(K_{i})\log _{2}P(K_{i}) \end{aligned}$$
(17)

where n is the number of messages in the set and \(P(K_{i})\) is the probability of occurrence of message \(K_{i}\). The same theory applied to messages can be applied to images if distinct messages are treated as pixel values and the message set as the 2D image. It is assumed that the fused image will have more gray levels than the individual source images, since more information implies more gray levels. Often, however, it is not information but noise that contributes to the increased number of gray levels [36].
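
For completeness, a minimal sketch of (17) for an 8-bit grayscale image, treating each gray level as a message \(K_{i}\):

```python
import numpy as np


def entropy(img: np.ndarray, bins: int = 256) -> float:
    hist, _ = np.histogram(img.ravel(), bins=bins, range=(0, bins))
    p = hist / hist.sum()          # P(K_i): probability of gray level i
    p = p[p > 0]                   # drop empty bins to avoid log(0)
    return float(-np.sum(p * np.log2(p)))  # Shannon entropy, (17)
```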

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Rani, M., Yadav, J., Rathee, N. et al. Efficient fused convolution neural network (EFCNN) for feature level fusion of medical images. Multimed Tools Appl 83, 40179–40214 (2024). https://doi.org/10.1007/s11042-023-16872-y
