Abstract
This paper proposes an Efficient Fused Convolution Neural Network (EFCNN) for feature-level fusion of medical images. The proposed architecture leverages the strengths of both deep Convolution Neural Networks (CNNs) and fusion techniques to achieve improved efficiency in medical image fusion. Fusion of CT and MRI images can help medical professionals make more informed diagnoses, plan more effective treatments, and ultimately improve patient outcomes. Many researchers are currently working to develop efficient medical image fusion techniques. To contribute to this field, the authors fuse images at the feature level using the Bilateral Activation Mechanism (BAM) for feature extraction and a softmax-based Soft Attention (SA) fusion rule. The EFCNN model uses a two-stream CNN architecture to process the input images, whose features are then fused using an attention mechanism. The proposed approach is evaluated on the Whole Brain Atlas Harvard dataset. With SA fusion, the EFCNN model demonstrated superior performance on several indices, achieving ISSIM, MI, and PSNR values of 0.41, 4.42, and 57.21, respectively. Without SA fusion, the model exhibited favourable Spatial Frequency, Average Gradient, and Edge-Intensity values of 57.3, 16.83, and 157.72, respectively, on the medical dataset; however, subjective evaluation indicated that images were improved with SA fusion. These results indicate that the EFCNN model surpasses state-of-the-art methods. An exhaustive ablation study further confirmed the efficacy of the proposed model. The significance of this work lies in its potential implications for medical diagnosis and treatment planning, where precise and efficient image analysis is crucial.
Availability of data and materials
Publicly available dataset, i.e., the Whole Brain Atlas Harvard medical dataset [21].
References
Zhou T, Li L, Bredell G, Li J, Unkelbach J, Konukoglu E (2023) Volumetric memory network for interactive medical image segmentation. Med Image Anal 83:102599
Cheng C, Xu T, Wu X-J (2023) Mufusion: A general unsupervised image fusion network based on memory unit. Inform Fusion 92:80–92
Ding Z, Li H, Guo Y, Zhou D, Liu Y, Xie S (2023) M4fnet: Multimodal medical image fusion network via multi-receptive-field and multi-scale feature integration. Comput Biol Med 159:106923
Zhang G, Nie R, Cao J, Chen L, Zhu Y (2023) Fdgnet: A pair feature difference guided network for multimodal medical image fusion. Biomedical Signal Processing and Control 81:104545
Zhao Z, Bai H, Zhang J, Zhang Y, Xu S, Lin Z, Timofte R, Van Gool L (2023) Cddfuse: Correlation-driven dual-branch feature decomposition for multi-modality image fusion. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 5906–5916
Liu J, Dian R, Li S, Liu H (2023) Sgfusion: A saliency guided deep-learning framework for pixel-level image fusion. Inform Fusion 91:205–214
Goyal S, Singh V, Rani A, Yadav N (2022) Multimodal image fusion and denoising in nsct domain using cnn and fotgv. Biomedical Signal Processing and Control 71:103214
Si Y et al (2021) Lppcnn: A laplacian pyramid-based pulse coupled neural network method for medical image fusion. J Appl Sci Eng 24(3):299–305
Liu Y, Chen X, Cheng J, Peng H (2017) A medical image fusion method based on convolutional neural networks. In: 2017 20th International conference on information fusion (Fusion). IEEE, pp 1–7
Wang C, Yang G, Papanastasiou G, Tsaftaris SA, Newby DE, Gray C, Macnaught G, MacGillivray TJ (2021) Dicyc: Gan-based deformation invariant cross-domain information fusion for medical image synthesis. Inform Fusion 67:147–160
Reddy M, Reddy P, Reddy P (2021) Segmentation of fused mr and ct images using dl-cnn with pgk and nlem filtered aacgk-fcm. Biomedical Signal Processing and Control 68:102618
Rani M, Yadav J, Rathee N, Goyal S (2022) Comparative study of various preprocessing technique for cnn based image fusion. In: 2022 IEEE Delhi Section Conference (DELCON). IEEE, pp 1–4
Ma J, Xu H, Jiang J, Mei X, Zhang X-P (2020) Ddcgan: A dual-discriminator conditional generative adversarial network for multi-resolution image fusion. IEEE Trans Image Process 29:4980–4995
Lahoud F, Süsstrunk S (2019) Zero-learning fast medical image fusion. In: 2019 22nd International Conference on Information Fusion (FUSION). IEEE, pp 1–8
Li W, Li R, Fu J, Peng X (2022) Msenet: A multi-scale enhanced network based on unique features guidance for medical image fusion. Biomed Signal Process Control 74:103534
Liu Y, Wang L, Li H, Chen X (2022) Multi-focus image fusion with deep residual learning and focus property detection. Inform Fusion 86:1–16
Zhang Y, Liu Y, Sun P, Yan H, Zhao X, Zhang L (2020) Ifcnn: A general image fusion framework based on convolutional neural network. Inform Fusion 54:99–118
Jin Z-R, Deng L-J, Zhang T-J, Jin X-X (2021) Bam: Bilateral activation mechanism for image fusion. In: Proceedings of the 29th ACM International conference on multimedia, pp 4315–4323
Li H, Zhang L, Jiang M, Li Y (2021) Multi-focus image fusion algorithm based on supervised learning for fully convolutional neural network. Pattern Recognition Letters 141:45–53
Zhou T, Wang S, Zhou Y, Yao Y, Li J, Shao L (2020) Motion-attentive transition for zero-shot video object segmentation. Proceedings of the AAAI Conference on artificial intelligence 34:13066–13073
Johnson KA et al (2001) The whole brain atlas
Deng J, Dong W, Socher R, Li L-J, Li K, Fei-Fei L (2009) Imagenet: A large-scale hierarchical image database. In: 2009 IEEE Conference on computer vision and pattern recognition. IEEE, pp 248–255
He K, Zhang X, Ren S, Sun J (2015) Deep residual learning for image recognition. CoRR abs/1512.03385. arXiv:1512.03385
Liu Y, Chen X, Cheng J, Peng H, Wang Z (2018) Infrared and visible image fusion with convolutional neural networks. Int J Wavelets, Multiresolution Inform Process 16(03):1850018
Bavirisetti DP, Dhuli R (2015) Fusion of infrared and visible sensor images based on anisotropic diffusion and karhunen-loeve transform. IEEE Sensors J 16(1):203–209
Zhou Z, Wang B, Li S, Dong M (2016) Perceptual fusion of infrared and visible images through a hybrid multi-scale decomposition with gaussian and bilateral filters. Inform Fusion 30:15–26
Zhang Y, Zhang L, Bai X, Zhang L (2017) Infrared and visual image fusion through infrared feature extraction and visual information preservation. Infrared Physics & Technology 83:227–237
Li X, Zhou F, Tan H, Zhang W, Zhao C (2021) Multimodal medical image fusion based on joint bilateral filter and local gradient energy. Inform Sci 569:302–325
Zhang X, Ye P, Xiao G (2020) Vifb: A visible and infrared image fusion benchmark. In: Proceedings of the IEEE/CVF Conference on computer vision and pattern recognition workshops
Yang C, Zhang J-Q, Wang X-R, Liu X (2008) A novel similarity based quality metric for image fusion. Inform Fusion 9(2):156–160
Qu G, Zhang D, Yan P (2002) Information measure for performance of image fusion. Electron Lett 38(7):313–315
Naidu V (2010) Discrete cosine transform-based image fusion. Def Sci J 60(1):48
Li S, Yang B (2008) Multifocus image fusion using region segmentation and spatial frequency. Image and Vis Comput 26(7):971–979
Zhao W, Wang D, Lu H (2018) Multi-focus image fusion with a natural enhancement via a joint multi-level deeply supervised convolutional neural network. IEEE Trans Circ Syst Vid Technol 29(4):1102–1115
Rajalingam B, Priya R (2018) Hybrid multimodality medical image fusion technique for feature enhancement in medical diagnosis. Int J Eng Sci Invent 2(Special issue):52–60
Roberts JW, Van Aardt JA, Ahmed FB (2008) Assessment of image fusion procedures using entropy, image quality, and multispectral classification. J Appl Remote Sens 2(1):023522
Shannon C (2001) A mathematical theory of communication. Mob Comput Commun Rev 5:3–55
Wang Z, Bovik AC, Sheikh HR, Simoncelli EP (2004) Image quality assessment: from error visibility to structural similarity. IEEE Trans Image Process 13(4):600–612
Eskicioglu AM, Fisher PS (1995) Image quality measures and their performance. IEEE Trans Commun 43(12):2959–2965
Rajalingam B, Al-Turjman F, Santhoshkumar R, Rajesh M (2020) Intelligent multimodal medical image fusion with deep guided filtering. Multimed Syst 1–15
Ethics declarations
Conflicts of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Jyoti Yadav, Neeru Rathee and Sonal Goyal contributed equally to this work.
Appendix
A Performance measure
The five evaluation metrics used as performance measures are Mutual Information (MI) [31], Spatial Frequency [33], Average Gradient [34], Edge-Intensity [35], and Entropy [36, 37]. In information theory, mutual information (MI) represents the statistical dependency of one variable on another. If X and Y are the two source images and F the fused image, MI can be calculated according to the formula given by (3) [31].
Where \(I_{FX}(f,x)\) and \(I_{FY}(f,y)\) represent the amount of information F contains about X and Y, respectively, which can be calculated as given by (4) and (5).
Here \(P_{F}(f)\), \(P_{Y}(y)\), and \(P_{X}(x)\) represent the marginal probability distributions, and \(P_{FX}(f,x)\) and \(P_{FY}(f,y)\) represent the joint probability distributions of the fused and source images. First, [38] proposed an evaluation metric showing the structural similarity of the source images; later, [30] proposed an improved version of this metric, ISSIM, given by (6), which successfully distinguishes conflicting and complementary information from redundant regions.
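For reference, the standard forms of (3)–(5), consistent with the definitions above [31], are:

```latex
MI = I_{FX}(f,x) + I_{FY}(f,y) \tag{3}
```
```latex
I_{FX}(f,x) = \sum_{f,x} P_{FX}(f,x)\,\log_{2}\frac{P_{FX}(f,x)}{P_{F}(f)\,P_{X}(x)} \tag{4}
```
```latex
I_{FY}(f,y) = \sum_{f,y} P_{FY}(f,y)\,\log_{2}\frac{P_{FY}(f,y)}{P_{F}(f)\,P_{Y}(y)} \tag{5}
```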
Where \(\lambda (w)\) is the local weight, and \(s(x\mid w)\) and \(s(y\mid w)\) are the variances of \(w_{x}\) and \(w_{y}\), respectively.
Spatial Frequency (SF) [39], given by equation 6, represents the overall activity level of the image. The metric selects the region with the maximum spatial frequency by segmenting the image with normalized cuts and combining all such regions to reconstruct the final image [33]. Equations 7 and 8 show the row frequency (RF) and column frequency (CF) of the image.
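For an \(M \times N\) fused image F, the standard forms of these quantities [39] are (the combined SF expression is the usual definition):

```latex
RF = \sqrt{\frac{1}{MN}\sum_{m=1}^{M}\sum_{n=2}^{N}\bigl[F(m,n)-F(m,n-1)\bigr]^{2}} \tag{7}
```
```latex
CF = \sqrt{\frac{1}{MN}\sum_{m=2}^{M}\sum_{n=1}^{N}\bigl[F(m,n)-F(m-1,n)\bigr]^{2}} \tag{8}
```
```latex
SF = \sqrt{RF^{2} + CF^{2}}
```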
Average Gradient (AG) [34] is given by (10); a higher value of AG represents more accurate edge details.
Where (m, n) denotes an image coordinate, and the gradients are the horizontal and vertical gradients at that point, respectively. Peak Signal-to-Noise Ratio (PSNR) is given by (11) [32]; a high value of PSNR indicates good results.
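The standard forms of (10) and (11) are shown below, where \(\Delta_{x}F\) and \(\Delta_{y}F\) denote the horizontal and vertical gradients, L is the maximum pixel value (255 for 8-bit images), and MSE is the mean squared error between the fused and reference images (the precise reference image used by the paper is assumed):

```latex
AG = \frac{1}{(M-1)(N-1)}\sum_{m=1}^{M-1}\sum_{n=1}^{N-1}\sqrt{\frac{\Delta_{x}F(m,n)^{2}+\Delta_{y}F(m,n)^{2}}{2}} \tag{10}
```
```latex
PSNR = 10\,\log_{10}\!\left(\frac{L^{2}}{MSE}\right) \tag{11}
```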
A higher value of Edge-Intensity, given by (12) [40], indicates a clearer fused image. The edge intensity is calculated using the Sobel operator.
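As an illustration, the Sobel-based edge-intensity measure can be sketched in NumPy as below; this is a minimal sketch taking the mean gradient magnitude, and the paper's exact normalisation may differ.

```python
import numpy as np

def edge_intensity(img):
    """Mean gradient magnitude of the image under Sobel filtering
    (a minimal sketch of the edge-intensity measure described above)."""
    img = img.astype(np.float64)
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float64)
    ky = kx.T  # vertical-gradient Sobel kernel

    def corr2(a, k):
        # 'valid' 2-D correlation with a 3x3 kernel
        h, w = a.shape
        out = np.zeros((h - 2, w - 2))
        for i in range(3):
            for j in range(3):
                out += k[i, j] * a[i:h - 2 + i, j:w - 2 + j]
        return out

    gx, gy = corr2(img, kx), corr2(img, ky)
    return float(np.mean(np.sqrt(gx ** 2 + gy ** 2)))
```

A flat image yields zero edge intensity, while a sharp intensity step produces a large value, matching the intuition that a clearer fused image scores higher.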
Another performance measure, Entropy (EN), has been widely used in remote sensing applications to quantify the information content of an image [36]. Shannon [41] was the first to propose the use of entropy for estimating information content. The original Shannon equation is given by (17).
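The standard form of Shannon's entropy (17) is:

```latex
EN = -\sum_{i} P(K_{i})\,\log_{2} P(K_{i}) \tag{17}
```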
where i indexes the messages in a set and \(P(K_{i})\) is the probability of occurrence of message \(K_{i}\). The same theory that applies to messages can be applied to images if distinct messages are treated as pixels and the message set as the 2D image. It is assumed that the fused image will have more gray levels than the individual source images, since a greater amount of information implies more gray levels. Often, however, it is not information but noise that contributes to the increased number of gray levels [36].
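Applied to an image, this becomes the entropy of the gray-level histogram. A minimal NumPy sketch, assuming an 8-bit image with 256 gray levels:

```python
import numpy as np

def image_entropy(img, levels=256):
    """Shannon entropy (in bits) of an image's gray-level histogram,
    following the standard definition referenced above."""
    hist, _ = np.histogram(img, bins=levels, range=(0, levels))
    p = hist / hist.sum()
    p = p[p > 0]  # drop empty bins: 0*log2(0) is taken as 0
    return float(-np.sum(p * np.log2(p)))
```

A constant image has zero entropy, and an image whose pixels are split evenly between two gray levels has exactly 1 bit, illustrating how more gray levels (whether information or noise) raise the score.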
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Rani, M., Yadav, J., Rathee, N. et al. Efficient fused convolution neural network (EFCNN) for feature level fusion of medical images. Multimed Tools Appl 83, 40179–40214 (2024). https://doi.org/10.1007/s11042-023-16872-y