Abstract
Early diagnosis of Alzheimer’s disease (AD) at its prodromal stage (e.g., mild cognitive impairment, MCI), followed by timely intervention and treatment, can effectively delay its further progression. Structural magnetic resonance imaging (sMRI) and positron emission tomography (PET) play an essential role in diagnosing AD and MCI, as they reveal morphological signs of brain atrophy. However, single-modality brain imaging data provide insufficient information for a thorough diagnosis of AD and MCI and make it difficult to localize lesion areas accurately. Convolutional neural networks (CNNs) have shown promising performance in diagnosing AD and MCI, but the sliding-window nature of convolution limits their ability to model global information. In contrast, the Transformer lacks the local inductive biases of convolution but uses a self-attention mechanism to model long-range dependencies. We therefore propose a novel CNN-Transformer framework based on cross-modal feature fusion for AD and MCI diagnosis. Specifically, we first exploit a large kernel attention (LKA) module to learn an attention map. Two branches, a CNN and a Transformer, then further extract higher-level local and global features, respectively. Finally, a modality feature fusion block fuses the features of the two modalities. Extensive experiments on the ADNI dataset show that our model outperforms state-of-the-art methods.
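As a rough illustration of the LKA idea, the sketch below implements the standard large-kernel-attention decomposition (a depthwise convolution for local context, a dilated depthwise convolution for long-range context, and a 1x1 convolution for channel mixing, whose output gates the input). This is a minimal NumPy sketch of that generic mechanism, not the paper's implementation; the kernel sizes, dilation, and function names are assumptions.

```python
import numpy as np

def depthwise_conv2d(x, k, dilation=1):
    """Per-channel 'same' convolution. x: (C, H, W); k: (C, kh, kw), odd kernel sizes."""
    C, H, W = x.shape
    _, kh, kw = k.shape
    ph = dilation * (kh - 1) // 2
    pw = dilation * (kw - 1) // 2
    xp = np.pad(x, ((0, 0), (ph, ph), (pw, pw)))
    out = np.zeros_like(x)
    for i in range(kh):
        for j in range(kw):
            # Shift the padded input by the (dilated) tap offset and accumulate.
            out += k[:, i:i + 1, j:j + 1] * xp[:, i * dilation:i * dilation + H,
                                               j * dilation:j * dilation + W]
    return out

def large_kernel_attention(x, k_dw, k_dil, w_pw, dilation=3):
    """Hypothetical LKA block: decompose a large kernel into
    depthwise conv -> dilated depthwise conv -> 1x1 conv, then gate the input."""
    attn = depthwise_conv2d(x, k_dw)                  # local context
    attn = depthwise_conv2d(attn, k_dil, dilation)    # long-range context
    attn = np.einsum('oc,chw->ohw', w_pw, attn)       # channel mixing (1x1 conv)
    return attn * x                                   # attention map gates the input
```

With delta (identity) kernels and an identity 1x1 weight, the learned attention map reduces to the input itself, so the block outputs the elementwise square of the input, which is a quick sanity check of the gating structure.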
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Qiu, Z., Yang, P., Wang, T., Lei, B. (2022). Hybrid Network Based on Cross-Modal Feature Fusion for Diagnosis of Alzheimer’s Disease. In: Baxter, J.S.H., et al. (eds.) Ethical and Philosophical Issues in Medical Imaging, Multimodal Learning and Fusion Across Scales for Clinical Decision Support, and Topological Data Analysis for Biomedical Imaging. EPIMI ML-CDS TDA4BiomedicalImaging 2022. Lecture Notes in Computer Science, vol. 13755. Springer, Cham. https://doi.org/10.1007/978-3-031-23223-7_8
Print ISBN: 978-3-031-23222-0
Online ISBN: 978-3-031-23223-7