Abstract
Doctors commonly consider multi-modal information simultaneously when making a diagnosis. However, how to use multi-modal medical images effectively has not been fully studied in deep learning within such a context. In this paper, we address end-to-end segmentation from multi-modal data and propose a novel deep learning framework, the multiple subspace attention-based deep multi-modal fusion network (referred to as MSAFusionNet hereafter). More specifically, MSAFusionNet consists of three main components: (1) a multiple subspace attention model that contains inter-attention modules and generalized squeeze-and-excitation modules, (2) a multi-modal fusion network that leverages CNN-LSTM layers to integrate sequential multi-modal input images, and (3) a densely-dilated U-Net as the encoder-decoder backbone for image segmentation. Experiments on the ISLES 2018 dataset show that MSAFusionNet achieves state-of-the-art segmentation accuracy.
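The abstract does not detail how the generalized squeeze-and-excitation modules differ from the original design, but the standard squeeze-and-excitation block they build on (Hu et al., CVPR 2018) is simple to sketch: globally average-pool each channel, pass the resulting descriptor through a small bottleneck MLP, and rescale each channel by the resulting sigmoid gate. Below is a minimal pure-Python illustration under those assumptions; the function name `se_gate` and the list-based tensor layout are ours for illustration, not the paper's implementation.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def se_gate(feature_maps, w1, w2):
    """Channel gating in the style of squeeze-and-excitation.

    feature_maps: list of C channels, each a 2-D list (H x W).
    w1: C x r weights of the reduction layer (followed by ReLU).
    w2: r x C weights of the expansion layer (followed by sigmoid).
    Returns the channels rescaled by their learned gates.
    """
    # Squeeze: global average pooling per channel -> C-dim descriptor z.
    z = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
         for ch in feature_maps]
    C, r = len(w1), len(w1[0])
    # Excitation: bottleneck MLP, ReLU then sigmoid, yielding one gate per channel.
    h = [max(0.0, sum(z[c] * w1[c][j] for c in range(C))) for j in range(r)]
    s = [sigmoid(sum(h[j] * w2[j][c] for j in range(r))) for c in range(C)]
    # Rescale: multiply every element of channel c by its gate s[c].
    return [[[v * s[c] for v in row] for row in feature_maps[c]]
            for c in range(C)]
```

With zero expansion weights every gate is sigmoid(0) = 0.5, so the block halves each channel; with trained weights the gates learn to emphasize informative channels and suppress the rest, which is the role the attention modules play here across modalities.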
S. Zhang and C. Zhang—Equal contribution. This work is done when Sen Zhang is an intern at Huawei.
Notes
1. Normalization is applied to improve the contrast.
2. https://www.smir.ch/ISLES/Start2018#resultsTesting [Accessed March 25, 2019].
3. No details of these methods are available.
References
ISLES 2018 Challenge. http://www.isles-challenge.org/
Dolz, J., et al.: HyperDense-Net: a hyper-densely connected CNN for multi-modal image segmentation. IEEE TMI 38(5), 1116–1126 (2019)
Zhang, W., et al.: Deep convolutional neural networks for multi-modality isointense infant brain image segmentation. NeuroImage 108, 214–224 (2015)
Nie, D., et al.: Fully convolutional networks for multi-modality isointense infant brain image segmentation. In: ISBI (2016)
Guo, Z., et al.: Medical image segmentation based on multi-modal convolutional neural network: study on image fusion schemes. In: ISBI (2018)
Tseng, K.L., et al.: Joint sequence learning and cross-modality convolution for 3D biomedical segmentation. In: CVPR (2017)
Vaswani, A., et al.: Attention is all you need. In: NIPS (2017)
Huang, G., et al.: Densely connected convolutional networks. In: CVPR (2017)
Ronneberger, O., Fischer, P., Brox, T.: U-net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015. LNCS, vol. 9351, pp. 234–241. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24574-4_28
Yu, F., et al.: Multi-scale context aggregation by dilated convolutions. In: ICLR (2016)
Yang, M., et al.: DenseASPP for semantic segmentation in street scenes. In: CVPR (2018)
Srivastava, N., et al.: Unsupervised learning of video representations using LSTMs. In: ICML (2015)
Hu, J., et al.: Squeeze-and-excitation networks. In: CVPR (2018)
Yang, X., et al.: Deep multimodal representation learning from temporal data. In: CVPR (2017)
Copyright information
© 2019 Springer Nature Switzerland AG
Cite this paper
Zhang, S. et al. (2019). MSAFusionNet: Multiple Subspace Attention Based Deep Multi-modal Fusion Network. In: Suk, HI., Liu, M., Yan, P., Lian, C. (eds) Machine Learning in Medical Imaging. MLMI 2019. Lecture Notes in Computer Science(), vol 11861. Springer, Cham. https://doi.org/10.1007/978-3-030-32692-0_7
DOI: https://doi.org/10.1007/978-3-030-32692-0_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-32691-3
Online ISBN: 978-3-030-32692-0