Abstract
Diabetic Retinopathy (DR), a prevalent and severe complication of diabetes, affects millions of individuals globally, underscoring the need for accurate and timely diagnosis. Recent imaging technologies, such as Ultra-WideField Color Fundus Photography (UWF-CFP) and Optical Coherence Tomography Angiography (OCTA), provide opportunities for the early detection of DR, but they also pose significant challenges because the data they produce are so disparate. This study introduces a novel multimodal approach that combines these imaging modalities to improve DR classification. The approach integrates 2D UWF-CFP images and 3D high-resolution \(6 \times 6\) mm\(^3\) OCTA images (both structure and flow) using a fusion of ResNet50 and 3D-ResNet50 models, with Squeeze-and-Excitation (SE) blocks to amplify relevant features. To improve the model's generalization, a multimodal extension of Manifold Mixup is applied to the concatenated multimodal features. Experimental results show that the proposed multimodal approach improves DR classification performance over methods that rely on a single modality. The methodology holds promise for more accurate, early detection of DR, potentially improving clinical outcomes for patients.
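The feature-level steps named in the abstract (SE-style gating, concatenation of the two modality embeddings, and Manifold Mixup on the concatenated vector) can be illustrated with a minimal, framework-free sketch. It is not the authors' implementation: the function names, the toy weights, the value of `alpha`, and the choice to gate an already-pooled feature vector are all assumptions made for illustration. In a real model the embeddings would come from the ResNet50 and 3D-ResNet50 backbones.

```python
import math
import random


def se_gate(features, w1, w2):
    """SE-style channel gating on an already-pooled feature vector (assumed placement).

    w1 has shape (reduced, channels); w2 has shape (channels, reduced).
    Excitation is FC -> ReLU -> FC -> sigmoid; each channel is then rescaled.
    """
    hidden = [max(0.0, sum(w * f for w, f in zip(row, features))) for row in w1]
    scales = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
              for row in w2]
    return [f * s for f, s in zip(features, scales)]


def fuse(cfp_features, octa_features):
    """Concatenate the 2D (UWF-CFP) and 3D (OCTA) embeddings into one vector."""
    return list(cfp_features) + list(octa_features)


def manifold_mixup(fused_a, label_a, fused_b, label_b, alpha=2.0, rng=random):
    """Interpolate two fused feature vectors and their one-hot labels.

    The same coefficient lam ~ Beta(alpha, alpha) mixes features and labels.
    alpha=2.0 is an illustrative value, not taken from the paper.
    """
    lam = rng.betavariate(alpha, alpha)
    mixed = [lam * a + (1.0 - lam) * b for a, b in zip(fused_a, fused_b)]
    mixed_label = [lam * a + (1.0 - lam) * b for a, b in zip(label_a, label_b)]
    return mixed, mixed_label, lam


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy embeddings standing in for backbone outputs (hypothetical values).
    patient_a = fuse([0.2, 0.9], [0.4])
    patient_b = fuse([0.7, 0.1], [0.8])
    mixed, soft_label, lam = manifold_mixup(
        patient_a, [1.0, 0.0], patient_b, [0.0, 1.0], rng=rng
    )
    print(lam, mixed, soft_label)
```

Mixing after concatenation means a single coefficient is shared by both modalities, so each synthetic sample stays consistent across the UWF-CFP and OCTA parts of the fused vector.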
Acknowledgements
This work takes place within the framework of Evired, an ANR RHU project. It benefits from State aid managed by the French National Research Agency under the “Investissement d’Avenir” program, reference ANR-18-RHUS-0008.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
El Habib Daho, M. et al. (2023). Improved Automatic Diabetic Retinopathy Severity Classification Using Deep Multimodal Fusion of UWF-CFP and OCTA Images. In: Antony, B., Chen, H., Fang, H., Fu, H., Lee, C.S., Zheng, Y. (eds) Ophthalmic Medical Image Analysis. OMIA 2023. Lecture Notes in Computer Science, vol 14096. Springer, Cham. https://doi.org/10.1007/978-3-031-44013-7_2
DOI: https://doi.org/10.1007/978-3-031-44013-7_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-44012-0
Online ISBN: 978-3-031-44013-7
eBook Packages: Computer Science (R0)