Abstract
Myopic Maculopathy (MM) is a leading cause of severe vision loss or blindness. Deep learning-based automated tools are indispensable in assisting clinicians in diagnosing and monitoring retinal diseases (RD) in modern medicine. Recently, an increasing number of works in this field have taken advantage of Vision Transformers to achieve state-of-the-art performance, at the cost of more parameters and higher model complexity than Convolutional Neural Networks (CNNs). Such sophisticated model designs, however, are prone to overfitting, which limits their advantages in specific medical image analysis tasks. In this work, we argue that a well-calibrated CNN model may mitigate these problems. To this end, we empirically investigated the macro and micro designs of a CNN and its training strategies, starting from a standard MobileNet. Based on this investigation, we propose a lightweight MobileNet training framework equipped with a set of parameters and modules optimized for retinal images. Our model secured third place in the MICCAI MMAC 2023 Challenge - Classification of Myopic Maculopathy. Our software package is available at https://github.com/Retinal-Research/NN-MOBILENET
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Zhu, W. et al. (2024). Beyond MobileNet: An Improved MobileNet for Retinal Diseases. In: Sheng, B., Chen, H., Wong, T.Y. (eds) Myopic Maculopathy Analysis. MICCAI 2023. Lecture Notes in Computer Science, vol 14563. Springer, Cham. https://doi.org/10.1007/978-3-031-54857-4_5
DOI: https://doi.org/10.1007/978-3-031-54857-4_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-54856-7
Online ISBN: 978-3-031-54857-4
eBook Packages: Computer Science, Computer Science (R0)