Beyond MobileNet: An Improved MobileNet for Retinal Diseases

Zhu, Wenhui; Qiu, Peijie; Chen, Xiwen; Li, Huayu; Wang, Hao; Lepore, Natasha; Dumitrascu, Oana M.; Wang, Yalin

doi:10.1007/978-3-031-54857-4_5

Wenhui Zhu (ORCID: orcid.org/0009-0000-5207-6283),
Peijie Qiu,
Xiwen Chen,
Huayu Li,
Hao Wang,
Natasha Lepore,
Oana M. Dumitrascu &
Yalin Wang (ORCID: orcid.org/0000-0002-6241-735X)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14563)

Included in the following conference series:

International Conference on Medical Image Computing and Computer-Assisted Intervention

Abstract

Myopic Maculopathy (MM) is the leading cause of severe vision loss or blindness. Deep learning-based automated tools are indispensable in assisting clinicians in diagnosing and monitoring retinal diseases (RD) in modern medicine. Recently, an increasing number of works in this field have used Vision Transformers to achieve state-of-the-art performance, at the cost of more parameters and higher model complexity than Convolutional Neural Networks (CNNs). Such sophisticated designs, however, are prone to overfitting, which limits their advantages in specific medical image analysis tasks. In this work, we argue that a well-calibrated CNN model may mitigate these problems. To this end, we empirically investigated the macro and micro designs of a CNN and its training strategies, starting from a standard MobileNet. Based on this investigation, we propose a lightweight MobileNet training framework equipped with a set of parameters and modules optimized for retinal images. Our model secured third place in the MICCAI MMAC 2023 Challenge - Classification of Myopic Maculopathy. Our software package is available at https://github.com/Retinal-Research/NN-MOBILENET

Author information

Authors and Affiliations

School of Computing and Augmented Intelligence, Arizona State University, Tempe, AZ, USA
Wenhui Zhu & Yalin Wang
McKelvey School of Engineering, Washington University in St. Louis, St. Louis, MO, USA
Peijie Qiu
School of Computing, Clemson University, Clemson, SC, USA
Xiwen Chen & Hao Wang
Department of Electrical and Computer Engineering, The University of Arizona, Tucson, AZ, USA
Huayu Li
Department of Neurology, Mayo Clinic, Phoenix, AZ, USA
Oana M. Dumitrascu
CIBORG Lab, Department of Radiology, Children’s Hospital Los Angeles, Los Angeles, CA, USA
Natasha Lepore

Corresponding author

Correspondence to Wenhui Zhu.

Editor information

Editors and Affiliations

Shanghai Jiao Tong University, Shanghai, China
Bin Sheng
Hong Kong University of Science and Technology, Hong Kong, Hong Kong
Hao Chen
Tsinghua University, Beijing, China
Tien Yin Wong

Cite this paper

Zhu, W. et al. (2024). Beyond MobileNet: An Improved MobileNet for Retinal Diseases. In: Sheng, B., Chen, H., Wong, T.Y. (eds) Myopic Maculopathy Analysis. MICCAI 2023. Lecture Notes in Computer Science, vol 14563. Springer, Cham. https://doi.org/10.1007/978-3-031-54857-4_5

DOI: https://doi.org/10.1007/978-3-031-54857-4_5
Published: 29 February 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-54856-7
Online ISBN: 978-3-031-54857-4
eBook Packages: Computer Science, Computer Science (R0)
