Logarithmic Progressive-SMOTE: Oversampling Minorities in Retinal Fundus Multi-disease Image Dataset

  • Conference paper
Computer Vision and Image Processing (CVIP 2022)

Abstract

Multiple retinal diseases frequently co-occur. It is extremely challenging to diagnose several diseases in a single image with high accuracy and to provide appropriate treatment for them. Classification performance on an imbalanced dataset containing minority classes can be improved. The Retinal Fundus Multi-Disease Image Dataset (RFMiD) is a Multi-labeled Imbalanced Dataset (MLID) of 3200 images annotated with 46 different disease labels. The proposed Logarithmic Progressive Synthetic Minority Oversampling Technique (LP-SMOTE) is intended to oversample the minority classes in an imbalanced dataset. The Imbalance Ratio Per Label (IRPL) and the Mean Imbalance Ratio (MeanIR) are computed and used to distinguish minority from majority classes. Logarithmic progressive sampling is then applied per label to the minority classes, and oversampling raises the number of samples in those classes. The proposed technique increased the total number of minority-class samples by a factor of 5.4. After applying LP-SMOTE, the overall validation accuracy and test accuracy improved by 2.81% and 4.53%, respectively, with a pre-trained VGG19 model.
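
The abstract names IRPL and MeanIR but does not give their formulas. The sketch below is a minimal illustration that assumes the definitions commonly used in the multi-label imbalance literature (for example by Charte et al.): the ratio for a label is the positive count of the most frequent label divided by the positive count of that label, MeanIR is the average of those ratios, and a label counts as a minority when its ratio exceeds MeanIR. The function names, the toy label matrix, and the choice to leave labels with no positive samples out of the mean are assumptions made for illustration. The sketch does not reproduce the paper's logarithmic progressive sampling step, which the abstract does not specify.

```python
from typing import List, Sequence


def label_counts(label_matrix: Sequence[Sequence[int]]) -> List[int]:
    """Count positive samples per label in a binary multi-label matrix."""
    return [sum(column) for column in zip(*label_matrix)]


def imbalance_ratio_per_label(counts: Sequence[int]) -> List[float]:
    """Assumed IRPL: count of the most frequent label / count of each label."""
    max_count = max(counts)
    # A label with no positive samples gets an infinite ratio.
    return [max_count / c if c > 0 else float("inf") for c in counts]


def mean_imbalance_ratio(irpl: Sequence[float]) -> float:
    """Assumed MeanIR: average IRPL over labels with at least one sample."""
    finite = [r for r in irpl if r != float("inf")]
    return sum(finite) / len(finite)


def minority_labels(irpl: Sequence[float], mean_ir: float) -> List[int]:
    """Indices of labels whose IRPL exceeds MeanIR (treated as minorities)."""
    return [i for i, r in enumerate(irpl) if r > mean_ir]


# Hypothetical toy data: 8 images, 3 disease labels (not taken from RFMiD).
toy_labels = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 1, 0],
    [1, 1, 0],
    [1, 0, 0],
    [1, 0, 0],
    [1, 0, 0],
    [1, 0, 0],
]

counts = label_counts(toy_labels)          # positives per label
irpl = imbalance_ratio_per_label(counts)   # ratio per label
mean_ir = mean_imbalance_ratio(irpl)       # average ratio
minority = minority_labels(irpl, mean_ir)  # labels that would be oversampled

print(counts, irpl, round(mean_ir, 3), minority)
```

In this toy case only the third label, with a single positive image, has a ratio above the mean, so it is the only one flagged for oversampling.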

Acknowledgements

We extend our thanks to the Technical Education Quality Improvement Program (TEQIP), a World Bank project, for providing a state-of-the-art Center of Excellence in Signal and Image Processing research lab. We also thank the authors of “Retinal Fundus Multi-Disease Image Dataset (RFMiD): A Dataset for Multi-Disease Detection Research” for making the RFMiD dataset publicly available.

Author information

Corresponding author

Correspondence to Sachin Panchal.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Panchal, S., Kokare, M. (2023). Logarithmic Progressive-SMOTE: Oversampling Minorities in Retinal Fundus Multi-disease Image Dataset. In: Gupta, D., Bhurchandi, K., Murala, S., Raman, B., Kumar, S. (eds) Computer Vision and Image Processing. CVIP 2022. Communications in Computer and Information Science, vol 1776. Springer, Cham. https://doi.org/10.1007/978-3-031-31407-0_29

  • DOI: https://doi.org/10.1007/978-3-031-31407-0_29

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-31406-3

  • Online ISBN: 978-3-031-31407-0

  • eBook Packages: Computer Science, Computer Science (R0)
