Abstract

Convolutional neural networks are used to classify dermoscopic skin lesion images. Although the high accuracy of deep learning models is well documented, these models often perform poorly on unseen test data when the image classes are imbalanced. To tackle this problem, this study explores over-sampling and under-sampling methods. Part 1 of the study details these sampling techniques, while Part 2 presents the architecture of the deep learning model and its performance under both sampling approaches. The results of Part 1 show that unsupervised learning techniques, namely Hierarchical Clustering, Self-Organizing Maps, and K-Means, cluster similar images based on the shape and color of the skin lesions. With augmentation-based oversampling, the training set comprises 32,731 images in total. For undersampling, the unsupervised learning techniques suggested 3 or 4 sub-groups of melanocytic nevi. Inspection of those clusters shows that the image background color also affects how the unsupervised learning techniques group similar images.
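As an illustration only, cluster-based undersampling of an over-represented class can be sketched as follows. This is a minimal sketch and not the authors' pipeline: it assumes each image has already been reduced to a short numeric feature vector (for example, color and shape descriptors), it uses a plain K-Means written with the standard library, and it assumes that the majority class is thinned by drawing the same number of images from each cluster. The function names `kmeans` and `cluster_undersample` and all parameters are hypothetical.

```python
import random
from math import dist


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's K-Means on a list of equal-length feature tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k distinct data points

    def assign():
        # Assignment step: index of the nearest centroid (Euclidean distance).
        return [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]

    for _ in range(iters):
        labels = assign()
        # Update step: move each centroid to the mean of its members;
        # an empty cluster keeps its previous centroid.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return assign()


def cluster_undersample(points, k, target, seed=0):
    """Return sorted indices of at most `target` points, drawn evenly per cluster."""
    labels = kmeans(points, k, seed=seed)
    rng = random.Random(seed)
    per_cluster = target // k  # assumption: equal quota for every cluster
    kept = []
    for c in range(k):
        idx = [i for i, lab in enumerate(labels) if lab == c]
        kept.extend(rng.sample(idx, min(per_cluster, len(idx))))
    return sorted(kept)


if __name__ == "__main__":
    # Hypothetical 2-D features standing in for majority-class image descriptors.
    rng = random.Random(1)
    features = [(rng.gauss(cx, 1.0), rng.gauss(cy, 1.0))
                for cx, cy in [(0, 0), (10, 10), (20, 0)] for _ in range(30)]
    keep = cluster_undersample(features, k=3, target=12)
    print(len(keep), "of", len(features), "majority-class samples kept")
```

Drawing an equal quota from every cluster is one possible design choice: it keeps each visual sub-group represented after the class is thinned, whereas uniform random undersampling could leave a small sub-group with very few images.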


Author information

Corresponding author

Correspondence to Quynh T. Nguyen.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Nguyen, Q.T., Jancic-Turner, T., Kaur, A., Naguib, R.N.G., Sakim, H.A.M. (2024). Sampling Methods to Balance Classes in Dermoscopic Skin Lesion Images. In: Ahmad, N.S., Mohamad-Saleh, J., Teh, J. (eds) Proceedings of the 12th International Conference on Robotics, Vision, Signal Processing and Power Applications. RoViSP 2021. Lecture Notes in Electrical Engineering, vol 1123. Springer, Singapore. https://doi.org/10.1007/978-981-99-9005-4_51
