Abstract
Convolutional neural networks are used to classify dermoscopic skin lesion images. The high accuracy of deep learning models is well documented; however, these models do not perform well on test sets (unseen data) when the image classes are imbalanced. To tackle this problem, this study explores over-sampling and under-sampling methods. Part 1 of the study focuses on the details of these sampling techniques, while Part 2 highlights the architecture of the deep learning model and its performance under both sampling approaches. The results of Part 1 show that unsupervised learning techniques, namely Hierarchical Clustering, Self-Organizing Maps, and K-Means, cluster similar images based on the shape and color of the skin lesions. With augmentation-based over-sampling, the training set comprises 32,731 images in total. For under-sampling, the unsupervised learning techniques suggest 3 or 4 sub-groups of melanocytic nevi. Inspection of those clusters shows that the image background color also affects how the unsupervised learning techniques group similar images together.
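As a rough illustration of the cluster-based under-sampling idea described above, the following minimal sketch groups a majority class with K-Means and then keeps an equal share of images from each cluster. It is not the authors' pipeline: the per-image feature vector (for example, a mean RGB color), the equal quota per cluster, and the function names `kmeans` and `cluster_undersample` are all assumptions made for illustration. Only `k=3` is taken from the abstract, which mentions 3 or 4 sub-groups.

```python
import random
from collections import defaultdict


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means on a list of equal-length float tuples.

    Returns one cluster label per point. Hypothetical helper, written
    in pure Python so the sketch has no third-party dependencies.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k distinct points
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # leave an empty cluster's centroid unchanged
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels


def cluster_undersample(features, target, k=3, seed=0):
    """Return sorted indices of at most `target` samples, spread across clusters.

    `features` is a list of per-image feature tuples (assumed here to be
    something like mean RGB color). The equal per-cluster quota is an
    assumption, not a rule stated in the paper.
    """
    labels = kmeans(features, k, seed=seed)
    by_cluster = defaultdict(list)
    for idx, lab in enumerate(labels):
        by_cluster[lab].append(idx)

    rng = random.Random(seed)
    per_cluster = max(1, target // len(by_cluster))
    kept = []
    for lab in sorted(by_cluster):
        members = by_cluster[lab]
        kept.extend(rng.sample(members, min(per_cluster, len(members))))
    return sorted(kept)
```

Calling `cluster_undersample(features, target=1000, k=3)` on the majority class would return the indices of the images to keep. Sampling per cluster rather than uniformly at random is meant to preserve the visual variety of the class (shape, color, and, as the abstract notes, background color) while reducing its size.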
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Nguyen, Q.T., Jancic-Turner, T., Kaur, A., Naguib, R.N.G., Sakim, H.A.M. (2024). Sampling Methods to Balance Classes in Dermoscopic Skin Lesion Images. In: Ahmad, N.S., Mohamad-Saleh, J., Teh, J. (eds) Proceedings of the 12th International Conference on Robotics, Vision, Signal Processing and Power Applications. RoViSP 2021. Lecture Notes in Electrical Engineering, vol 1123. Springer, Singapore. https://doi.org/10.1007/978-981-99-9005-4_51
DOI: https://doi.org/10.1007/978-981-99-9005-4_51
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-9004-7
Online ISBN: 978-981-99-9005-4
eBook Packages: Intelligent Technologies and Robotics (R0)