1 Introduction

Malignant melanoma is the most dangerous form of skin cancer, and early diagnosis of melanoma lesions is therefore a crucial issue for dermatologists. In the past few years, deep learning approaches to melanoma recognition in dermoscopy images have attracted the attention of many researchers. The authors of [1] presented a decision support tool for medical problems based on a deep CNN that characterizes melanoma lesion features. They proposed a CNN structure together with a way to cope with an insufficient amount of training data, and obtained a classification rate of 84%. A knowledge transfer approach using a deep CNN was suggested by the authors of [2] to overcome the limitations of a small training dataset; it reaches an AUC of 80.7% and 84.5% on the two skin-lesion datasets evaluated. The authors of [3] developed a detection approach for malignant melanoma based on deep learning algorithms to classify and predict suspected lesions. Contrast is enhanced using contrast limited adaptive histogram equalization (CLAHE) and a median filter. The system was tested and validated on nearly 992 images and provides a classification accuracy of 93%. A deep learning method combined with hand-crafted RSurf features and Local Binary Patterns (LBP) was used to classify skin lesions as benign or malignant [4]; in that study, the authors transformed the image data into a feature vector by combining hand-crafted features with a CNN. Multi-scale feature extraction and classification of skin lesions based on a deep CNN was proposed in [5], achieving an accuracy of 81.8%. A deep learning algorithm combined with sparse coding and a support vector machine (SVM) was suggested by the authors of [6] to characterize and recognize melanoma lesions in dermoscopy images. The images were obtained from the International Skin Imaging Collaboration (ISIC), and the recognition system reached promising results.

However, medical imaging datasets generally contain a very restricted number of samples. Hence, one of the major challenges in building a deep CNN architecture for dermoscopic images arises from the limited number of training images available to fit CNN models without overfitting. One way to tackle this weakness is weight filter transfer, in which the weights of pre-trained convolutional layers are adapted to the new target task. Our convolutional layer adaptation system aims to resolve the challenge of insufficient training data in the dermoscopic domain, and offers a possible way to improve performance on a new task with a small training dataset. To prepare dermoscopic images for our melanoma recognition system, we need pre-processing techniques that balance high visual quality against low computational time, effort, and complexity. Ours are based on a down-sampling operation that rescales all input images and on adaptive gamma correction that enhances the brightness of dermoscopic images.

The remainder of this paper is organized as follows. In Sect. 2, we detail the proposed approach. In Sect. 3, we present the results of experiments conducted on the ISIC dataset. Finally, we conclude the paper in Sect. 4.

2 Melanoma Recognition System

We propose an adapted CNN model that learns weight filters in a non-medical domain and transfers them to melanoma recognition in the target domain. The proposed process is implemented in four steps: (1) a down-sampling operation that reduces the image size for the network and ensures scale normalization; (2) contrast enhancement of the input dermoscopic images using adaptive gamma correction; (3) feature extraction based on the adaptation of convolutional layers; (4) classification with a Softmax layer, which is the final layer of our deep network and is used to train the CNN for classification. With this melanoma recognition system, performance can be greatly improved without requiring an extensive dataset for the training step. A flow chart of the proposed model is given in Fig. 1.

Fig. 1. Flow chart of the proposed method.

2.1 Data Pre-processing

Down-Sampling.

The original ISIC skin dataset contains over 2 000 images of different resolutions (from 1 022 × 767 to 6 748 × 4 499). To reach the input size required by the pre-trained source CNN, the dermoscopic images must be rescaled to 224 × 224 pixels. We first crop the center area of each lesion image and then apply the pyramidal down-sampling process described in [7] to resize all images to 224 × 224 pixels.
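The crop-then-downsample step can be illustrated with a minimal NumPy sketch. It is not the procedure of [7], whose details are not given here. As assumptions, it uses 2 × 2 block averaging for each pyramid level and a nearest-neighbour resampling for the final step, and the function names are hypothetical.

```python
import numpy as np

def center_crop_square(img):
    """Crop the largest centered square from an H x W x C image."""
    h, w = img.shape[:2]
    side = min(h, w)
    top, left = (h - side) // 2, (w - side) // 2
    return img[top:top + side, left:left + side]

def halve(img):
    """One pyramid level: average each non-overlapping 2 x 2 block."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2  # drop an odd last row/column
    img = img[:h, :w].astype(np.float64)
    return (img[0::2, 0::2] + img[1::2, 0::2] + img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def downsample(img, target=224):
    """Center-crop, halve while the side is at least 2 * target, then resample."""
    img = center_crop_square(img).astype(np.float64)
    while img.shape[0] >= 2 * target:
        img = halve(img)
    # Assumption: nearest-neighbour sampling on an evenly spaced grid for the last step.
    idx = np.round(np.linspace(0, img.shape[0] - 1, target)).astype(int)
    return img[np.ix_(idx, idx)]

# Smallest resolution quoted above (width 1 022, height 767), filled with a constant.
small = downsample(np.full((767, 1022, 3), 7.0))
print(small.shape)
```

For the 1 022 × 767 case, the crop gives 767 × 767, one halving gives 383 × 383, and the final resampling produces 224 × 224.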

Intensity Correction.

As can be seen in Fig. 2, adaptive gamma correction with brightness preservation [8] is applied to improve visual quality for better recognition of melanoma lesions. The three channels of the dermoscopic images (R, G, B) are used as input to our CNN model. To speed up convergence while training the network, we apply pixel normalization: the mean is subtracted from each pixel intensity and the result is divided by the standard deviation, so that the data of each channel (R, G, B) are centered on zero. This ensures that all input pixels have a similar data distribution, which makes convergence faster and more efficient.
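The pixel normalization described above can be sketched as follows. The text does not state whether the statistics are computed per image or over the whole training set. This sketch assumes per-channel statistics over a batch of images, and the function name and the small `eps` guard are hypothetical.

```python
import numpy as np

def standardize_per_channel(images, eps=1e-8):
    """Zero-mean, unit-variance scaling of each colour channel.

    images : float array of shape N x H x W x 3 (R, G, B last)
    Returns the normalized images plus the mean and standard deviation used,
    so the same statistics can be reapplied to held-out images.
    """
    mean = images.mean(axis=(0, 1, 2), keepdims=True)
    std = images.std(axis=(0, 1, 2), keepdims=True)
    # eps (an assumption) avoids division by zero for a constant channel.
    return (images - mean) / (std + eps), mean, std

rng = np.random.default_rng(0)
batch = rng.uniform(0.0, 255.0, size=(4, 224, 224, 3))
normalized, mean, std = standardize_per_channel(batch)
print(normalized.mean(axis=(0, 1, 2)), normalized.std(axis=(0, 1, 2)))
```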

Fig. 2. The enhanced dermoscopic digital images using adaptive gamma correction.

2.2 Convolutional Weights Adaptation Networks

The main contribution of this study is a novel approach for evaluating the transferability of convolutional weights from the lower layers of a pre-trained CNN, which reveals their generality or specificity. We propose an efficient method for finding a feature extractor mapping based on learnt kernels adapted to the dermoscopic domain. The convolutional kernels are estimated by identifying a set of weight filters from a CNN pre-trained on a large dataset. This set can be written as K = {W(1,1), W(1,2), …, W(i,j)}, where W(i,j) is a pre-trained weight filter of the CNN model, and i and j are the layer index and the weight filter index respectively, with i = {1,2,3}, j = {1,…,n}, and n the number of weight filters in the convolutional layer. The proposed convolutional weights adaptation is then defined as follows:

$$ {\text{Y}}_{{({\text{i}},{\text{j}})}} = {\text{W}}_{{({\text{i}},{\text{j}})}} \otimes {\text{X}}_{\text{i}} $$
(1)

where Y(i,j) is the feature map of convolutional layer i filtered by the j-th kernel, \( \otimes \) denotes the convolution operator, and Xi is the input of convolutional layer i.
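Equation (1) can be made concrete with a small NumPy sketch for a single kernel. As is common in CNN frameworks, it computes the operation without flipping the kernel (cross-correlation). Stride 1, no padding, and no bias term are simplifying assumptions, and the function name is hypothetical.

```python
import numpy as np

def conv2d_valid(x, w):
    """Feature map Y = W (x) X for one kernel.

    x : layer input of shape H x W x C
    w : one weight filter of shape k x k x C
    Returns a map of shape (H - k + 1) x (W - k + 1).
    """
    k = w.shape[0]
    out_h, out_w = x.shape[0] - k + 1, x.shape[1] - k + 1
    y = np.empty((out_h, out_w))
    for r in range(out_h):
        for c in range(out_w):
            # Sum over the k x k window and over all input channels.
            y[r, c] = np.sum(x[r:r + k, c:c + k, :] * w)
    return y

x = np.ones((5, 5, 3))   # toy input X_i
w = np.ones((3, 3, 3))   # toy filter W_(i,j)
print(conv2d_valid(x, w))
```

Each output entry of this toy example sums 3 × 3 × 3 = 27 ones, so the 3 × 3 feature map is filled with 27.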

In this study, a model trained on the ImageNet dataset with the AlexNet architecture proposed in [9] is used as the source domain. As can be seen in Table 1 and Fig. 3, AlexNet consists of eight layers: the first five are convolutional (C1, C2, C3, C4, and C5) and the remaining three are fully connected (FC6, FC7, and FC8).

Table 1. Specifications of the proposed architecture.
Fig. 3. The flow chart of proposed convolutional adaptation network.

In general, the search for the convolutional weights to transfer from pre-trained lower layers is motivated by the observation that the weights of the earlier convolutional layers encode more generic features, such as edge orientations, color blotches and simple blob-like image structures, which can be useful for many tasks and domains. By contrast, the deeper layers become gradually more specific to the details of the patterns contained in the target dataset. After rigorous tests and several experiments, we found that the best convolutional layer adaptation is obtained by transferring the pre-trained weights to the first three convolutional layers and adjusting or training the weight filters of the last five layers.
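The split between transferred and trained layers can be sketched with plain dictionaries of NumPy arrays. This is an illustration under stated assumptions rather than the authors' Caffe configuration: the transferred layers are assumed to stay fixed during training, the remaining layers are assumed to start from a small Gaussian initialization, and the dictionary layout, function name and toy shapes are hypothetical.

```python
import numpy as np

# Layer labels follow the text (C1..C5, FC6..FC8); the dict layout is hypothetical.
TRANSFERRED = ("C1", "C2", "C3")
RETRAINED = ("C4", "C5", "FC6", "FC7", "FC8")

def adapt_weights(pretrained, target_shapes, seed=0):
    """Return (weights, trainable) for the target network.

    pretrained    : dict, layer name -> source-domain weight array
    target_shapes : dict, layer name -> weight shape needed by the target network
    """
    rng = np.random.default_rng(seed)
    weights, trainable = {}, {}
    for name in TRANSFERRED:
        # Transferred filters W(i,j), i in {1, 2, 3}: copied unchanged.
        weights[name] = pretrained[name].copy()
        trainable[name] = False  # assumption: kept fixed during training
    for name in RETRAINED:
        # Assumption: small Gaussian initialization, then trained on dermoscopic data.
        weights[name] = rng.normal(0.0, 0.01, size=target_shapes[name])
        trainable[name] = True
    return weights, trainable

# Toy shapes only; they are not the AlexNet dimensions.
shapes = {"C1": (4, 3, 3, 3), "C2": (4, 4, 3, 3), "C3": (4, 4, 3, 3),
          "C4": (4, 4, 3, 3), "C5": (4, 4, 3, 3),
          "FC6": (8, 16), "FC7": (8, 8), "FC8": (2, 8)}
source = {name: np.ones(shape) for name, shape in shapes.items()}
weights, trainable = adapt_weights(source, shapes)
print({name: flag for name, flag in trainable.items()})
```

The last layer in the toy example has two outputs, matching the two classes (melanoma and benign nevus).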

3 Experimental Results and Analysis

3.1 Experimental Setting

Description of Dataset.

The International Skin Imaging Collaboration (ISIC) is an international effort to improve melanoma diagnosis [10], which has recently begun to aggregate a publicly accessible dataset of dermoscopy images. These dermoscopy images can be used for the research, development and comparison of algorithms for identifying melanoma. To build our proposed model, we used 2 000 dermoscopic digital images from the ISIC dataset: 1 000 images of benign nevi and 1 000 images of melanoma lesions.

Implementation Details.

Our proposed method was developed using the Caffe deep learning framework with its Python wrapper, and the Compute Unified Device Architecture (CUDA) parallel computing platform to access the computational resources of the Graphics Processing Unit (GPU). The hardware used for training is an NVIDIA GeForce GTX 1080 Ti with 12 GB memory. Training took on average about 5 min and 37 s.

Network Configuration.

In the training phase, we minimized the loss using Stochastic Gradient Descent (SGD) with a momentum of 0.9, a batch size of 64, an initial learning rate of 0.001, and a step learning rate policy with γ = 0.1.
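The optimizer settings above can be illustrated with a short sketch of a step learning rate schedule and a momentum update, written in the form used by Caffe's SGD solver. The step size (number of iterations between learning rate drops) is not reported in the text, so the value 1000 below is purely hypothetical, as are the function names.

```python
def step_lr(iteration, base_lr=0.001, gamma=0.1, stepsize=1000):
    """Step policy: multiply the rate by gamma every `stepsize` iterations.

    stepsize=1000 is a placeholder; the paper does not report this value.
    """
    return base_lr * gamma ** (iteration // stepsize)

def sgd_momentum_step(w, v, grad, lr, momentum=0.9):
    """One SGD-with-momentum update: v <- momentum * v - lr * grad, w <- w + v."""
    v = momentum * v - lr * grad
    return w + v, v

# Toy usage: minimize f(w) = w^2, whose gradient is 2 * w.
w, v = 1.0, 0.0
for it in range(3000):
    w, v = sgd_momentum_step(w, v, grad=2.0 * w, lr=step_lr(it))
print(step_lr(0), step_lr(1000), step_lr(2000), w)
```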

3.2 Classification Results

To obtain a reliable estimate of the performance of our recognition system, 5-fold cross validation was used to train and evaluate the model. Performance is reported with statistical measures such as accuracy, sensitivity, specificity and error. The experimental results in Table 2 show that our proposed process achieved an average classification accuracy of 97.40%, with an average sensitivity of 97.50% and an average specificity of 97.30%.
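For reference, these measures follow from the confusion matrix of each fold. The sketch below uses the standard definitions, with melanoma as the positive class. The function name and the toy labels are hypothetical and are not taken from the experiments.

```python
def binary_metrics(y_true, y_pred):
    """Labels: 1 = melanoma (positive), 0 = benign nevus (negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    return {
        "accuracy": accuracy,
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "error": 1.0 - accuracy,
    }

# Toy fold with four melanoma and four benign images.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
print(binary_metrics(y_true, y_pred))
```

When both classes are equally represented, as in the 1 000 / 1 000 dataset, accuracy equals the mean of sensitivity and specificity. The reported averages are consistent with this: (97.50% + 97.30%) / 2 = 97.40%.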

Table 2. Validity assessment measures with 5-fold cross validation.

Fig. 4 plots the receiver operating characteristic (ROC) curves for the five folds of cross validation together with their AUC values. The AUC results for the five folds are listed in Table 3.
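As a reminder of what the AUC measures, it equals the probability that a randomly chosen melanoma image receives a higher score than a randomly chosen benign image, with ties counted as one half. A minimal pairwise sketch of this definition follows. The function name and the toy scores are hypothetical, and the quadratic loop is meant for illustration rather than for large folds.

```python
def auc_pairwise(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(scores_pos) * len(scores_neg))

# Toy melanoma probabilities for two melanoma and two benign images.
print(auc_pairwise([0.9, 0.4], [0.5, 0.1]))
```

In the toy example three of the four pairs are ranked correctly (only 0.4 versus 0.5 is not), so the result is 0.75.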

Fig. 4. The ROC for melanoma recognition for 5-fold cross validation.

Table 3. The area under the curve results for 5-fold cross validation.

For a dependable comparison and reliable clinical research, the same database (ISIC) and the same validation metric (AUC) must be used when comparing our methodology with existing ones. The comparison given in Table 4 shows that our best-performing network outperforms the other methods.

Table 4. Comparison with state-of-the-art.

4 Conclusion and Future Work

In this work, we have presented a novel discriminative feature extraction method for practical melanoma recognition. The features are generated by combining the weights transferred from the three lower layers of a pre-trained network with the weights trained on dermoscopic data in the higher layers of our CNN, in order to classify skin tissue as malignant melanoma or benign nevus. We have experimentally demonstrated how data pre-processing techniques, weight filter adaptation and network configuration can give high-quality analysis for melanoma recognition and handle the problem of having a small set of dermoscopic data samples. The comparison of our proposed model with existing methods shows that it outperforms the most representative state-of-the-art strategies. We therefore conclude that our method may assist dermatologists in making the final decision on further treatment.

As future work, we plan to exploit the results of our analysis to design new discriminative features for other recognition tasks in the medical imaging field.