Skin lesion image retrieval using transfer learning-based approach for query-driven distance recommendation

https://doi.org/10.1016/j.compbiomed.2021.104825

Highlights

  • ResNet50-based transfer learning makes it possible to learn appropriate deep features.

  • Features are learned relying on common signs of seven skin lesion classes.

  • Creation of ground truth that associates the most adequate distance to each image.

  • The method dynamically predicts the most appropriate distance for any new image.

  • Query-driven distance recommender improves the performance of skin lesion retrieval.

Abstract

Content-Based Dermatological Lesion Retrieval (CBDLR) systems retrieve similar skin lesion images, with a pathology-confirmed diagnosis, for a given query image of a skin lesion. By providing intuitive support to both inexperienced and experienced dermatologists, early diagnosis through CBDLR screening can significantly improve patient survival while reducing treatment cost. To this end, a CBDLR system is proposed in this study. This system integrates a similarity measure recommender that dynamically selects the adequate distance metric for each query image. The main contributions of this work are (i) the adoption of deep-learned features according to their performance in classifying skin lesions into seven classes; and (ii) the automatic generation of a ground truth, investigated within the framework of transfer learning, in order to recommend the most appropriate distance for any new query image. The proposed CBDLR system has been exhaustively evaluated using the challenging ISIC2018 and ISIC2019 datasets, and the obtained results show that the proposed system can provide a useful decision aid while offering superior performance. Indeed, it outperforms similar CBDLR systems that adopt standard distances by at least 9% in terms of mAP@K.

Introduction

Medical image repositories have expanded in quantity, content and dimension thanks to unprecedented advances in medical imaging devices, computer performance and network transmission technology [1]. Thus, recent years have witnessed significant interest in designing automated solutions for exploring large medical repositories effectively. Specifically, Content-Based Medical Image Retrieval (CBMIR) consists in retrieving the past cases most similar to a query image (i.e. a new image of an unknown class). By retrieving similar annotated clinical cases, CBMIR systems can assist practitioners in the clinical decision-making process. This can be very beneficial for practitioners, since it increases over time their degree of trust in the predictions made by these CBMIR systems compared to traditional Computer-Aided Diagnosis (CAD) systems. In particular, for image-based skin cancer diagnosis, dermatologists can find out whether the lesion in the query skin image is benign or malignant by analyzing some Common Imaging Signs (CISs) within the retrieved images [2]. Skin cancer, which is the most frequent type of cancer in the world, has risen remarkably during the last four decades. Examples of these cancers are melanoma and basal cell carcinoma, which are caused by accumulated exposure to ultraviolet radiation, weakened immune systems, harmful chemical elements and sunlight during the winter months [3]. Thus, the development of effective tools for the automated retrieval of dermoscopy images has seen impressive growth in the past years [4,5]. The common approach consists in extracting some useful features from the query image in order to retrieve images that have a similar set of features. Generally, feature extraction and similarity measurement play a very important role in the success of Content-Based Dermatological Lesion Retrieval (CBDLR) systems.
This makes feature extraction an extensively investigated topic that deals with the automated RGB-based description of image content [6]. Several feature extraction techniques have been proposed for indexing, annotation and retrieval purposes. These techniques can be classified into two main groups: hand-crafted features and deep-learned features. The first attempts, which are often referred to as hand-crafted techniques, investigated low-level image attributes such as color, texture, shape and spatial relations [7]. However, many modern techniques, which are often referred to as deep-learned techniques, have focused on deep learning architectures in order to automatically extract relevant features reflecting clinical signs of malignancy [8,9]. Deep-learned features outperform hand-crafted ones, but they require large annotated datasets for accurate training and can suffer from overfitting. On the other hand, the similarity measure is based on certain distance measurements between feature vectors, such that two dermatological images separated by a shorter distance are considered more similar than images that are far apart. Despite the large number of studies on similarity assessment [10], no measure can be said to be “perfect”. The majority of CBDLR systems adopt a single similarity metric to retrieve relevant images. However, while this metric can be suitable for many queries, it is sometimes not suitable for others. Similarly, certain measures that are deemed inappropriate for most images have proved to be ideal for some individual queries.

These findings led us to investigate the dynamic selection of the adequate similarity measure according to the type of query. The main goal of the proposed method is to automatically select the appropriate similarity measure in a dynamic manner without having to standardize this choice for all queries. More precisely, inspired by the recent success of deep learning techniques in medical imaging applications [11,12], we propose to improve CBDLR performance by investigating deep-learned features while integrating a similarity measure recommender that is able to dynamically predict the appropriate distance metric for each query. The designed recommender, which is a model obtained by a transfer learning approach, aims to offer a more personalized distance metric for any query image. Thus, the contribution of this study is threefold. 1) Given that image classification and content-based image retrieval are slightly different variations of the same problem, convolutional neural networks have been adopted for learning the appropriate features for retrieving images similar to a query lesion while encoding the higher-level semantics present in dermatological images. The selection of these features is conducted according to their performance in classifying skin lesions into 7 classes of dermatological diseases. 2) We construct a challenging ground truth that associates with each skin lesion image (among a dataset of 3,207 images) the most appropriate distance metric (among 10 standard metrics) for retrieving the most similar cases, diagnostically speaking. 3) In light of the advantages of transfer learning, an effective model is designed in order to recommend the most adequate distance to better identify clinically similar images for any skin lesion image. The returned images aim to support a decision on the query lesion while facilitating the interpretation of results by dermatologists.
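The ground-truth construction in contribution 2 can be illustrated with a minimal sketch. It assumes that "most appropriate" means the distance whose top-K ranking scores the highest average precision for that image; the three distances, the toy features and labels, the AP@K normalization (one common variant) and the names `DISTANCES`, `average_precision_at_k` and `best_distance` are all illustrative assumptions, not the paper's ten metrics or its actual procedure.

```python
import numpy as np

# Illustrative candidate distances (assumption: the paper's 10 metrics are not listed here).
DISTANCES = {
    "euclidean": lambda q, X: np.linalg.norm(X - q, axis=1),
    "manhattan": lambda q, X: np.abs(X - q).sum(axis=1),
    "cosine": lambda q, X: 1.0 - (X @ q) / (np.linalg.norm(X, axis=1) * np.linalg.norm(q)),
}

def average_precision_at_k(relevant, k):
    """AP@K for a rank-ordered relevance list, normalized by min(K, #relevant)."""
    rel = np.asarray(relevant, dtype=float)
    n_rel = min(k, int(rel.sum()))
    if n_rel == 0:
        return 0.0
    top = rel[:k]
    precision_at_i = np.cumsum(top) / np.arange(1, len(top) + 1)
    return float((precision_at_i * top).sum() / n_rel)

def best_distance(idx, features, labels, k=2):
    """Return the distance name with the highest AP@K for image idx, plus all scores."""
    others = np.delete(np.arange(len(features)), idx)  # leave the query image out
    scores = {}
    for name, fn in DISTANCES.items():
        d = fn(features[idx], features[others])
        ranked = others[np.argsort(d, kind="stable")]
        scores[name] = average_precision_at_k(labels[ranked] == labels[idx], k)
    return max(scores, key=scores.get), scores

# Toy feature vectors and class labels (hypothetical data).
features = np.array([[1.0, 0.0], [0.9, 0.1], [5.0, 5.0], [0.0, 1.0]])
labels = np.array([0, 0, 0, 1])
name, scores = best_distance(0, features, labels, k=2)
print(name, scores)
```

On this toy data the cosine distance ranks both same-class images first, while the Euclidean and Manhattan distances rank the other-class image second, so image 0 would be labelled with the cosine distance. Repeating this for every image yields (image, best distance) pairs of the kind a recommender could be trained on.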

The rest of this paper is organized as follows. Section 2 presents a brief synthesis of the related work on CBDLR. The proposed method is described in Section 3. Section 4 summarizes the experimental results while Section 5 provides a summary discussion. Finally, Section 6 outlines a conclusion and some ideas for future research directions.

Section snippets

Related work

The performance of a CBDLR system depends on these two steps: feature extraction and image comparison. The first step consists in representing each image by a set of attributes and the second one lies in evaluating the similarity between the query image features and those of images composing the dataset.
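As a minimal sketch of the image-comparison step, the snippet below ranks a small feature database against a query under a selectable distance. The feature vectors are toy values standing in for extracted features, and the function name `retrieve` and the three distances are illustrative assumptions rather than the systems surveyed in the paper.

```python
import numpy as np

# Illustrative distance functions between a query vector q and database rows X.
DISTANCES = {
    "euclidean": lambda q, X: np.linalg.norm(X - q, axis=1),
    "manhattan": lambda q, X: np.abs(X - q).sum(axis=1),
    "cosine": lambda q, X: 1.0 - (X @ q) / (np.linalg.norm(X, axis=1) * np.linalg.norm(q)),
}

def retrieve(query, database, metric="euclidean", k=3):
    """Return the indices of the k database rows closest to the query."""
    dists = DISTANCES[metric](query, database)
    return np.argsort(dists, kind="stable")[:k]

# Hypothetical 2-D features; real systems would use much longer feature vectors.
database = np.array([[0.0, 1.0], [1.0, 0.0], [0.9, 0.1], [5.0, 5.0]])
query = np.array([1.0, 0.0])

top_euclidean = retrieve(query, database, "euclidean", k=3)
top_cosine = retrieve(query, database, "cosine", k=3)
print(top_euclidean.tolist(), top_cosine.tolist())
```

The two metrics agree on the first two results but disagree on the third: the Euclidean distance prefers the nearby vector at index 0, whereas the cosine distance prefers the distant but similarly oriented vector at index 3. This is the kind of query-dependent difference that motivates choosing the distance per query.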

Proposed method

The flowchart of the proposed CBDLR method is composed of two main phases, namely offline indexing and online retrieval (Fig. 1). In the offline phase, features are extracted from each image, within the investigated large dataset, in order to create a feature database (also referred to as signature image database). In the online phase, features are extracted from the query image, by using the same feature extraction procedure adopted in the offline phase, in order to estimate the similarity

Experimental results

Since the performance of CBDLR systems is strongly related to the ranking of relevant cases retrieved for each query [2], the proposed method aims to learn the appropriate distance leading to pertinent results. The CNN-based model makes it possible to effectively address the problem of more relevant and less relevant differences among the images in order to obtain human-like performance. In Fig. 6, a sample of retrieval results of the proposed CBDLR system is shown for the 7 classes of diseases within the

Discussion

The main objective of this study is to improve CBDLR performance by investigating deep-learned features while integrating a similarity measure recommender that dynamically predicts the appropriate distance metric for each query. A CNN has been adopted to learn effective features for retrieving images similar to a query image, selected according to how well these features classify skin lesions into 7 classes of diseases. In fact, a ResNet50-based transfer learning has been performed using

Conclusion and perspectives

In recent years, the massive increase in skin lesion images has raised the challenge of mining specific images among huge collections, which explains the emergence of CBDLR as an active research topic. In fact, unlike automatic classification tools that present results as a 'black box', CBDLR provides a sorted set of similar images, with a confirmed diagnosis, relative to a given skin lesion image. This can serve to support dermatologists in the diagnosis process, without directly

Declaration of competing interest

There is no conflict of interest.

References (47)

  • A. Maiti et al.

    Computer-aided diagnosis of melanoma: a review of existing knowledge and strategies

    Curr. Med. Imag.

    (2020)
  • F.A. Damian et al.

    Feature selection of non-dermoscopic skin lesion images for nevus and melanoma classification

    Computation

    (2020)
  • J.M. Patel et al.

    A review on feature extraction techniques in Content Based Image Retrieval

  • O. Layode et al.

    Deep learning based integrated classification and image retrieval system for early skin cancer detection

  • N. Pasumarthi et al.

    An empirical study and comparative analysis of Content Based Image Retrieval (CBIR) techniques with various similarity measures

  • M. Owais et al.

    Effective diagnosis and treatment through Content-Based Medical Image Retrieval (CBMIR) by using artificial intelligence

    J. Clin. Med.

    (2019)
  • M.M. Rahman

    A Decision support system for skin cancer recognition with deep feature extraction and multi response linear regression (MLR)-based meta learning

  • N. Dey et al.

    Social group optimization supported segmentation and evaluation of skin melanoma images

    Symmetry

    (2018)
  • G.W. Jiji et al.

    CBI+R: a fusion approach to assist dermatological diagnoses

    Int. J. Image Graph.

    (2021)
  • G.W. Jiji et al.

    A retrieval system to analyse dermatological lesions using feature ortho-normalisation

    J. Exp. Theor. Artif. Intell.

    (2019)
  • J. Kawahara et al.

    Deep features to classify skin lesions

  • V. Pomponiu et al.

    Deepmole: deep neural networks for skin mole lesion classification

  • C. Barata et al.

    A survey of feature extraction in dermoscopy image analysis of skin cancer

    IEEE J. Biomed. Health Inf.

    (2019)