Classification of melanomas in situ using knowledge discovery with explained case-based reasoning

https://doi.org/10.1016/j.artmed.2010.09.001

Abstract

Objective

Early diagnosis of melanoma is based on the ABCD rule, which considers asymmetry, border irregularity, color variegation, and a diameter larger than 5 mm as the characteristic features of melanomas. When a skin lesion presents these features, it is excised as a preventive measure. Using a non-invasive technique called dermoscopy, dermatologists can evaluate skin lesions more accurately and can therefore avoid excising benign lesions. However, dermatologists need to achieve a good dermatoscopic classification of lesions prior to extraction. In this paper we propose a procedure called LazyCL to support dermatologists in assessing the classification of skin lesions. Our goal is to use LazyCL to generate a domain theory for classifying melanomas in situ.

Methods

To generate a domain theory, the LazyCL procedure combines two artificial intelligence techniques: case-based reasoning and clustering. First, LazyCL randomly creates clusters and then applies a lazy learning method called lazy induction of descriptions (LID) to them using leave-one-out. By means of LID, LazyCL collects explanations of why the cases in the database should belong to a class. The analysis of relationships among explanations then produces an understandable clustering of the dataset. After a process of eliminating redundancies and merging clusters, the set of explanations is reduced to a subset that describes classes that are “almost” discriminant. The remaining explanations form a preliminary domain theory that is the basis on which experts can perform knowledge discovery.

Results

We performed two kinds of experiments. The first consisted of applying LazyCL to a database containing the descriptions of 76 melanomas. The domain theory obtained from these experiments was compared with the results of previous experiments performed using a different clustering method, self-organizing maps (SOM). The results of both methods, LazyCL and SOM, were similar.

The second kind of experiment consisted of applying LazyCL to well-known domains from the machine learning repository of the University of California, Irvine. Since these domains have known solution classes, we can verify that the clusters built by LazyCL are correct.

Conclusions

We conclude that LazyCL, which uses explained case-based reasoning for knowledge discovery, is feasible for constructing a domain theory. On the one hand, experiments on the melanoma database show that the domain theory built by LazyCL is easy to understand. Explanations provided by LID are easily understood by domain experts, since these descriptions involve the same attributes that the experts use to represent domain objects. On the other hand, experiments on standard machine learning data sets show that LazyCL is a good clustering method, since all the clusters produced are correct.

Introduction

Early diagnosis and surgical excision are the main goals in the secondary prevention of cutaneous melanoma. Nowadays, the diagnosis of melanoma is based on the ABCD rule [1], which considers four clinical features commonly observed in this kind of tumour: asymmetry, border irregularity, color variegation, and a diameter larger than 5 mm. Although most melanomas are correctly diagnosed following this rule, a variable proportion of melanomas do not comply with these criteria. The current procedure when a suspicious skin lesion appears is to excise it and then analyse it by means of a biopsy. The result of the biopsy usually allows an accurate determination of the malignancy of the lesion.

Dermoscopy is a non-invasive technique, introduced by dermatologists two decades ago, that provides a more accurate evaluation of skin lesions and can therefore avoid the excision of benign lesions. However, dermatologists need to achieve a good dermatoscopic classification of lesions prior to extraction [2]. Hofmann-Wellenhof et al. [3] suggested a classification of benign melanocytic lesions. Argenziano et al. [4] hypothesized that dermoscopic classification may be better than the classical clinicopathological classification of benign melanocytic lesions (nevi). Currently, there is no dermoscopic classification for melanomas located on the trunk and extremities. In the era of genetic profiling, molecular studies including microarrays suggest that there is more than one type of melanoma in these sites. The aim of the present study is to help dermatologists classify early melanomas (melanomas in situ) based on dermoscopic characteristics. Dermatologists have defined several dermatoscopic classes of melanoma in situ based on dermatoscopic features. Dermatopathologists also suggest another classification based on histological features. In particular, our approach consists of using case-based reasoning inside a new knowledge discovery procedure in order to provide dermatologists with a classification theory for melanomas.

Case-based reasoning (CBR) methods predict the classification of a problem based on its similarity to already solved cases. One of the key points of CBR systems is the measure used to assess the similarity between cases, since the final classification of a new problem depends on it. Related to this issue is the fact that results should be clearly understood by the system's user; otherwise, the user may not be fully convinced of the results produced by the system. For this reason, in recent years there has been increasing interest in approaches aimed at explaining CBR results in a satisfactory way (see [5]). One of these approaches is the lazy induction of descriptions (LID) method we introduced in [6]. During the problem solving process, LID builds an explanation justifying the classification of a new problem (see Section 2.1). This explanation is, in fact, a generalization of the relevant attributes shared by both the problem and the cases. In [7] we argued that generalizations can be seen as explanations since they commonly contain problem features useful for classifying problems. This is the case of prototypes from PROTOS [8], generalized cases [9], and lazy decision trees [10]. Also, explanation-based learning (EBL) methods [11] generalize a particular example to obtain a domain rule that can be used for solving unseen problems. Our point is that explanations produced by lazy learning methods like LID should be considered as domain rules in the same way as generalizations are. Thus, the set of explanations could be considered as a lazy domain theory.
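To make the idea of an explanation as a generalization concrete, the following is a minimal sketch of an LID-style procedure, not the published LID algorithm. It assumes cases are flat attribute-value dictionaries and, as a stand-in for the attribute-selection heuristic of the actual method, greedily picks the attribute whose matching cases are purest in class. The function name `explain` and the attribute names in the example are hypothetical.

```python
from collections import Counter


def explain(problem, cases):
    """Build a description shared by `problem` and a class-homogeneous set of cases.

    `problem` is a dict attribute -> value; `cases` is a non-empty list of
    (attributes_dict, class_label) pairs. Returns (description, predicted_class).
    NOTE: the purity heuristic below is an illustrative assumption, not the
    measure used by the original LID method.
    """
    if not cases:
        raise ValueError("at least one solved case is required")

    description = {}          # attribute -> value, the growing explanation
    retained = list(cases)    # cases that still satisfy the description
    remaining = set(problem)  # attributes of the problem not yet used

    # Specialise the description until the retained cases share one class.
    while remaining and len({label for _, label in retained}) > 1:
        best = None  # (purity, attribute, subset)
        for attr in sorted(remaining):
            subset = [(c, l) for c, l in retained if c.get(attr) == problem[attr]]
            if not subset:
                continue
            counts = Counter(l for _, l in subset)
            purity = counts.most_common(1)[0][1] / len(subset)
            if best is None or purity > best[0]:
                best = (purity, attr, subset)
        if best is None:
            break  # no attribute of the problem matches any retained case
        _, attr, subset = best
        description[attr] = problem[attr]
        retained = subset
        remaining.discard(attr)

    predicted = Counter(l for _, l in retained).most_common(1)[0][0]
    return description, predicted


# Hypothetical toy cases; attribute names are illustrative only.
cases = [
    ({"pigment_network": "atypical", "dots": "irregular"}, "A"),
    ({"pigment_network": "atypical", "dots": "absent"}, "B"),
    ({"pigment_network": "typical", "dots": "irregular"}, "A"),
    ({"pigment_network": "typical", "dots": "absent"}, "B"),
]
problem = {"pigment_network": "atypical", "dots": "irregular"}
print(explain(problem, cases))  # -> ({'dots': 'irregular'}, 'A')
```

The returned description plays the role of a local rule: it mentions only attributes of the problem, so an expert can read it in the same vocabulary used to describe the cases.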

Commonly, domain theories are built using eager learning methods (such as ID3 [12]). Eager learning methods build discriminant descriptions for classes, and so the union of these discriminant descriptions covers all the space of known examples. In contrast, lazy domain theories cover only zones around each new problem; therefore, this may result in “holes” in the description of the domain (see Fig. 1). In [13] we compared lazy domain theories formed by sets of explanations from LID with the eager theory built by the ID3 method [12]. In our experiments we showed that, for some domains, eager and lazy theories have similar predictive ability. The difference is that because the explanations that make up the lazy domain theory are more specific than eager rules, there is a high percentage of unseen problems that the lazy theory cannot classify although the classification, when it is proposed, is usually correct.
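The effect of these “holes” can be illustrated with a small sketch. It assumes a lazy theory is simply a list of (description, class) rules over attribute-value dictionaries and reports two numbers: the fraction of problems that some rule covers, and the accuracy on the covered problems only. The function names and the toy data are hypothetical and are not taken from the experiments in [13].

```python
def classify_with_theory(problem, theory):
    """Return the class of the first rule satisfied by `problem`, or None."""
    for description, label in theory:
        if all(problem.get(attr) == value for attr, value in description.items()):
            return label
    return None  # the problem falls into a "hole" of the lazy theory


def coverage_and_accuracy(problems, theory):
    """Return (fraction of problems classified, accuracy on those classified)."""
    answered = 0
    correct = 0
    for attrs, true_label in problems:
        predicted = classify_with_theory(attrs, theory)
        if predicted is None:
            continue
        answered += 1
        if predicted == true_label:
            correct += 1
    coverage = answered / len(problems) if problems else 0.0
    accuracy = correct / answered if answered else 0.0
    return coverage, accuracy


# Two specific rules: each covers only a small zone of the attribute space.
theory = [
    ({"color": "red", "shape": "round"}, "A"),
    ({"color": "blue", "shape": "square"}, "B"),
]
problems = [
    ({"color": "red", "shape": "round"}, "A"),
    ({"color": "blue", "shape": "square"}, "B"),
    ({"color": "red", "shape": "square"}, "A"),   # not covered by any rule
    ({"color": "blue", "shape": "round"}, "B"),   # not covered by any rule
]
print(coverage_and_accuracy(problems, theory))  # -> (0.5, 1.0)
```

In this toy setting the theory abstains on half of the problems but is right whenever it answers, which mirrors the qualitative behaviour described above.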

In the current paper we exploit the concept of lazy domain theory for knowledge discovery. Although lazy domain theories are formed by local rules, this information is very valuable to experts to obtain a picture of some parts of the domain. Frawley et al. [14] defined knowledge discovery as “the non-trivial extraction of implicit, unknown and potentially useful information from data”. In fact, we want to support domain experts in building a domain theory, producing explanations (generalizations) that can be easily understood and giving them the opportunity to systematically analyze the classes proposed. However, knowledge discovery problems cannot be directly solved by means of either lazy or eager learning methods since most of them need to know the class label of the domain objects in advance. The most commonly used techniques in knowledge discovery are clustering methods whose goal is to analyze a set of objects and to build clusters based on the similarity between objects. Lazy learning methods cannot be used for clustering because the cases are not labeled. So how can a lazy learning method be used for knowledge discovery? Our proposal is to randomly cluster the domain objects and then consider these clusters as the solution classes. Because domain objects now belong to some class, a lazy learning method can be used to obtain explanations that can be seen as domain rules of a lazy domain theory. We call this procedure LazyCL and we use it to help dermatologists to define and describe classes of melanomas in situ. In fact, this is the main novelty of LazyCL; whereas most approaches combining both CBR and clustering techniques exploit the clustering to organize the case memory in order to make an efficient retrieval of past cases, LazyCL uses CBR for clustering, i.e., explanations produced by a lazy learning method are used as descriptions of clusters.
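The proposal of labelling cases at random and then running a lazy learner can be sketched as follows. This is an illustration under stated assumptions, not the LazyCL implementation: the explainer is a simple nearest-neighbour stand-in for LID that returns the attribute-value pairs shared by a case and its most similar other case, and all function names are hypothetical.

```python
import random
from collections import defaultdict


def random_clusters(n_cases, k, seed=0):
    """Assign each of `n_cases` cases to one of `k` clusters uniformly at random."""
    rng = random.Random(seed)
    return [rng.randrange(k) for _ in range(n_cases)]


def overlap(a, b):
    """Number of attribute-value pairs of `a` that also hold in `b`."""
    return sum(1 for attr, value in a.items() if b.get(attr) == value)


def collect_explanations(cases, labels):
    """Leave-one-out pass: explain each case using all the other cases.

    Stand-in for LID (an assumption): the explanation is the set of
    attribute-value pairs shared with the most similar other case, and it is
    filed under that case's cluster label.
    """
    explanations = defaultdict(list)  # cluster label -> list of descriptions
    for i, problem in enumerate(cases):
        others = [(c, l) for j, (c, l) in enumerate(zip(cases, labels)) if j != i]
        if not others:
            continue
        nearest, nearest_label = max(others, key=lambda cl: overlap(problem, cl[0]))
        shared = {a: v for a, v in problem.items() if nearest.get(a) == v}
        explanations[nearest_label].append(shared)
    return dict(explanations)


# Two natural groups of cases, deliberately given labels that cut across them.
cases = [{"a": 1, "b": 1}, {"a": 1, "b": 1}, {"a": 2, "b": 2}, {"a": 2, "b": 2}]
labels = [0, 1, 0, 1]  # in practice: random_clusters(len(cases), k=2)
for cluster, descriptions in sorted(collect_explanations(cases, labels).items()):
    print(cluster, descriptions)
```

In this toy run the same two descriptions appear under both labels, which shows why the explanations, rather than the initial random labels, carry the information about how the cases actually group.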

The paper is structured as follows. In the next section we explain the general procedure of LazyCL. Section 3 describes experiments using LazyCL to discover a classification of melanomas in situ. Section 4 describes experiments using LazyCL on some standard data sets from the machine learning repository of the University of California, Irvine (UCI repository). Section 5 compares some methods used for knowledge discovery with our approach. Section 6 is devoted to conclusions and future work.

Section snippets

LazyCL: a procedure using explanations for knowledge discovery

Let us suppose the following scenario. Domain experts have available a set of object descriptions (cases) and they hypothesize about the existence of several classes of such objects. These classes would be reasonable from the experts’ point of view, and so it is necessary to give an explanation of the clustering. This expected explanation would have a form similar to the symbolic descriptions given by eager learning methods. However, in this scenario, the use of a supervised learning method is

Experimenting with LazyCL for classifying melanomas in situ

Dermatologists provided us with a database with descriptions of 76 melanomas in situ from the consensus of six experts (four dermatologists and two dermatopathologists). The descriptions comprise three kinds of attributes: clinical, dermoscopic and histological. Clinical attributes are

Experimenting with LazyCL on standard data sets

To analyze the feasibility of LazyCL, we used several data sets from the UCI repository [15] (Table 2). Most of them have attributes with numeric values; therefore, we discretized them to obtain nominal values. The goal of the experiments was twofold. Firstly, we wanted to analyze whether or not the clustering is consistent with the correct classification of cases. Secondly, since the purpose of LazyCL is knowledge discovery, we also wanted to analyze the descriptions of the clusters.
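The two preparatory steps mentioned here can be illustrated with a short sketch. The text does not state which discretization method or consistency measure was used, so both choices below are assumptions: equal-width binning for turning numeric values into nominal ones, and cluster purity (the share of cases that belong to the majority class of their cluster) as one possible way to compare a clustering with known classes. Function names are hypothetical.

```python
from collections import Counter, defaultdict


def equal_width_bins(values, n_bins=3):
    """Map numeric values to nominal labels 'bin0'..'bin{n_bins-1}' of equal width."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return ["bin0" for _ in values]  # constant attribute: a single bin
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so clamp it to the last bin.
    return [f"bin{min(int((v - lo) / width), n_bins - 1)}" for v in values]


def cluster_purity(cluster_ids, true_classes):
    """Fraction of cases whose class is the majority class of their cluster."""
    members = defaultdict(list)
    for cluster, true_class in zip(cluster_ids, true_classes):
        members[cluster].append(true_class)
    majority_total = sum(
        Counter(classes).most_common(1)[0][1] for classes in members.values()
    )
    return majority_total / len(cluster_ids)


print(equal_width_bins([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]))
# -> ['bin0', 'bin0', 'bin1', 'bin1', 'bin2', 'bin2', 'bin2']
print(cluster_purity([0, 0, 1, 1, 1], ["x", "x", "y", "y", "x"]))  # -> 0.8
```

A purity of 1.0 would mean that every cluster contains cases of a single known class; other discretization schemes or agreement measures could be substituted without changing the overall idea.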

For

Related work

The techniques most commonly used for knowledge discovery are clustering methods. In [20] the reader can find a survey of these methods and a classification of them. Unlike LazyCL, which deals with symbolic data, most clustering methods work better with numerical data since they have their roots in statistics. These numerical methods group objects taking into account both the similarity between the objects included in a cluster and the dissimilarity between objects in different clusters. A

Conclusions

In this paper we present LazyCL, a procedure for knowledge discovery based on the explanations produced by a lazy learning method. In LazyCL, a lazy learning method called LID is used to form symbolic descriptions of clusters. Our approach is based on the hypothesis that, because explanations are generalizations, they can be used as domain theory. Although this domain theory is lazy – hence it does not cover all the space of known examples – it can play the same role as domain theory obtained

Acknowledgments

This work has been supported by the MCYT-FEDER project NEXT-CBR (MICIN TIN2009-13692-C03-01) and by the Generalitat de Catalunya under grant 2009-SGR-1434. The author thanks Dr. Susana Puig of the Hospital Clinic i Provincial de Barcelona for her support in interpreting the results from a medical point of view. The author also thanks Dr. Àngel García-Cerdaña for his helpful comments to improve this work.

References (33)

  • R. Hofmann-Wellenhof et al. Dermoscopic classification of Clark's nevi (atypical melanocytic nevi). Clinics in Dermatology (2002)
  • E.R. Bareiss et al. Protos: an examplar-based learning apprentice. International Journal of Man-Machine Studies (1988)
  • P. Perner. Case-base maintenance by conceptual clustering of graphs. Engineering Applications of Artificial Intelligence (2006)
  • R.J. Friedman et al. Early detection of malignant melanoma: the role of physician examination and self-examination of the skin. CA-A Cancer Journal for Clinicians (1985)
  • S. Puig et al. Melanomas that failed dermoscopic detection: a combined clinicodermoscopic approach for not missing melanoma. Journal of Dermatologic Surgery (2007)
  • G. Argenziano et al. Proposal of a new classification system for melanocytic naevi. British Journal of Dermatology (2007)
  • D.B. Leake et al. Artificial Intelligence Review. Special Issue on Explanation in Case-Based Reasoning (2005)
  • E. Armengol et al. Lazy induction of descriptions for relational case-based learning
  • E. Armengol. Usages of generalization in CBR
  • R. Bergmann et al. Similarity measures for object-oriented case representations
  • J.H. Friedman et al. Lazy decision trees
  • T.M. Mitchell et al. Explanation-based learning: a unifying view. Machine Learning (1986)
  • J.R. Quinlan. Induction of decision trees. Machine Learning (1986)
  • E. Armengol. Building partial domain theories from explanations. Künstliche Intelligenz (2008)
  • W.J. Frawley et al. Knowledge discovery in databases: an overview. AI Magazine (1992)
  • Blake CL, Merz CJ. UCI repository of machine learning databases;...