Classification of melanomas in situ using knowledge discovery with explained case-based reasoning
Introduction
Early diagnosis and surgical excision are the main goals in the secondary prevention of cutaneous melanoma. Nowadays, the diagnosis of melanoma is based on the ABCD rule [1], which considers four clinical features commonly observed in this kind of tumour: asymmetry, border irregularity, color variegation, and a diameter larger than 5 mm. Although most melanomas are correctly diagnosed following this rule, a variable proportion of melanomas do not comply with these criteria. The current procedure when a suspicious skin lesion appears is to excise it and then analyse it by means of a biopsy. The result of the biopsy usually allows an accurate determination of the malignancy of the lesion.
Dermoscopy is a non-invasive technique introduced by dermatologists two decades ago that provides a more accurate evaluation of skin lesions and can therefore avoid the excision of benign lesions. However, dermatologists need to achieve a good dermoscopic classification of lesions prior to excision [2]. Hofmann-Wellenhof et al. [3] suggested a classification of benign melanocytic lesions. Argenziano et al. [4] hypothesized that dermoscopic classification may be better than the classical clinicopathological classification of benign melanocytic lesions (nevi). Currently, there is no dermoscopic classification for melanomas located on the trunk and extremities. In the era of genetic profiling, molecular studies including microarrays suggest that there is more than one type of melanoma at these sites. The aim of the present study is to help dermatologists in the classification of early melanomas (melanomas in situ) based on dermoscopic characteristics. Dermatologists have defined several dermoscopic classes of melanoma in situ based on dermoscopic features. Dermatopathologists also suggest another classification based on histological features. In particular, our approach consists of using case-based reasoning inside a new knowledge discovery procedure in order to provide dermatologists with a classification theory for melanomas.
Case-based reasoning (CBR) methods predict the classification of a problem based on its similarity to already solved cases. One of the key points of CBR systems is the measure used to assess the similarity between cases, since the final classification of a new problem depends on it. Related to this issue is the fact that results should be clearly understood by the system's users; otherwise, they may not be fully convinced of the results produced by the system. For this reason, in recent years there has been increasing interest in approaches aimed at explaining CBR results in a satisfactory way (see [5]). One of these approaches is the lazy induction of descriptions (LID) method we introduced in [6]. During the problem solving process, LID builds an explanation justifying the classification of a new problem (see Section 2.1). This explanation is, in fact, a generalization of the relevant attributes shared by both the problem and the cases. In [7] we argued that generalizations can be seen as explanations since they commonly contain problem features useful for classifying problems. This is the case for prototypes from PROTOS [8], generalized cases [9], and lazy decision trees [10]. Also, explanation-based learning (EBL) methods [11] generalize a particular example to obtain a domain rule that can be used for solving unseen problems. Our point is that explanations produced by lazy learning methods like LID should be considered as domain rules in the same way as generalizations are. Thus, the set of explanations could be considered as a lazy domain theory.
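The idea of classifying by similarity and explaining the result through shared attributes can be illustrated with a minimal sketch. This is not the LID method: it assumes a simple overlap similarity and a k-nearest-neighbour vote, and the attribute names, class names and function names are hypothetical.

```python
from collections import Counter

def similarity(a, b):
    """Fraction of common attributes on which two cases agree (overlap measure)."""
    keys = a.keys() & b.keys()
    if not keys:
        return 0.0
    return sum(a[k] == b[k] for k in keys) / len(keys)

def classify_with_explanation(problem, case_base, k=3):
    """Return (predicted class, explanation) for a new problem.

    case_base is a list of (attributes dict, class label) pairs.
    The explanation keeps the attribute values of the problem that are
    shared by all retrieved cases supporting the predicted class.
    """
    ranked = sorted(case_base, key=lambda c: similarity(problem, c[0]), reverse=True)
    neighbours = ranked[:k]
    label = Counter(lbl for _, lbl in neighbours).most_common(1)[0][0]
    supporters = [attrs for attrs, lbl in neighbours if lbl == label]
    explanation = {
        key: value
        for key, value in problem.items()
        if all(attrs.get(key) == value for attrs in supporters)
    }
    return label, explanation

# Hypothetical toy case base (attribute and class names are illustrative only).
case_base = [
    ({"pigment_network": "atypical", "asymmetry": "yes", "regression": "no"}, "class_1"),
    ({"pigment_network": "atypical", "asymmetry": "yes", "regression": "yes"}, "class_1"),
    ({"pigment_network": "typical", "asymmetry": "no", "regression": "no"}, "class_2"),
    ({"pigment_network": "typical", "asymmetry": "no", "regression": "yes"}, "class_2"),
]
problem = {"pigment_network": "atypical", "asymmetry": "yes", "regression": "no"}
label, explanation = classify_with_explanation(problem, case_base)
print(label, explanation)
```

In this toy example the explanation is a generalization of the problem: it drops the attribute on which the supporting cases disagree and keeps the rest, so it can be read as a rule for the predicted class.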
Commonly, domain theories are built using eager learning methods (such as ID3 [12]). Eager learning methods build discriminant descriptions for classes, and so the union of these discriminant descriptions covers the whole space of known examples. In contrast, lazy domain theories cover only zones around each new problem, which may result in “holes” in the description of the domain (see Fig. 1). In [13] we compared lazy domain theories formed by sets of explanations from LID with the eager theory built by the ID3 method [12]. In our experiments we showed that, for some domains, eager and lazy theories have similar predictive ability. The difference is that, because the explanations that make up the lazy domain theory are more specific than eager rules, there is a high percentage of unseen problems that the lazy theory cannot classify, although the classification, when one is proposed, is usually correct.
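The contrast between coverage and correctness can be made concrete with a small sketch. It assumes that a domain theory is a list of (description, class) rules and that a rule covers a case when every attribute value in its description matches; the two toy theories and all names are hypothetical and do not come from the experiments in [13].

```python
def covers(description, case):
    """A rule covers a case if all its attribute values match the case."""
    return all(case.get(attr) == value for attr, value in description.items())

def apply_theory(theory, case):
    """Return the class of the first covering rule, or None (a 'hole')."""
    for description, label in theory:
        if covers(description, case):
            return label
    return None

def evaluate(theory, labelled_cases):
    """Return (coverage, accuracy restricted to the covered cases)."""
    if not labelled_cases:
        return 0.0, 0.0
    predictions = [(apply_theory(theory, case), label) for case, label in labelled_cases]
    covered = [(p, y) for p, y in predictions if p is not None]
    coverage = len(covered) / len(predictions)
    accuracy = sum(p == y for p, y in covered) / len(covered) if covered else 0.0
    return coverage, accuracy

# Hypothetical theories: the eager rules are general, the lazy ones more specific.
eager_theory = [({"shape": "round"}, "c1"), ({"shape": "irregular"}, "c2")]
lazy_theory = [
    ({"shape": "round", "color": "brown"}, "c1"),
    ({"shape": "irregular", "color": "black"}, "c2"),
]
unseen = [
    ({"shape": "round", "color": "brown"}, "c1"),
    ({"shape": "round", "color": "black"}, "c1"),
    ({"shape": "irregular", "color": "black"}, "c2"),
    ({"shape": "irregular", "color": "brown"}, "c2"),
]
print(evaluate(eager_theory, unseen))
print(evaluate(lazy_theory, unseen))
```

On this toy data the more specific theory leaves half of the unseen cases unclassified while being correct on every case it does cover, which is the behaviour described above.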
In the current paper we exploit the concept of lazy domain theory for knowledge discovery. Although lazy domain theories are formed by local rules, this information is very valuable to experts for obtaining a picture of some parts of the domain. Frawley et al. [14] defined knowledge discovery as “the non-trivial extraction of implicit, unknown and potentially useful information from data”. In fact, we want to support domain experts in building a domain theory by producing explanations (generalizations) that can be easily understood and by giving them the opportunity to systematically analyze the classes proposed. However, knowledge discovery problems cannot be directly solved by means of either lazy or eager learning methods since most of them need to know the class label of the domain objects in advance. The most commonly used techniques in knowledge discovery are clustering methods, whose goal is to analyze a set of objects and to build clusters based on the similarity between objects. Lazy learning methods cannot be used for clustering because the cases are not labeled. So how can a lazy learning method be used for knowledge discovery? Our proposal is to randomly cluster the domain objects and then consider these clusters as the solution classes. Because domain objects now belong to some class, a lazy learning method can be used to obtain explanations that can be seen as domain rules of a lazy domain theory. We call this procedure LazyCL and we use it to help dermatologists define and describe classes of melanomas in situ. In fact, this is the main novelty of LazyCL: whereas most approaches combining CBR and clustering techniques exploit the clustering to organize the case memory in order to retrieve past cases efficiently, LazyCL uses CBR for clustering, i.e., explanations produced by a lazy learning method are used as descriptions of clusters.
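The two ideas stated above (random assignment of objects to clusters, then treating cluster identifiers as class labels so that a lazy learner can produce explanations) can be sketched as follows. This is only an illustration under strong assumptions: the explainer is a simple stand-in that keeps the attributes a case shares with its most similar other case, it is not LID, and the sketch does not reproduce the full LazyCL procedure. All function names are hypothetical.

```python
import random

def random_clustering(cases, n_clusters, seed=0):
    """Assign each case to a random cluster id in range(n_clusters)."""
    rng = random.Random(seed)
    return [rng.randrange(n_clusters) for _ in cases]

def shared_attributes(a, b):
    """Attribute values of case a that case b also has."""
    return {k: v for k, v in a.items() if b.get(k) == v}

def lazy_explanations(cases, labels):
    """Leave-one-out stand-in explainer (requires at least two cases).

    Each case is explained by the attributes it shares with its most
    similar other case; the explanation is stored under that case's
    cluster id, so each cluster accumulates a list of local descriptions.
    """
    theory = {}
    for i, case in enumerate(cases):
        others = [j for j in range(len(cases)) if j != i]
        best = max(others, key=lambda j: len(shared_attributes(case, cases[j])))
        explanation = shared_attributes(case, cases[best])
        theory.setdefault(labels[best], []).append(explanation)
    return theory

# Hypothetical symbolic cases; cluster ids play the role of class labels.
cases = [
    {"x": "p", "y": "q"},
    {"x": "p", "y": "q"},
    {"x": "r", "y": "s"},
    {"x": "r", "y": "s"},
]
labels = random_clustering(cases, n_clusters=2)
print(lazy_explanations(cases, labels))
```

The resulting dictionary maps each cluster identifier to the explanations collected for it; in the spirit of the text, these explanations are what an expert would inspect as candidate descriptions of the clusters.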
The paper is structured as follows. In the next section we explain the general procedure of LazyCL. Section 3 describes experiments using LazyCL to discover a classification of melanomas in situ. Section 4 describes experiments using LazyCL on some standard data sets from the machine learning repository of the University of California, Irvine (UCI repository). Section 5 compares some methods used for knowledge discovery with our approach. Section 6 is devoted to conclusions and future work.
LazyCL: a procedure using explanations for knowledge discovery
Let us suppose the following scenario. Domain experts have available a set of object descriptions (cases) and they hypothesize about the existence of several classes of such objects. These classes would be reasonable from the experts’ point of view, and so it is necessary to give an explanation of the clustering. This expected explanation would have a form similar to the symbolic descriptions given by eager learning methods. However, in this scenario, the use of a supervised learning method is
Experimenting with LazyCL for classifying melanomas in situ
Dermatologists provided us with a database with descriptions of 76 melanomas in situ from the consensus of six experts (four dermatologists and two dermatopathologists). The descriptions comprise three kinds of attributes: clinical, dermoscopic and histological. Clinical attributes are
Experimenting with LazyCL on standard data sets
To analyze the feasibility of LazyCL, we used several data sets from the UCI repository [15] (Table 2). Most of them have attributes with numeric values, so we discretized them to obtain nominal values. The goal of the experiments was twofold. Firstly, we wanted to analyze whether or not the clustering is consistent with the correct classification of cases. Secondly, since the purpose of LazyCL is knowledge discovery, we also wanted to analyze the descriptions of the clusters.
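The two preprocessing and evaluation steps mentioned above can be sketched as follows. The text does not state which discretization method or consistency measure was used, so equal-width binning and cluster purity are assumptions chosen only for illustration; function names and bin labels are hypothetical.

```python
from collections import Counter

def equal_width_bins(values, n_bins=3):
    """Map numeric values to nominal labels 'bin0'..'bin{n_bins-1}'."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        # All values are identical: a single nominal value is enough.
        return ["bin0" for _ in values]
    labels = []
    for v in values:
        # The maximum value falls on the upper edge, so clamp it into the last bin.
        index = min(int((v - lo) / width), n_bins - 1)
        labels.append(f"bin{index}")
    return labels

def purity(cluster_ids, class_labels):
    """Fraction of cases that belong to the majority class of their cluster."""
    by_cluster = {}
    for cluster, label in zip(cluster_ids, class_labels):
        by_cluster.setdefault(cluster, []).append(label)
    majority = sum(Counter(ys).most_common(1)[0][1] for ys in by_cluster.values())
    return majority / len(class_labels)

print(equal_width_bins([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0], n_bins=3))
print(purity([0, 0, 1, 1], ["a", "a", "b", "a"]))
```

A purity close to 1 would indicate that the clusters agree with the known classes, which is one possible reading of "consistent with the correct classification of cases".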
For
Related work
The techniques most commonly used for knowledge discovery are clustering methods. In [20] the reader can find a survey of these methods and a classification of them. Unlike LazyCL, which deals with symbolic data, most clustering methods work better with numerical data since they have their roots in statistics. These numerical methods group objects taking into account both the similarity between the objects included in a cluster and the dissimilarity between objects in different clusters. A
Conclusions
In this paper we present LazyCL, a procedure for knowledge discovery based on the explanations produced by a lazy learning method. In LazyCL, a lazy learning method called LID is used to form symbolic descriptions of clusters. Our approach is based on the hypothesis that, because explanations are generalizations, they can be used as a domain theory. Although this domain theory is lazy, and hence does not cover the whole space of known examples, it can play the same role as domain theory obtained
Acknowledgments
This work has been supported by the MCYT-FEDER project NEXT-CBR (MICIN TIN2009-13692-C03-01) and by the Generalitat de Catalunya under grant 2009-SGR-1434. The author thanks Dr. Susana Puig of the Hospital Clinic i Provincial de Barcelona, who supported the interpretation of the results from the medical point of view. The author also thanks Dr. Àngel García-Cerdaña for his helpful comments to improve this work.
References (33)
- et al. Dermoscopic classification of Clark's nevi (atypical melanocytic nevi). Clinics in Dermatology (2002)
- et al. Protos: an exemplar-based learning apprentice. International Journal of Man-Machine Studies (1988)
- Case-base maintenance by conceptual clustering of graphs. Engineering Applications of Artificial Intelligence (2006)
- et al. Early detection of malignant melanoma: the role of physician examination and self-examination of the skin. CA-A Cancer Journal for Clinicians (1985)
- et al. Melanomas that failed dermoscopic detection: a combined clinicodermoscopic approach for not missing melanoma. Journal of Dermatologic Surgery (2007)
- et al. Proposal of a new classification system for melanocytic naevi. British Journal of Dermatology (2007)
- et al. Artificial Intelligence Review. Special Issue on Explanation in Case-Based Reasoning (2005)
- et al. Lazy induction of descriptions for relational case-based learning
- Usages of generalization in CBR
- et al. Similarity measures for object-oriented case representations