Elsevier

Data & Knowledge Engineering

Volume 106, November 2016, Pages 1-17

Reliability analysis of psoriasis decision support system in principal component analysis framework

https://doi.org/10.1016/j.datak.2016.09.001

Highlights

  • Psoriasis decision support system design to model the automated classification

  • Novel feature space extraction for psoriasis

  • PCA-based feature selection for psoriasis decision support design

  • Feature retaining power at different PCA cutoffs for optimal feature selection

  • Design of novel reliability and stability of psoriasis decision support system

Abstract

Reliability and accuracy are essential components of any decision support system. They become even more important as the number of features used for classification grows in a machine learning paradigm. Further, selecting an optimal feature set is of paramount importance for building decision support systems that perform well and are reliable and stable.

This paper presents a dermatology decision support system for classifying psoriasis images into diseased and healthy skin. A comprehensive grayscale and color feature space with 87 features is explored. The classification system consists of a machine learning paradigm embedded with principal component analysis-based optimal feature selection. The system has both an offline training phase and an online testing phase. The training parameters are estimated using the feature space and the ground truth, derived a priori by the dermatologist. The training phase generates the offline coefficients, which are then used to transform the online test features for prediction of two skin classes: diseased vs. healthy.

The proposed system using principal component analysis shows the best classification accuracy of 99.39% for 10-fold cross-validation using a polynomial kernel of order 2 on a set of 540 images. We validate our system by computing the reliability and stability indices. The results demonstrate a mean reliability index of 98.71% for 11 distinct data sizes and meet the stability criteria within a 2% tolerance. The ability to retain the dominant features as an increasing set of features is included is 90.52%. Thus, the proposed system shows encouraging results, with high accuracy, reliability, stability, and retaining power of dominant features.

Introduction

A decision support system has several components, such as feature extraction, selection of optimal features, and classification. Such a system is adoptable in clinical practice if it is reliable, stable, and able to produce accurate results. This paper focuses on the design of a decision support system for a dermatology application: the imaging of psoriasis, a skin disease. The variation in psoriasis is large and not easy to understand. Psoriasis is a very common skin disease that affects quality of life [1], [2]. According to recent reports, 2–3% of the world population is affected by psoriasis [3]. At present, no permanent cure has been reported for psoriasis, but it can be controlled by long and careful treatment. Dermatologists currently deliver effective treatment by relying on their extensive experience and on manually recorded observations. Thus, there is a clear need for an automated, computer-based decision support system. In this paper, we develop an innovative psoriasis decision support system (pDSS), which uses an extensive feature space and selects the dominant features using polling-based principal component analysis (PCA); the selected features are then fed to a support vector machine (SVM) classifier.

Little literature is available on computer-aided diagnosis for psoriasis compared with melanoma, another skin disease. Melanoma and psoriasis differ in both symptoms and diagnosis. The symptoms of melanoma are unusual sores, lumps, blemishes, markings, or changes in the way an area of the skin looks or feels, whereas the symptoms of psoriasis are skin plaques, silver scales, nail pitting, and arthritis. Another difference is that melanoma can be cured in the early stages of the disease, while psoriasis is a lifelong condition. Recently, Guo et al. [4] presented a psoriasis classification model based on an incremental feature selection algorithm. Similarly, a new methodology was proposed for the classification and analysis of the tissue composition of skin lesions [5]. For melanoma, researchers presented CADx software [6] based on normal photographic digital images and confirmed its feasibility for discriminating between melanocytic and non-melanocytic skin lesions. Suri's team recently published an article [7] presenting a CADx system that stratifies healthy and psoriatic skin using a total of 46 features. In that study, the optimized features were selected on the basis of an average feature value rather than a specialized dominant feature selection criterion. A brief survey of CADx systems for skin disease classification is given in the discussion section. A general review of psoriasis and its risk stratification using CADx systems has recently been published by Suri's team [8].

The aforementioned methods have several striking primary weaknesses: (i) the features lack diversity and amalgamation; (ii) there is no well-documented link to dimensionality reduction schemes that can filter out the most dominant features from a large and diverse set of grayscale and color features; (iii) there is no mathematical foundation for computing the reliability and stability of such dynamic pDSSs in a machine learning framework.

In light of the above challenges, this study addresses these weaknesses. The following primary set of four innovations is highlighted here:

  • 1. Novel feature space extraction for psoriasis: We bring a diverse feature space that takes into account the pixel distribution due to the linearity/non-linearity of the grayscale and color information. Thus, we exploit the following feature set: Cluster Prominence, Run Length Emphasis, Skewness, Kurtosis, Invariant Moment Features (in grayscale level difference statistics), and Busyness and Texture Strength (in Neighborhood Gray Tone Difference Matrix frameworks). Our exhaustive feature space consists of 87 features belonging to four dermatologist-mimicking categories: texture, color, redness, and randomness.

  • 2. PCA-based feature selection for psoriasis decision support design: The second aspect deals with the selection criteria for the dominant features (without modifying feature values) using a PCA-based paradigm, followed by a machine learning infrastructure that automatically stratifies the images into diseased and healthy skin.

  • 3. Design of novel reliability and stability of psoriasis decision support system: A novel paradigm for assessing reliability and stability is presented for the psoriasis decision support system (pDSS).

  • 4. Psoriasis decision support system design to model the automated classification: Our pDSS provides fully automated classification of psoriasis images with an accurate and highly reliable system design.
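The idea in item 2, selecting dominant features with PCA while leaving the feature values themselves unmodified, can be sketched as follows. This is a minimal illustration and not the authors' polling-based procedure, whose details are not given in this excerpt. It assumes one plausible scheme: keep principal components until a variance cutoff is reached, then score each original feature by its eigenvalue-weighted squared loadings. The function names, the cutoff value, and the synthetic data are all hypothetical.

```python
import random

def correlation_matrix(rows):
    """Correlation matrix of the feature columns (covariance of z-scored features)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    stds = [(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1)) ** 0.5 or 1.0
            for j in range(d)]
    z = [[(r[j] - means[j]) / stds[j] for j in range(d)] for r in rows]
    return [[sum(zr[a] * zr[b] for zr in z) / (n - 1) for b in range(d)]
            for a in range(d)]

def leading_eigenpair(m, iters=300):
    """Largest eigenvalue and unit eigenvector of a symmetric PSD matrix (power iteration)."""
    d = len(m)
    v = [1.0 / (i + 1) for i in range(d)]  # deterministic starting vector
    for _ in range(iters):
        w = [sum(m[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:
            return 0.0, v
        v = [x / norm for x in w]
    lam = sum(v[a] * sum(m[a][b] * v[b] for b in range(d)) for a in range(d))
    return lam, v

def rank_features_by_pca(rows, cutoff=0.95):
    """Rank ORIGINAL feature indices; assumed scoring: sum of eigenvalue * loading^2."""
    m = correlation_matrix(rows)
    d = len(m)
    total = sum(m[j][j] for j in range(d))
    scores, explained = [0.0] * d, 0.0
    while total > 0.0 and explained / total < cutoff:
        lam, v = leading_eigenpair(m)
        if lam <= 1e-12:
            break
        explained += lam
        for j in range(d):
            scores[j] += lam * v[j] ** 2
        # Deflate: remove the component just found so the next one can be extracted.
        m = [[m[a][b] - lam * v[a] * v[b] for b in range(d)] for a in range(d)]
    ranking = sorted(range(d), key=lambda j: scores[j], reverse=True)
    return ranking, scores

# Synthetic data (hypothetical): features 0-2 share one signal, features 3-4 are noise.
rng = random.Random(0)
rows = []
for _ in range(200):
    s = rng.gauss(0.0, 1.0)
    rows.append([s + rng.gauss(0.0, 0.1), s + rng.gauss(0.0, 0.1),
                 -s + rng.gauss(0.0, 0.1), rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)])

ranking, scores = rank_features_by_pca(rows, cutoff=0.5)
selected = sorted(ranking[:3])  # indices of the retained original features
print(selected)
```

Because the output is a list of feature indices rather than projected coordinates, the classifier downstream still sees the original, interpretable feature values.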

The secondary set of innovations is related to our experimental protocol design:

  • 1. Optimization of kernel function for psoriasis decision support system: In the first experiment, three SVM kernels are considered, i.e., a linear kernel and polynomial kernels of order 2 and order 3, and the best is selected on the basis of classification accuracy. The least squares method is used to find the separating hyperplane. To generalize the SVM classification process, we evaluate classification performance with a 10-fold cross-validation (CV) protocol. To choose the best kernel function, the pDSS is run on a fixed data size of N = 540 with varying kernel functions. This experiment shows the highest classification accuracy, 99.39%, for the 10-fold CV protocol with the polynomial kernel of order 2.

  • 2. How to use PCA-based analysis in machine learning paradigm for psoriasis decision support system: The second experiment helps in understanding how a machine learning paradigm generalizes in a PCA-based framework. This protocol computes accuracy as the data size increases, while keeping the number of partitions fixed.

  • 3. How to estimate the stability and reliability of the psoriasis decision support system: The stability of the system is determined by analogy with the dynamics of control theory, and we validate our system by demonstrating a mean reliability index of 98.71% over all data sizes and all PCA-based cutoff values for dominant feature selection.
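The model-selection protocol in the first experiment (repeat K-fold cross-validation over random partitions and keep the candidate with the highest mean accuracy) can be sketched as below. To stay self-contained, the sketch makes several assumptions that differ from the paper: a nearest-centroid classifier stands in for the least-squares SVM, explicit linear and order-2 feature maps stand in for the kernels, and the two-class data are synthetic. All names are hypothetical.

```python
import math
import random
from statistics import mean

def kfold_indices(n, k, rng):
    """Shuffle 0..n-1 and deal the indices into k disjoint folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]

def fit_centroids(X, y):
    """Stand-in classifier (NOT an SVM): one mean vector per class."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for j, value in enumerate(x):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def predict_centroid(model, x):
    """Predict the class whose centroid is closest in squared Euclidean distance."""
    return min(model, key=lambda c: sum((a - b) ** 2 for a, b in zip(x, model[c])))

def repeated_cv(X, y, k=10, trials=20, seed=0):
    """Return one accuracy per trial; each trial draws a fresh random k-fold partition."""
    rng = random.Random(seed)
    accuracies = []
    for _ in range(trials):
        correct = 0
        for fold in kfold_indices(len(X), k, rng):
            held_out = set(fold)
            train = [i for i in range(len(X)) if i not in held_out]
            model = fit_centroids([X[i] for i in train], [y[i] for i in train])
            correct += sum(predict_centroid(model, X[i]) == y[i] for i in fold)
        accuracies.append(correct / len(X))
    return accuracies

# Candidate feature maps (assumed stand-ins for a linear and an order-2 kernel).
feature_maps = {
    "linear": lambda p: [p[0], p[1]],
    "order-2": lambda p: [p[0], p[1], p[0] ** 2, p[0] * p[1], p[1] ** 2],
}

# Synthetic two-class data: class 0 is a disc, class 1 is a surrounding ring.
rng = random.Random(1)
points, labels = [], []
for i in range(120):
    label = i % 2
    r = rng.uniform(0.0, 1.0) if label == 0 else rng.uniform(3.0, 4.0)
    t = rng.uniform(0.0, 2.0 * math.pi)
    points.append((r * math.cos(t), r * math.sin(t)))
    labels.append(label)

results = {name: mean(repeated_cv([fmap(p) for p in points], labels, trials=5))
           for name, fmap in feature_maps.items()}
best = max(results, key=results.get)
print(best, results)
```

Averaging over several trials matters because a single random partition can favor one candidate by chance; the per-trial accuracies returned by `repeated_cv` are also the raw material for a spread-based reliability measure.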

Section snippets

Data acquisition and preparation

The database of psoriasis images was collected from the Psoriasis Clinic and Research Centre, Psoriatreat, Pune, Maharashtra, India. The data were obtained by digitally photographing patients of Indian ethnic origin under the supervision of a dermatologist. Ethics approval was obtained for the dataset, and the patients' data were anonymized. The images were taken using a Sony NEX-5 camera with a 22 mm lens at 350 dpi. The images were processed in Joint Photographic Experts Group (JPEG) format with a

Methodology: pDSS design and feature extraction

The proposed pDSS, shown in Fig. 3, adopts the machine learning paradigm. It contains two components, separated by the dotted line. The left side represents the offline system, while the right side represents the online system. The online system is used to predict the class of test patient images. Both components include a feature extraction process that computes four categories of features: texture, color, redness, and chaotic features, for a total of 87 features.

Experimental protocols

We performed two experimental protocols. In the first, we chose the best kernel function for a fixed data size (N) on the basis of classification accuracy; all system performance parameters are also calculated in this first experiment. The second experimental protocol examines how the pDSS learns as the data size (N) changes. Since the partitions K are random, we repeated the protocol for T = 20 trials in 10-fold CV. Thus, each time the experiment is executed, K, N, and T

Results

This section presents the results for the two experimental protocols discussed in Section 4.1 (kernel optimization) and Section 4.2 (varying data size). In the first experiment, the best kernel function for the SVM classifier was chosen by the criterion of highest accuracy. The second experiment was designed to understand the memorization vs. generalization behavior from the variation in classification accuracy with respect to the change in data size (N). The results

Validation of pDSS: reliability and stability assessment

Fig. 8 shows the protocol for validation of the pDSS. Given the classification results on test images and their corresponding a priori class labels as ground truth, one can compute the classification accuracy. These measures are then used to compute the system's reliability and stability. The algorithm for computing the reliability of the system is discussed in Subsection 6.1, while the stability assessment is presented in Subsection 6.2.
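The flow from per-trial accuracies to reliability and stability measures can be sketched as follows. The exact formulas of Subsections 6.1 and 6.2 are not reproduced in this excerpt, so the sketch assumes one plausible formulation: a reliability index of (1 − σ/μ) × 100 over the trial accuracies at each data size, and a stability check that the mean accuracy at every data size stays within a 2% tolerance band around the overall mean. The function names and the accuracy values are illustrative only and are not the paper's data.

```python
from statistics import mean, pstdev

def reliability_index(accuracies):
    """Assumed form: (1 - std/mean) * 100 over per-trial accuracies given in percent."""
    mu = mean(accuracies)
    return (1.0 - pstdev(accuracies) / mu) * 100.0

def within_tolerance(values, tolerance_pct=2.0):
    """True if every value deviates from the overall mean by at most tolerance_pct percent."""
    center = mean(values)
    return all(abs(v - center) / center * 100.0 <= tolerance_pct for v in values)

# Illustrative per-trial accuracies (%) keyed by data size N (made-up numbers).
trial_accuracies = {
    100: [96.0, 98.0, 97.0, 99.0],
    300: [98.0, 99.0, 98.5, 99.5],
    540: [99.0, 99.5, 99.2, 99.4],
}

per_size_reliability = {n: reliability_index(a) for n, a in trial_accuracies.items()}
mean_reliability = mean(per_size_reliability.values())
stable = within_tolerance([mean(a) for a in trial_accuracies.values()])
print(per_size_reliability, mean_reliability, stable)
```

Under this assumed definition, a system whose accuracy never varies across trials scores 100%, and larger trial-to-trial spread lowers the index in proportion to the mean accuracy.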

Discussion

The pDSS proposed in this work explores a comprehensive feature space covering four sets of features: texture, color, redness, and chaoticness. Features such as cluster prominence, short run emphasis, long run emphasis, low gray-level run emphasis, long run low gray-level emphasis, periodicity, roughness, busyness, skewness, kurtosis, texture strength, and the mean and standard deviation of hue are the dominant features selected by the polling-based PCA feature selection process. Among the

Conclusion

A dermatology decision support system for automated classification of psoriasis skin images into psoriatic lesion and healthy skin was presented. It consisted of the extraction of four sets of features, namely texture, color, redness, and chaoticness. Using the power of polling-based PCA, a one-time training protocol was conducted, and accuracy was determined on the test data set using a cross-validation approach. The study further demonstrated the reliability and stability of the dermatology decision

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

Vimal K. Shrivastava received the BE degree in Electronics and Telecommunication Engineering from the Chhattisgarh Swami Vivekanand Technical University, Bhilai, Chhattisgarh, in 2009 and the MTech degree in Electronics Instrumentation Engineering from the National Institute of Technology Warangal, Andhra Pradesh, in 2011. He has been working toward the PhD degree since 2012 in the Department of Electrical Engineering, National Institute of Technology, Raipur, India.

References (30)

  • G. Krueger et al., The impact of psoriasis on quality of life: results of a 1998 National Psoriasis Foundation patient-membership survey, Arch. Dermatol. (2001)

  • National Psoriasis Foundation, Statistics

  • S.M. Pereira et al., Classification of color images of dermatological ulcers, IEEE J. Biomed. Health Inf. (2013)

  • W.Y. Chang et al., Computer-aided diagnosis of skin lesions using conventional digital photography: a reliability and feasibility study, PLoS One (2013)

  • V.K. Shrivastava et al., First review on psoriasis severity risk stratification: an engineering perspective, Comput. Biol. Med. (2015)

    Dr. Narendra D Londhe received his BE degree from Amravati University in 2000. He later received his MTech and PhD degrees from the Indian Institute of Technology Roorkee in 2004 and 2011, respectively. He is presently an Assistant Professor in the Department of Electrical Engineering, National Institute of Technology, Raipur, India.

    Dr. Rajendra S. Sonawane graduated from the National Institute of Homeopathy, Kolkata, India. He later received his M.D. and D.I. in homeopathy from London. Over 27 years, he has personally treated more than 15,000 psoriasis patients from all over India and many other countries, using homeopathic medicines only. He is a former professor of the Homeopathic Medical Colleges at Malkapur, Amravati, Dhule, and Shirpur.

    Jasjit S. Suri, PhD, MBA, Fellow AIMBE, is an innovator, visionary, scientist, and an internationally known world leader. Dr. Suri received the Director General's Gold Medal in 1980 and was named a Fellow of the American Institute of Medical and Biological Engineering, awarded by the National Academy of Sciences, Washington DC, in 2004. He has written over 500 peer-reviewed articles and has over 100 innovations. He is currently Chairman of Global Biomedical Technologies, Inc., Roseville, CA, USA.
