Local discriminant preservation projection embedded ensemble learning based dimensionality reduction of speech data of Parkinson’s disease

https://doi.org/10.1016/j.bspc.2020.102165

Highlights

  • An improved locality preserving projections algorithm is proposed for Parkinson's disease (PD) speech diagnosis for the first time.

  • A new objective function is designed to reduce the intra-class variance and increase the inter-class variance of PD speech data.

  • Ensemble learning is introduced into the construction of the projection matrix to increase the stability of the algorithm.

  • The proposed algorithm performs well both on PD diagnosis and on the automatic assessment of rehabilitative speech treatment in PD.

Abstract

Speech has been widely used in the diagnosis of Parkinson's disease (PD). However, collected PD speech data are characterized by high redundancy, high aliasing and small sample size, which poses great challenges for PD speech recognition. Dimensionality reduction (DR) can effectively alleviate these problems. However, existing DR methods for PD speech ignore its high noise and high aliasing. To address this, a weighted local discriminant preservation projection embedded ensemble algorithm is proposed to detect PD. The proposed algorithm preferentially reduces the intra-class variance of PD speech samples while simultaneously increasing the inter-class variance and maintaining the neighborhood structure of the samples. In addition, ensemble learning is introduced to increase the stability of the model. Two widely used PD speech datasets for diagnosis and a self-collected speech dataset of treated Parkinson's patients were used to verify the effectiveness of the proposed algorithm. Compared with existing PD speech DR methods, the proposed algorithm always achieves the highest Accuracy, Precision, Recall and G-mean on the PD speech datasets. This shows that the proposed algorithm not only performs well in classifying PD speech data, but also handles imbalanced PD samples well. Even compared with state-of-the-art DR methods, the proposed method improves by at least 4.34%. In addition, the proposed algorithm achieved not only the highest detection accuracy but also the highest AUC in most cases.
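For readers unfamiliar with the reported metrics, the following is a minimal sketch of how Accuracy, Precision, Recall and G-mean can be computed from binary confusion-matrix counts. It assumes the common definition of G-mean as the geometric mean of sensitivity and specificity; the paper's exact definition is not given in this excerpt, and the function name `binary_metrics` is illustrative.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute common binary-classification metrics from confusion-matrix counts.

    Assumption: G-mean = sqrt(sensitivity * specificity), a definition often
    used for imbalanced data; the source does not state its own formula.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0        # sensitivity
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    g_mean = math.sqrt(recall * specificity)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "g_mean": g_mean,
    }
```

Because G-mean multiplies the per-class recall rates, a classifier that ignores the minority class scores zero, which is why it is often reported alongside accuracy for imbalanced samples.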

Introduction

Parkinson’s disease (PD) is a neurodegenerative disease [1], and the number of patients has increased year by year. PD causes great inconvenience to patients and brings a heavy economic burden to patients' families and society. There is no permanent cure for Parkinson's disease [2]; therefore, early diagnosis of PD is particularly important. Vocal performance degradation, a common symptom in PD subjects, has been widely used in the diagnosis of PD [3–25]. However, existing Parkinson speech datasets are characterized by high redundancy, high aliasing, high noise and small sample size. Improving the classification accuracy of PD speech therefore remains an open scientific problem. Dimensionality reduction (DR) can effectively process PD speech data and improve both the diagnostic accuracy of PD classification and the generalization ability of the classification model.

At present, DR methods for PD speech datasets can be divided into feature selection and feature extraction, according to whether or not the generated feature space is a subset of the original feature space. The earliest DR methods for PD speech datasets mainly focused on feature selection; because the generated feature space is a subset of the original feature space, it is more interpretable. The feature selection algorithms commonly used for Parkinson's diagnosis include Relief (or ReliefF) [4,8,9], genetic algorithm (GA) [11], sequential backward selection (SBS) [12], minimum redundancy maximum relevance (mRMR) [4–7], particle swarm optimization (PSO) [10], sequential forward selection (SFS) [13] and least absolute shrinkage and selection operator (LASSO) [3,4]. However, these methods lose some of the original data information, because some features must be omitted. Feature extraction can completely solve this problem [14].
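The distinction between the two families can be illustrated with a toy example. The sketch below uses random data, an arbitrary choice of columns and a random projection matrix, all of which are illustrative assumptions rather than anything taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))        # toy data: 6 samples, 4 original features

# Feature selection: the new feature space is a subset of the original columns.
selected = [0, 2]                  # illustrative column indices
X_sel = X[:, selected]

# Feature extraction: each new feature is a combination of all original features.
W = rng.normal(size=(4, 2))        # hypothetical projection matrix
X_ext = X @ W
```

Both results have two columns, but only `X_sel` keeps columns that can be read directly as original measurements, which is the interpretability advantage noted above.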

Feature extraction, which maps high-dimensional data to a specific low-dimensional space, preserves Parkinson sample information as much as possible [14]. Another advantage of feature extraction over feature selection is that it is more suitable for processing noisy datasets. To the best of our knowledge, most medical-related datasets contain noisy data rather than irrelevant or redundant data [14]. The earliest and most representative feature extraction methods for the diagnosis of PD speech datasets are principal component analysis (PCA) [15,16] and linear discriminant analysis (LDA) [13,17,18]. Although PCA and LDA have produced good results on Parkinson's data, this does not mean that they are well suited to it. Both PCA and LDA are linear feature extraction methods, which conflicts with the nonlinear characteristics of most complex data [19,20]. Linear feature extraction methods therefore cannot fully capture the characteristics of the data.
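As a concrete instance of linear feature extraction, the following is a minimal PCA sketch based on the singular value decomposition of the centered data. It is a generic textbook formulation, not the implementation used in the cited works, and the function name `pca_project` is illustrative.

```python
import numpy as np


def pca_project(X, d):
    """Project the rows of X (n samples x m features) onto the top-d principal axes.

    Returns the projected data (n x d) and the projection matrix W (m x d).
    """
    Xc = X - X.mean(axis=0)                      # center each feature
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    W = Vt[:d].T
    return Xc @ W, W
```

The mapping is a single matrix product, which is what makes the method linear: every extracted feature is a fixed weighted sum of the original features.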

Nonlinear feature extraction can achieve nonlinear mapping of data and has been applied to the diagnosis of PD [17,21–25]. Yang et al. used SFS and kernel PCA (KPCA) to identify Parkinson's speech data and achieved good classification results [17]. The Genetic Algorithm-Wavelet Kernel-Extreme Learning Machine (GA-WK-ELM) [22] proposed by Derya used a WK-ELM to classify PD speech data. The wavelet kernel in effect maps the original PD speech data, and the mapped data are then input to the ELM for classification. Peker et al. combined complex-valued artificial neural networks and mRMR to achieve PD diagnosis [23]. Grover used a deep neural network to predict the severity of PD on the Parkinson's speech dataset [24]. Camilo et al. considered multimodal information (speech, gait and handwriting) to diagnose PD following a deep learning approach [25].

As can be seen from the papers cited above, existing nonlinear feature extraction algorithms for PD speech data fall into two categories. The first category, kernel mapping methods, realizes nonlinear mapping by applying a kernel mapping to PD speech samples; its typical representative is KPCA [17]. The disadvantage of kernel mapping methods is that an appropriate kernel function must be chosen based on prior knowledge of the data. The second category realizes the nonlinear mapping of PD speech data with neural networks (NNs), such as deep neural networks [24,25]. Although NNs have achieved good performance in the diagnosis of PD, they still have shortcomings: 1) Building an NN model requires a large amount of data, which many existing PD speech datasets cannot provide. 2) An NN model built from small-sample data is prone to over-fitting, which leads to poor generalization. 3) Building an NN model is very time-consuming. There is, in fact, another class of nonlinear feature extraction methods, manifold learning, that has been overlooked by researchers studying PD speech datasets.
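To make the kernel mapping category concrete, the following is a minimal kernel PCA sketch with a Gaussian (RBF) kernel. The kernel choice and the width parameter `gamma` are assumptions made for illustration; they show the kind of prior decision the text describes and are not taken from the cited work.

```python
import numpy as np


def rbf_kernel_pca(X, d, gamma):
    """Embed the rows of X into d dimensions with RBF-kernel PCA.

    gamma is the assumed kernel width: k(x, y) = exp(-gamma * ||x - y||^2).
    """
    sq = np.sum(X ** 2, axis=1)
    dist2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    K = np.exp(-gamma * dist2)

    # Center the kernel matrix in feature space.
    n = K.shape[0]
    one_n = np.full((n, n), 1.0 / n)
    Kc = K - one_n @ K - K @ one_n + one_n @ K @ one_n

    # eigh returns eigenvalues in ascending order; keep the d largest.
    vals, vecs = np.linalg.eigh(Kc)
    idx = np.argsort(vals)[::-1][:d]
    vals, vecs = vals[idx], vecs[:, idx]

    # Coordinates of the training samples on the kernel principal axes.
    return vecs * np.sqrt(np.maximum(vals, 0.0))
```

Changing `gamma` or swapping the kernel changes the embedding substantially, which illustrates why the choice of kernel is a practical drawback of this category.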

Locality Preserving Projections (LPP), a classic manifold learning algorithm, optimally preserves the neighborhood structure of the data during dimensionality reduction [26]. However, precisely because LPP preserves the neighborhood structure, it cannot effectively separate highly aliased data (such as PD speech data). Some algorithms have been proposed to overcome this shortcoming of LPP [27–30]. However, these improved LPP algorithms ignore the characteristics of PD speech datasets: most of them focus on increasing the inter-class variance of the data and ignore the large intra-class variance. In addition, they lack stability when mapping high-dimensional data. To solve these problems, a weighted local discriminant preservation projection embedded ensemble (w_LDPPEE) algorithm for the classification of PD speech data is proposed.
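For reference, the following is a minimal sketch of standard (unsupervised) LPP, not of the proposed w_LDPPEE algorithm, whose objective is not given in this excerpt. It assumes a symmetrized k-nearest-neighbor graph with heat-kernel weights exp(-||xi - xj||^2 / t) and solves the generalized eigenproblem X^T L X a = λ X^T D X a for the smallest eigenvalues. The defaults for `k`, `t` and the small ridge term `reg` are illustrative.

```python
import numpy as np


def lpp(X, d, k=5, t=1.0, reg=1e-8):
    """Standard LPP for X of shape (n samples, m features).

    Returns the projection matrix P (m x d) and the d smallest
    generalized eigenvalues.
    """
    n, m = X.shape
    sq = np.sum(X ** 2, axis=1)
    dist2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)

    # k-nearest-neighbor graph (self excluded), symmetrized.
    nn = dist2.copy()
    np.fill_diagonal(nn, np.inf)
    order = np.argsort(nn, axis=1)[:, :k]
    mask = np.zeros((n, n), dtype=bool)
    mask[np.repeat(np.arange(n), k), order.ravel()] = True
    mask |= mask.T

    # Heat-kernel weights on graph edges, degree matrix and graph Laplacian.
    W = np.where(mask, np.exp(-dist2 / t), 0.0)
    D = np.diag(W.sum(axis=1))
    L = D - W

    A = X.T @ L @ X
    B = X.T @ D @ X + reg * np.eye(m)      # ridge keeps B positive definite

    # Reduce the generalized problem to a standard symmetric one via Cholesky.
    C_inv = np.linalg.inv(np.linalg.cholesky(B))
    M = C_inv @ A @ C_inv.T
    vals, Y = np.linalg.eigh((M + M.T) / 2.0)   # ascending eigenvalues
    P = C_inv.T @ Y[:, :d]
    return P, vals[:d]
```

Low-dimensional coordinates are obtained as `X @ P`. Because the graph is built without class labels, neighbors from different classes are kept close together, which is the limitation for highly aliased data noted above.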

As far as we know, there is no public report of manifold learning based dimensionality reduction for the automatic assessment of rehabilitative speech treatment in PD. To our knowledge, this is therefore the first report in which a w_LDPPEE-based dimensionality reduction method is applied to the automatic assessment of rehabilitative speech treatment in PD. The main contributions and innovations of this paper are:

  1. An improved LPP algorithm is proposed and applied to PD speech diagnosis for the first time, which supplements PD speech diagnosis using manifold learning.

  2. According to the characteristics of PD speech data, an objective function suitable for processing PD speech data is designed. It preferentially reduces the intra-class variance of PD speech data and simultaneously increases the inter-class variance, which effectively reduces the high aliasing of PD speech data.

  3. Ensemble learning is introduced into the construction of the projection matrix to increase the stability of the algorithm.

  4. The proposed algorithm not only performs well on PD diagnosis, but also has excellent performance on the automatic assessment of rehabilitative speech treatment in PD.
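The intra-class and inter-class variance mentioned in the contributions can be quantified with the classical within-class and between-class scatter matrices. The sketch below computes these standard LDA-style quantities; it is not the weighted objective of w_LDPPEE, which is not specified in this excerpt, and the function name `scatter_matrices` is illustrative.

```python
import numpy as np


def scatter_matrices(X, y):
    """Return the within-class (S_w) and between-class (S_b) scatter matrices.

    X has shape (n samples, m features); y holds one class label per sample.
    """
    mu = X.mean(axis=0)
    m = X.shape[1]
    S_w = np.zeros((m, m))
    S_b = np.zeros((m, m))
    for c in np.unique(y):
        Xc = X[y == c]
        mu_c = Xc.mean(axis=0)
        centered = Xc - mu_c
        S_w += centered.T @ centered                 # spread inside class c
        diff = (mu_c - mu)[:, None]
        S_b += Xc.shape[0] * (diff @ diff.T)         # spread of class means
    return S_w, S_b
```

For a projection matrix `P`, `np.trace(P.T @ S_w @ P)` and `np.trace(P.T @ S_b @ P)` measure the intra-class and inter-class scatter after projection; a discriminative projection aims to make the first small and the second large.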

Section snippets

Data

Three representative datasets were used to verify the effectiveness of the proposed algorithm: the Parkinson Speech Dataset with Multiple Types of Sound Recordings Data Set (PSDMTSR), PARKINSONS, and a self-collected dataset (named SelfData). The first two datasets are freely available in the UCI Machine Learning Repository [31] and have been widely used in PD speech diagnosis. The third dataset is used for the automatic assessment of rehabilitative speech treatment in PD.

Experimental environment

The experiments were conducted in MATLAB, version 2018b, on a PC with an Intel(R) Core i5-8400 (2.8 GHz) CPU and 8 GB RAM. The operating system was Windows 10, 64-bit. The parameter settings of the proposed algorithm are shown in Table 3. The dimension d of the other classification algorithms was set to {5, 10, 15, …} and the kernel parameter t of the Heatkernel mode of LPP was set to {10^-4, 10^-3, …, 10^4}. In classifier learning, the number of trees in RF was set to 300 and the hidden

Discussion

Speech has been widely used in the clinical diagnosis of Parkinson's disease because of the richness of the information it contains and the convenience of collection and diagnosis. Dimensionality reduction can help improve diagnostic accuracy. However, speech is easily mixed with noise during acquisition, and it is easily affected by the emotional fluctuations of the speaker. Therefore, existing speech datasets often have high noise and high aliasing (large intra-class variance

Conclusion

The proposed algorithm is a feature extraction algorithm with superior performance. Compared with existing feature extraction, feature selection and state-of-the-art algorithms for PD speech diagnosis, it has higher Accuracy, Precision, Recall and G-mean. This means that the proposed algorithm can provide clinicians with more accurate clinical diagnostic information. In addition, compared with existing algorithms, the proposed algorithm also has excellent performance on the automatic

CRediT authorship contribution statement

Yuchuan Liu: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Software, Validation, Visualization, Writing - original draft. Yongming Li: Conceptualization, Data curation, Formal analysis, Methodology, Funding acquisition, Supervision, Validation, Writing - review & editing. Xiaoheng Tan: Conceptualization, Funding acquisition, Project administration, Supervision, Writing - review & editing. Pin Wang: Conceptualization, Data curation, Formal analysis, Resources,

Declaration of Competing Interest

The authors report no declarations of interest.

Acknowledgments

The authors would like to thank the editor and reviewers for their valuable comments and suggestions. We also would like to thank those individuals (or institutions) that have provided data support for this research. This work was supported in part by the National Natural Science Foundation of China (NSFC) under Grant 61571069 and Grant 61771080, in part by the Graduate Research and Innovation Foundation of Chongqing, China, under Grant CYB18068 and Grant CYB19058, in part by the Fundamental

References (46)

  • X. Deng et al., An improved method to construct basic probability assignment based on the confusion matrix for classification problem, Inf. Sci. (2016)

  • Tan Guo et al., Data induced masking representation learning for face data analysis, Knowl. Syst. (2019)

  • C. Arkinson et al., Parkin function in Parkinson’s disease, Science (2018)

  • L. Zhang et al., Neuroprotective effects of the novel GLP-1 long acting analogue semaglutide in the MPTP Parkinson’s disease mouse model, Neuropeptides (2018)

  • A. Tsanas et al., Accurate telemonitoring of Parkinson’s disease progression by noninvasive speech tests, IEEE Trans. Biomed. Eng. (2010)

  • A. Tsanas et al., Novel speech signal processing algorithms for high-accuracy classification of Parkinson’s disease, IEEE Trans. Bio-Med. Eng. (2012)

  • M. Peker et al., Computer-aided diagnosis of Parkinson’s disease using complex-valued neural networks and mRMR feature selection algorithm, J. Healthc. Eng. (2015)

  • T. Tuncer et al., Automated detection of Parkinson’s disease using minimum average maximum tree and singular value decomposition method with vowels, Biocybern. Biomed. Eng. (2019)

  • A.A. Spadoto et al., Improving Parkinson’s disease identification through evolutionary-based feature selection

  • O. Kursun et al., Selection of vocal features for Parkinson’s disease diagnosis, Int. J. Data Min. Bioinform. (2012)

  • Yuchuan Liu et al., Recognition algorithm of Parkinson’s disease based on weighted local discriminant preservation projection embedded ensemble algorithm

  • I. El Moudden et al., Feature selection and extraction for class prediction in dysphonia measures analysis: a case study on Parkinson’s disease speech rehabilitation, Technol. Health Care (2017)

  • I.E. Moudden et al., Automatic speech analysis in patients with Parkinson’s disease using feature dimension reduction

