Prediction of the Unified Parkinson’s Disease Rating Scale assessment using a genetic programming system with geometric semantic genetic operators

doi:10.1016/j.eswa.2014.01.018

Expert Systems with Applications

Volume 41, Issue 10, August 2014, Pages 4608-4616

https://doi.org/10.1016/j.eswa.2014.01.018

Highlights

• Assessment of Parkinson’s disease (PD) symptom progression using a CI system.
• System that includes the concept of semantics in the search process.
• Results achieved using the largest database of PD speech in existence.
• Better results than the ones produced by standard GP and other ML methods.
• Results outperform the best published results achieved using the same dataset.

Abstract

The Unified Parkinson’s Disease Rating Scale (UPDRS) is the most widely used scale for tracking Parkinson’s disease symptom progression. Currently, the tracking process requires a patient to undergo invasive and time-consuming specialized examinations in hospital clinics, under the supervision of trained medical staff. The process is therefore costly and logistically inconvenient for both patients and clinicians. For this reason, powerful new computational tools, aimed at making the process more automatic, cheaper and less invasive, are increasingly necessary. The purpose of this paper is to investigate the use of an innovative intelligent system based on genetic programming for the prediction of UPDRS assessment, using only data derived from simple, self-administered and non-invasive speech tests. The system we propose is called geometric semantic genetic programming and is based on recently defined geometric semantic genetic operators. Experimental results, achieved using the largest database of Parkinson’s disease speech in existence (approximately 6000 recordings from 42 Parkinson’s disease patients, recruited in a six-month, multi-centre trial), show the appropriateness of the proposed system for the prediction of UPDRS assessment. In particular, the results obtained with geometric semantic genetic programming are significantly better than the ones produced by standard genetic programming and other state-of-the-art machine learning methods, both on training and on unseen test data.

Introduction

Neurological disorders, including Parkinson’s disease (PD), profoundly affect the lives of patients and their families (Caap-Ahlgren & Dehlin, 2002). PD is a disorder of the central nervous system that leads to severe difficulties with body motions. It is the second most common neurodegenerative disorder after Alzheimer’s disease (de Rijk et al., 2000) and it is estimated that more than one million people in North America alone are affected by it (Lang & Lozano, 1998). Moreover, as explained in Little, McSharry, Hunter, Spielman, and Ramig (2009), because of the rapid increase in the average population age in several countries, and since the risk of contracting PD increases after the age of 60 (Van Den Eeden et al., 2013), this number is expected to rise in the next few years. As a direct consequence, the medical care costs for patients with PD are expected to rise in the future (Huse et al., 2005). The currently available therapies aim at improving the functional capacity of the patient for as long as possible; however, they are not able to modify the progression of the neurodegenerative process (Singh, Pillay, & Choonara, 2007). Most people affected by PD will therefore be substantially dependent on clinical intervention.

Tracking the progression of PD symptoms is a complex task. It often relies on a system for measuring the intensity of the symptoms called the Unified Parkinson’s Disease Rating Scale (UPDRS). The UPDRS was developed as an effort to incorporate elements from existing scales and to provide a comprehensive, efficient and flexible way of measuring and monitoring PD-related disability and impairment (Movement Disorder Society, 2003). Prior to its development, multiple scales were used in different hospital clinics and health centers, making comparative assessments difficult. One of the core advantages of the UPDRS is that it was developed as a compound scale to capture multiple aspects of PD. It assesses both motor disability and motor impairment. Of all analogous available clinical scales, the UPDRS is currently the most commonly used one (Ramaker, Marinus, Stiggelbout, & Van Hilten, 2002). It reflects the presence and severity of symptoms, expressing them on a scale from 0 to 176, with 0 representing a healthy state and 176 total disability. The UPDRS contains three sections:

• Mentation, Behavior and Mood.
• Activities of daily living.
• Motor.

The motor section of the UPDRS encompasses tasks such as speech, facial expression, tremor and rigidity and expresses the severity of the related symptoms on a scale from 0 to 108, where 0 represents a symptom-free state and 108 denotes severe motor impairment.

For many people affected by PD, the specialized medical examinations needed to estimate the severity of their symptoms are difficult and invasive, and they have to be performed by trained medical staff. Thus, as described in Tsanas, Little, McSharry, and Ramig (2010), symptom monitoring is costly and logistically inconvenient for patients and clinicians. All these critical aspects highlight the need for reliable and accurate computational techniques that can estimate the UPDRS automatically and effectively.

In this paper, we present a comparative study of a set of computational methods aimed at predicting the severity of the PD symptoms in their entirety (i.e. including all three sections of the UPDRS) and the severity of the symptoms considered in the motor section of the UPDRS. The studied methods attempt to express these quantities as a function of several other features related to patients. Thus, the application reduces to two symbolic regression problems, each with its own dataset. The two datasets contain identical features and differ only in the target values to be predicted. The dataset whose target is the severity of the general PD symptoms (including all three sections of the UPDRS) will be called total-UPDRS from now on, while the one whose target is the severity of the motor symptoms will be called motor-UPDRS.
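The two-dataset setup described above can be sketched as follows. This is a minimal illustration, not the authors' code: the helper name `make_regression_problems`, the column layout and the random stand-in values are all assumptions made for the example, and no real recordings are used.

```python
import numpy as np

def make_regression_problems(table, feature_cols, target_cols):
    """Build one (X, y) pair per target; every pair shares the same features."""
    X = table[:, feature_cols]
    return {name: (X, table[:, col]) for name, col in target_cols.items()}

# Synthetic stand-in table: 5 rows, 3 placeholder features, 2 target columns.
# The column positions are hypothetical, chosen only for this example.
rng = np.random.default_rng(0)
table = rng.random((5, 5))

problems = make_regression_problems(
    table,
    feature_cols=[0, 1, 2],
    target_cols={"total-UPDRS": 3, "motor-UPDRS": 4},
)

X_total, y_total = problems["total-UPDRS"]
X_motor, y_motor = problems["motor-UPDRS"]
```

Both problems see exactly the same inputs, so any difference in predictive accuracy between them comes from the target, not from the features.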

In particular, the focus of this paper is on an intelligent system based on genetic programming (Koza, 1992, Poli et al., 2008). We use a recently introduced version of genetic programming that uses so-called geometric semantic genetic operators. We compare the results obtained with this new version of genetic programming to the ones returned by standard genetic programming and a set of different state-of-the-art machine learning methods.
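As a rough illustration of what geometric semantic operators do, the sketch below applies a formulation commonly cited in the geometric semantic GP literature for real-valued regression: crossover builds (T1 · TR) + ((1 − TR) · T2) and mutation builds T + ms · (TR1 − TR2). It is an assumption that this matches the exact variant used in this paper. To stay short, the code works directly on semantics (the vector of outputs a program produces on the training cases) rather than on program trees, takes TR in crossover to be a random constant in [0, 1], and uses synthetic data; all function names are hypothetical.

```python
import numpy as np

def gs_crossover(sem1, sem2, r):
    """Semantics of (T1 * TR) + ((1 - TR) * T2), with TR a constant r in [0, 1]."""
    if not 0.0 <= r <= 1.0:
        raise ValueError("r must lie in [0, 1]")
    return r * sem1 + (1.0 - r) * sem2

def gs_mutation(sem, rand1, rand2, ms):
    """Semantics of T + ms * (TR1 - TR2); rand1 and rand2 hold outputs in [0, 1]."""
    return sem + ms * (rand1 - rand2)

def rmse(sem, target):
    """Root mean squared error between a semantics vector and the targets."""
    return float(np.sqrt(np.mean((sem - target) ** 2)))

# Synthetic targets and parent semantics (stand-ins, not real UPDRS values).
rng = np.random.default_rng(0)
n_cases = 20
target = rng.normal(size=n_cases)
parent1 = rng.normal(size=n_cases)
parent2 = rng.normal(size=n_cases)

# The child lies on the segment joining the parents in semantic space.
child = gs_crossover(parent1, parent2, r=rng.random())

# The mutant moves each output by less than the mutation step ms.
ms = 0.1
mutant = gs_mutation(child, rng.random(n_cases), rng.random(n_cases), ms)
```

Because the child is a convex combination of its parents, its RMSE can never exceed that of the worse parent, and the mutation perturbs every output by less than `ms`. These two properties are what make the search in semantic space well behaved in this formulation.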

The paper is organized as follows: Section 2 introduces standard genetic programming. Section 3 presents and motivates geometric semantic genetic operators. Section 4 describes the data we used and our experimental settings, and presents a detailed analysis of the results, comparing them with those of several other machine learning techniques. Finally, Section 5 concludes the paper.

Section snippets

Genetic programming

Models lie at the core of any technology in any industry, be it finance, health, manufacturing, services, mining, or information technology. The task of data-driven modeling lies in using a limited number of observations of system variables for inferring relationships among these variables. The design of reliable learning machines for data-driven modeling tasks is of strategic importance, as there are many systems that cannot be accurately modeled by classical mathematical or statistical

Geometric semantic operators

In the last few years, GP has been extensively used both in industry and academia (Arcuri and Yao, 2010, Chan et al., 2010, Choi and Choi, 2012, dos Santos et al., 2011, Koza et al., 2008, Moreno-Torres et al., 2013, Ravisankar et al., 2010, Trujillo et al., 2012, Yeun et al., 2000, Wongseree et al., 2007) and it has produced a wide set of results that have been described as human-competitive (Koza, 2010). While these results have demonstrated the appropriateness of GP in tackling real-life

Data set

This study makes use of the recordings described in Goetz et al. (2009) and in Tsanas et al. (2010), where 52 subjects with idiopathic PD were recruited. A subject was diagnosed with PD if he or she had at least two of the following: rest tremor, bradykinesia (slow movement) or rigidity, without evidence of other forms of parkinsonism. The study was supervised by six US medical centers: Georgia Institute of Technology (7 subjects), National Institutes of Health (10 subjects), Oregon Health and Science

Conclusions

Tracking the progression of Parkinson’s disease (PD) symptoms is very complex, and new and powerful computational methods are needed to automate it and make it faster and more reliable. The objective of this paper was to present a computational intelligence method that could outperform the state-of-the-art ones in terms of prediction accuracy of the PD symptom progression, automatically discovering insightful relationships between dysphonia measures and the well-known Unified

Acknowledgments

This work was supported by national funds through FCT under contract PEst-OE/EEI/LA0021/2013 and by projects MassGP (PTDC/EEI-CTP/2975/2012), EnviGP (PTDC/EIA-CCO/103363/2008) and InteleGen (PTDC/DTP-FTO/1747/2012), Portugal.

References (41)

K. Chan et al. Modeling manufacturing processes using a genetic programming-based fuzzy regression with detection of outliers. Information Sciences (2010)

W.J. Choi et al. Genetic programming-based feature transform and classification for the automatic detection of pulmonary nodules on computed tomography images. Information Sciences (2012)

J. dos Santos et al. A relevance feedback method based on genetic programming for classification of remote sensing images. Information Sciences (2011)

J.R. Koza et al. Routine high-return human-competitive automated problem-solving by means of genetic programming. Information Sciences (2008)

J.G. Moreno-Torres et al. Repairing fractures between data using genetic programming-based feature extraction: A case study in cancer diagnosis. Information Sciences (2013)

P. Ravisankar et al. Failure prediction of dotcom companies using neural network-genetic programming hybrids. Information Sciences (2010)

N. Singh et al. Advances in the treatment of Parkinson’s disease. Progress in Neurobiology (2007)

L. Trujillo et al. Evolving estimators of the pointwise Hölder exponent with genetic programming. Information Sciences (2012)

N.Q. Uy et al. On the roles of semantic locality of crossover in genetic programming. Information Sciences (2013)

W. Wongseree et al. Thalassaemia classification by neural networks and genetic programming. Information Sciences (2007)

Y. Yeun et al. Function approximations by superimposing genetic programming trees: With applications to engineering problems. Information Sciences (2000)

A. Arcuri et al. Co-evolutionary automatic programming for software development. Information Sciences (2010)

L. Beadle et al. Semantically driven mutation in genetic programming

P. Boersma. Praat, a system for doing phonetics by computer. Glot International (2001)

M. Caap-Ahlgren et al. Factors of importance to the caregiver burden experienced by family caregivers of Parkinson’s disease patients. Aging Clinical and Experimental Research (2002)

M.C. de Rijk et al. Prevalence of Parkinson’s disease in Europe: A collaborative study of population-based cohorts. Neurologic diseases in the elderly research group. Neurology (2000)

C.G. Goetz et al. Testing objective measures of motor impairment in early Parkinson’s disease: Feasibility study of an at-home testing device. Movement Disorders (2009)

S. Haykin. Neural networks: A comprehensive foundation (1999)

Hoffmann, L. (2009). Multivariate isotonic regression and its algorithms, Wichita State University, College of Liberal...

D.M. Huse et al. Burden of illness in Parkinson’s disease. Movement Disorders (2005)
