A Comparative Analysis of Machine Learning classifiers for Dysphonia-based classification of Parkinson’s Disease

Goyal, Jinee; Khandnor, Padmavati; Aseri, Trilok Chand

doi:10.1007/s41060-020-00234-0

Published: 15 October 2020

International Journal of Data Science and Analytics, Volume 11, pages 69–83 (2021)

Abstract

Parkinson’s Disease is the second most common neurodegenerative disease affecting the nervous system. There is no permanent cure for this disease, so early diagnosis is important for improving the quality of life of Parkinson’s patients. Distortion of the voice is one of the first symptoms to appear in Parkinson’s patients. Classifying patients from voice data, and comparing the classifiers that do so, therefore plays an important role. In this paper, various classification techniques are compared to show the potential of each classifier: SVM (Linear, RBF, Polynomial), DT, RF, LR, KNN, NB, MLP, AdaBoost, and XGBoost. Three feature selection techniques, mRMR, GA, and PCA, are also explored to reduce the dimensionality of the dataset without much loss of accuracy. The potential of voice features in the classification process is also shown.

Author information

Authors and Affiliations

Computer Science and Engineering Department, Punjab Engineering College (Deemed to be University), Chandigarh, India
Jinee Goyal, Padmavati Khandnor & Trilok Chand Aseri

Corresponding author

Correspondence to Jinee Goyal.

Ethics declarations

Conflict of interest

No conflicts of interest are declared related to the publication of this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A: Parameter selection for GA

The Genetic Algorithm (GA) has four input parameters that affect the performance of the system: the number of generations, the population size, the crossover probability, and the mutation probability. There is no direct formula for estimating the optimal values of these parameters, so an experimental approach is followed to tune them.
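
To make the role of the four parameters concrete, the following is a minimal sketch of a GA for feature selection, not the implementation used in the paper. The function name `run_ga`, the bit-mask encoding, binary tournament selection, one-point crossover, and per-individual bit-flip mutation are all assumptions chosen for illustration. The `toy_fitness` function is a hypothetical stand-in for the classifier accuracy that a real run would compute.

```python
import random


def run_ga(fitness, n_features, n_generations=20, population_size=30,
           crossover_prob=1.0, mutation_prob=0.2, seed=0):
    """Search for a feature mask (list of 0/1) that maximises `fitness`."""
    rng = random.Random(seed)
    # Each individual is a bit mask: 1 keeps a feature, 0 drops it.
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(population_size)]
    best = max(population, key=fitness)

    for _ in range(n_generations):
        scored = [(fitness(ind), ind) for ind in population]

        def tournament():
            # Binary tournament: the fitter of two random individuals wins.
            a, b = rng.sample(scored, 2)
            return list(a[1] if a[0] >= b[0] else b[1])

        children = []
        while len(children) < population_size:
            p1, p2 = tournament(), tournament()
            if rng.random() < crossover_prob:
                cut = rng.randrange(1, n_features)  # one-point crossover
                p1, p2 = p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]
            for child in (p1, p2):
                if rng.random() < mutation_prob:
                    i = rng.randrange(n_features)   # flip one random bit
                    child[i] = 1 - child[i]
                children.append(child)

        population = children[:population_size]
        generation_best = max(population, key=fitness)
        if fitness(generation_best) > fitness(best):
            best = generation_best

    return best, fitness(best)


# Hypothetical fitness: reward three "informative" features and charge a
# small cost per selected feature. A real run would instead train and
# score a classifier on the selected columns.
INFORMATIVE = {0, 3, 5}


def toy_fitness(mask):
    return sum(mask[i] for i in INFORMATIVE) - 0.1 * sum(mask)


best_mask, best_score = run_ga(toy_fitness, n_features=8)
```

The default arguments are the values this appendix settles on (20 generations, population 30, crossover probability 1, mutation probability 0.2).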

1.1 Generation size

To select the optimal generation size, experiments were run with 10, 20, 30, 40, 50, and 60 generations on the best-performing classifier, XGBoost. The results are shown in Table 4.

Table 4 Generation size

The maximum accuracy is 90.60% for dataset 1, reached with 20 generations, and 77.23% for dataset 2, reached with 40 generations. To choose between 20 and 40 generations, the other performance measures are compared, and a generation size of 20 is chosen: specificity is higher with 20 generations than with 40, and the computation time is also lower.
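
The tie-break described above (prefer higher specificity, then lower computation time) can be sketched as follows. The `Candidate` type, the `break_tie` helper, and the numbers in the usage lines are hypothetical. In particular, the numbers are placeholders and are not taken from Table 4.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    generations: int
    specificity: float  # higher is better
    seconds: float      # computation time, lower is better


def break_tie(candidates):
    """Pick the candidate with the highest specificity, then the lowest time."""
    return max(candidates, key=lambda c: (c.specificity, -c.seconds))


# Placeholder values for illustration only; NOT the values from Table 4.
tied = [Candidate(20, 0.80, 10.0), Candidate(40, 0.75, 20.0)]
chosen_candidate = break_tie(tied)
```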

1.2 Population size

The experiment for population size is done with sizes of 10, 15, 20, 25, and 30. The results are calculated with the classifier having maximum accuracy (XGBoost) and the optimal generation size (20). The results are shown in Table 5.

Table 5 Population size

The table shows that accuracy increases as the population size increases, while computation time does not increase much. We experimented with population sizes up to 30, but the range can be extended depending on the requirements of the system, i.e. whether accuracy is the major concern or a trade-off between accuracy and computation time is required.

1.3 Crossover probability

The experiments on crossover probability are done with probabilities of 0.5, 0.7, and 1, using the XGBoost classifier because it has the best performance. The optimal generation size of 20 and a population size of 30 are used. The results are shown in Table 6.

Table 6 Crossover probability

The table shows that a crossover probability of 1 gives the maximum accuracy. This means that crossover needs to be applied to the offspring in every generation.
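
As an illustration of what a crossover probability of 1 implies, the sketch below applies one-point crossover to a parent pair with a given probability. The helper name `maybe_crossover` and the choice of one-point crossover are assumptions, since the appendix does not state which crossover operator is used.

```python
import random


def maybe_crossover(parent_a, parent_b, crossover_prob, rng):
    """Return two children; recombine the parents with the given probability."""
    # rng.random() lies in [0, 1), so a probability of 1 always recombines
    # and a probability of 0 never does.
    if rng.random() < crossover_prob:
        cut = rng.randrange(1, len(parent_a))  # cut point strictly inside the mask
        return (parent_a[:cut] + parent_b[cut:],
                parent_b[:cut] + parent_a[cut:])
    return list(parent_a), list(parent_b)


rng = random.Random(0)
zeros, ones = [0] * 6, [1] * 6
always = [maybe_crossover(zeros, ones, 1.0, rng) for _ in range(50)]
never = [maybe_crossover(zeros, ones, 0.0, rng) for _ in range(50)]
```

With probability 1 every child mixes material from both parents, whereas lower values let some parent pairs pass into the next generation unchanged.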

1.4 Mutation probability

The mutation probability is selected by experimenting with probabilities of 0.1, 0.2, and 0.3. The experiments are done with the best classifier (XGBoost), the optimal generation size (20), the optimal population size (30), and the optimal crossover probability (1). The results are shown in Table 7.

Table 7 Mutation probability

The table shows that the optimal mutation probability is 0.2, as it gives the maximum accuracy of 91.40% on dataset 1 and 77.31% on dataset 2.
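
Taken together, Sections 1.1 to 1.4 tune one parameter at a time while holding earlier choices fixed. A simplified sketch of that procedure is given below. The candidate grids come from this appendix, but the function `tune_one_at_a_time`, the starting values for parameters not yet tuned, and the `toy_score` function are assumptions. `toy_score` is synthetic and deliberately built to peak at the values reported above; a real `evaluate` would run the GA and score XGBoost on the selected features. The sketch also picks each value by score alone and omits the specificity and time tie-break of Section 1.1.

```python
def tune_one_at_a_time(evaluate, grids, start):
    """Tune parameters in order, fixing each choice before moving on."""
    chosen = dict(start)
    for name, values in grids.items():
        # Try every candidate for this parameter with the others held fixed.
        chosen[name] = max(values, key=lambda v: evaluate({**chosen, name: v}))
    return chosen


# Candidate values listed in Sections 1.1 to 1.4.
grids = {
    "generations": [10, 20, 30, 40, 50, 60],
    "population": [10, 15, 20, 25, 30],
    "crossover": [0.5, 0.7, 1.0],
    "mutation": [0.1, 0.2, 0.3],
}

# Assumption: untuned parameters start at the first value of their grid.
start = {name: values[0] for name, values in grids.items()}


def toy_score(params):
    # Synthetic stand-in for measured accuracy; NOT data from Tables 4 to 7.
    target = {"generations": 20, "population": 30,
              "crossover": 1.0, "mutation": 0.2}
    return -sum(abs(params[key] - target[key]) for key in target)


chosen_params = tune_one_at_a_time(toy_score, grids, start)
```

Tuning parameters one at a time needs far fewer runs than a full grid search (17 evaluations here instead of 270), at the cost of ignoring interactions between parameters.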

About this article

Cite this article

Goyal, J., Khandnor, P. & Aseri, T.C. A Comparative Analysis of Machine Learning classifiers for Dysphonia-based classification of Parkinson’s Disease. Int J Data Sci Anal 11, 69–83 (2021). https://doi.org/10.1007/s41060-020-00234-0

Received: 26 May 2020
Accepted: 23 September 2020
Published: 15 October 2020
Issue Date: January 2021
DOI: https://doi.org/10.1007/s41060-020-00234-0
