Abstract
Parkinson’s disease (PD) is a neurodegenerative disorder of the nervous system in which the degeneration of brain cells impairs movement; among the elderly, it is the most common such condition after Alzheimer’s disease. Vocal impairment is one of the most significant symptoms in the early stages of PD. Consequently, recent research has focused extensively on identifying PD through the analysis of speech and voice patterns. In this paper, we propose a new architecture that extracts features from audio files using mel-frequency cepstral coefficients, the tonnetz representation, the chromagram, spectral contrast, and mel-scale spectrograms, and then feeds these features into a genetic algorithm for feature selection. The genetic algorithm is combined with a support vector machine that classifies the selected acoustic features, improving the accuracy of PD diagnosis. The maximum classification accuracy is 98% for /vowel-i/ and 90% for /grito/. The results indicate that the proposed feature combination can correctly diagnose PD from a person’s speech alone.
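The feature-selection step named in the abstract can be illustrated with a minimal sketch of a genetic algorithm that searches over binary feature masks. This is not the authors' implementation: the function name `ga_select`, the operators (tournament selection, single-point crossover, bit-flip mutation, elitism), and all hyperparameter values are assumptions chosen for illustration. The fitness function is passed in as a callable, and the toy fitness used here is a stand-in for the accuracy a support vector machine would achieve on the selected feature columns.

```python
import random
from typing import Callable, List, Tuple

Mask = List[int]  # 1 = keep the feature, 0 = drop it


def ga_select(
    n_features: int,
    fitness: Callable[[Mask], float],
    pop_size: int = 20,
    generations: int = 30,
    crossover_rate: float = 0.8,
    mutation_rate: float = 0.05,
    seed: int = 0,
) -> Tuple[Mask, float, List[float]]:
    """Return (best mask, best fitness, best-so-far fitness per generation)."""
    rng = random.Random(seed)

    def random_mask() -> Mask:
        mask = [rng.randint(0, 1) for _ in range(n_features)]
        if not any(mask):  # never evaluate an empty feature set
            mask[rng.randrange(n_features)] = 1
        return mask

    def tournament(scored):
        # Pick the fitter of two randomly drawn individuals.
        a, b = rng.sample(scored, 2)
        return a[1] if a[0] >= b[0] else b[1]

    population = [random_mask() for _ in range(pop_size)]
    history: List[float] = []
    best_score, best_mask = float("-inf"), population[0]

    for _ in range(generations):
        scored = [(fitness(m), m) for m in population]
        gen_score, gen_mask = max(scored, key=lambda s: s[0])
        if gen_score > best_score:
            best_score, best_mask = gen_score, gen_mask[:]
        history.append(best_score)

        next_pop = [best_mask[:]]  # elitism: carry the best mask forward
        while len(next_pop) < pop_size:
            p1, p2 = tournament(scored), tournament(scored)
            if rng.random() < crossover_rate and n_features > 1:
                cut = rng.randrange(1, n_features)  # single-point crossover
                child = p1[:cut] + p2[cut:]
            else:
                child = p1[:]
            # Bit-flip mutation: each gene flips with probability mutation_rate.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            if not any(child):
                child[rng.randrange(n_features)] = 1
            next_pop.append(child)
        population = next_pop

    return best_mask, best_score, history


# Hypothetical "useful" feature indices, used only by the toy fitness below.
INFORMATIVE = {0, 3, 7}


def toy_fitness(mask: Mask) -> float:
    # Stand-in for classifier accuracy: reward informative features,
    # lightly penalize the number of selected features.
    hits = sum(mask[i] for i in INFORMATIVE)
    return hits / len(INFORMATIVE) - 0.01 * sum(mask)


best_mask, best_score, history = ga_select(n_features=12, fitness=toy_fitness)
print(best_mask, round(best_score, 3))
```

To approximate the setting described in the abstract, `toy_fitness` would be replaced by a function that trains a support vector machine on the columns selected by the mask and returns its validation accuracy; the feature matrix itself would come from the acoustic descriptors listed above.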
Data availability
The data that support the findings of this study are not openly available. Data may be available upon reasonable request.
Funding
Not applicable.
Author information
Contributions
SSN conducted research and wrote the paper. ADD and PKS participated in the writing and preparation of the paper. All authors reviewed the results and approved the final version of the manuscript.
Ethics declarations
Ethical Approval
Not applicable.
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships.
About this article
Cite this article
Nayak, S.S., Darji, A.D. & Shah, P.K. Identification of Parkinson’s disease from speech signal using machine learning approach. Int J Speech Technol 26, 981–990 (2023). https://doi.org/10.1007/s10772-023-10068-3
DOI: https://doi.org/10.1007/s10772-023-10068-3