Multitaper perceptual linear prediction features of voice samples to discriminate healthy persons from early stage Parkinson diseased persons

Upadhya, Savitha S.; Cheeran, A. N.; Nirmal, J. H.

doi:10.1007/s10772-017-9473-6

Published: 15 November 2017

Volume 21, pages 391–399, (2018)

International Journal of Speech Technology

Savitha S. Upadhya¹,
A. N. Cheeran¹ &
J. H. Nirmal²

Abstract

This paper investigates how well multitaper perceptual linear prediction (PLP) features of speech samples discriminate healthy subjects from subjects with early-stage Parkinson’s disease. PLP features are conventionally obtained by computing the power spectrum with a single tapered Hamming window. The spectrum estimated this way exhibits large variance, which can be reduced by computing a weighted average of power spectra obtained with a set of tapered windows; this is multitaper spectral estimation. Two multitaper techniques, the sine wave taper and the Thomson multitaper, are compared with conventional single-taper windowing. An artificial neural network classifies the PLP features extracted by applying the three types of window tapers to the speech signals of healthy people and people with early-stage Parkinson’s disease, and the respective performances are compared. The multitaper techniques give higher accuracy than the conventional single-taper technique. For both the sine wave tapers and the Thomson multitaper, accuracy is highest with five tapers. Compared with the conventional method, recognition accuracy improves by 7.5% with the sine tapers and by 6.9% with the Thomson tapers. The multitaper techniques also improve the other performance measures: equal error rate, false positive rate, false negative rate, sensitivity and specificity.
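
The weighted averaging of tapered power spectra described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the 64-sample frame, the uniform weights and the naive DFT are assumptions made for illustration; only the use of sine tapers and the choice of five tapers come from the abstract. The Thomson variant, which uses Slepian sequences as tapers, is not shown.

```python
import cmath
import math
import random


def sine_tapers(n, k):
    """Return k orthonormal sine tapers of length n."""
    scale = math.sqrt(2.0 / (n + 1))
    return [
        [scale * math.sin(math.pi * (j + 1) * (t + 1) / (n + 1)) for t in range(n)]
        for j in range(k)
    ]


def power_spectrum(frame):
    """Naive DFT power spectrum |X(f)|^2 for bins 0..n/2 (adequate for short frames)."""
    n = len(frame)
    spectrum = []
    for f in range(n // 2 + 1):
        x = sum(frame[t] * cmath.exp(-2j * math.pi * f * t / n) for t in range(n))
        spectrum.append(abs(x) ** 2)
    return spectrum


def multitaper_spectrum(frame, tapers, weights=None):
    """Weighted average of the power spectra of the frame under each taper."""
    k = len(tapers)
    if weights is None:
        # Assumption: uniform weights; published schemes may weight tapers differently.
        weights = [1.0 / k] * k
    estimate = [0.0] * (len(frame) // 2 + 1)
    for taper, w in zip(tapers, weights):
        tapered = [x * h for x, h in zip(frame, taper)]
        for f, p in enumerate(power_spectrum(tapered)):
            estimate[f] += w * p
    return estimate


# Usage: a synthetic noise frame stands in for one frame of a speech signal.
random.seed(0)
frame = [random.gauss(0.0, 1.0) for _ in range(64)]
tapers = sine_tapers(len(frame), 5)  # five tapers, the count the abstract reports as best
estimate = multitaper_spectrum(frame, tapers)
```

Because the tapers are mutually orthogonal, the individual spectra are only weakly correlated, so averaging them lowers the variance of the estimate at the cost of some frequency resolution. In a PLP front end, an estimate of this kind would replace the single Hamming-windowed power spectrum before the perceptual processing steps.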

Acknowledgements

We would sincerely like to thank the Parkinson Disease Movement Disorder Society (PDMDS) of India for allowing us to collect speech samples of participants from various PDMDS centers.

Author information

Authors and Affiliations

Electrical Engineering Department, Veermata Jijabai Technological Institute, Matunga, Mumbai, Maharashtra, 400019, India
Savitha S. Upadhya & A. N. Cheeran
Electronics Engineering Department, K J Somaiya College of Engineering, Vidyavihar, Mumbai, Maharashtra, 400077, India
J. H. Nirmal

Corresponding author

Correspondence to Savitha S. Upadhya.

About this article

Cite this article

Upadhya, S.S., Cheeran, A.N. & Nirmal, J.H. Multitaper perceptual linear prediction features of voice samples to discriminate healthy persons from early stage Parkinson diseased persons. Int J Speech Technol 21, 391–399 (2018). https://doi.org/10.1007/s10772-017-9473-6

Received: 13 July 2017
Accepted: 02 November 2017
Published: 15 November 2017
Issue Date: September 2018
DOI: https://doi.org/10.1007/s10772-017-9473-6
