Multitaper perceptual linear prediction features of voice samples to discriminate healthy persons from early stage Parkinson diseased persons

International Journal of Speech Technology

Abstract

This paper investigates how well multitaper perceptual linear prediction (PLP) features of speech samples discriminate healthy subjects from subjects with early-stage Parkinson's disease. PLP features are conventionally obtained by computing the power spectrum with a single taper, the Hamming window. The resulting spectral estimate exhibits large variance, which can be reduced by taking a weighted average of power spectra obtained with a set of tapered windows; this is multitaper spectral estimation. Two multitaper techniques, the sine wave taper and the Thomson multitaper, are compared with conventional single-taper windowing. An artificial neural network classifies the PLP features extracted with each of the three windowing schemes from the speech signals of healthy people and of people with early-stage Parkinson's disease, and the resulting performances are compared. Both multitaper techniques yield higher accuracy than the conventional single-taper technique. For both the sine wave tapers and the Thomson multitaper, accuracy is highest with five tapers. Relative to the conventional method, recognition accuracy improves by 7.5% with the sine tapers and by 6.9% with the Thomson tapers. The multitaper techniques also improve the other performance measures: equal error rate, false positive rate, false negative rate, sensitivity and specificity.
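The variance reduction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the frame length (400 samples, roughly 25 ms at an assumed 16 kHz sampling rate), the uniform taper weights, the white-noise test frame and all function names are assumptions made for illustration. Only the sine tapers are shown; a Thomson multitaper estimate would use discrete prolate spheroidal sequences as the tapers instead.

```python
import numpy as np


def sine_tapers(frame_len, num_tapers):
    """Return a (num_tapers, frame_len) array of orthonormal sine tapers."""
    n = np.arange(1, frame_len + 1)
    k = np.arange(1, num_tapers + 1)[:, None]
    return np.sqrt(2.0 / (frame_len + 1)) * np.sin(np.pi * k * n / (frame_len + 1))


def multitaper_power_spectrum(frame, tapers, weights=None):
    """Weighted average of the power spectra of the tapered copies of a frame."""
    num_tapers = tapers.shape[0]
    if weights is None:
        # Uniform weights are an assumption; other weightings are possible.
        weights = np.full(num_tapers, 1.0 / num_tapers)
    spectra = np.abs(np.fft.rfft(tapers * frame, axis=1)) ** 2
    return weights @ spectra


def single_taper_power_spectrum(frame):
    """Conventional estimate: one Hamming window, scaled to unit energy."""
    window = np.hamming(len(frame))
    window = window / np.linalg.norm(window)
    return np.abs(np.fft.rfft(window * frame)) ** 2


def relative_spread(spectrum):
    """Standard deviation across frequency bins divided by the mean."""
    return np.std(spectrum) / np.mean(spectrum)


# White noise has a flat true spectrum, so any spread across bins
# reflects estimator variance rather than signal structure.
rng = np.random.default_rng(0)
frame = rng.standard_normal(400)      # hypothetical 25 ms frame at 16 kHz

tapers = sine_tapers(len(frame), 5)   # five tapers, as in the abstract
multitaper = multitaper_power_spectrum(frame, tapers)
single = single_taper_power_spectrum(frame)

print("relative spread, single taper:", relative_spread(single))
print("relative spread, five sine tapers:", relative_spread(multitaper))
```

In a PLP front end, the multitaper estimate would simply replace the single-taper power spectrum before the Bark-scale filtering and the later PLP stages; those stages are not shown here.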

Acknowledgements

We sincerely thank the Parkinson Disease Movement Disorder Society (PDMDS) of India for allowing us to collect speech samples from participants at various PDMDS centers.

Author information

Corresponding author

Correspondence to Savitha S. Upadhya.

About this article

Cite this article

Upadhya, S.S., Cheeran, A.N. & Nirmal, J.H. Multitaper perceptual linear prediction features of voice samples to discriminate healthy persons from early stage Parkinson diseased persons. Int J Speech Technol 21, 391–399 (2018). https://doi.org/10.1007/s10772-017-9473-6

  • DOI: https://doi.org/10.1007/s10772-017-9473-6
