Abstract
This paper investigates how well multitaper perceptual linear prediction (PLP) features of speech samples discriminate healthy subjects from subjects with early-stage Parkinson's disease. PLP features are conventionally obtained from a power spectrum computed with a single Hamming window taper. This spectral estimate exhibits large variance, which can be reduced by taking a weighted average of power spectra obtained with a set of tapered windows, an approach known as multitaper spectral estimation. Two multitaper techniques, the sine wave taper and the Thomson multitaper, are compared with conventional single-taper windowing. An artificial neural network classifies the PLP features extracted with each of the three tapering methods from the speech signals of healthy people and people with early-stage Parkinson's disease, and the resulting performances are compared. The multitaper techniques yield higher accuracy than the conventional single-taper technique. For both the sine wave tapers and the Thomson multitaper, accuracy is highest with five tapers. Relative to the conventional method, recognition accuracy improves by 7.5% with the sine tapers and by 6.9% with the Thomson tapers. The multitaper techniques also improve the other performance measures: equal error rate, false positive rate, false negative rate, sensitivity, and specificity.
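The weighted averaging of tapered power spectra described in the abstract can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names (`hamming_window`, `sine_tapers`, `power_spectrum`, `multitaper_spectrum`) are hypothetical, uniform weights 1/K are assumed because the abstract does not state the weights, a white-noise frame of 64 samples stands in for a speech frame, and a naive DFT is used only to keep the sketch dependency-free (an FFT would normally be used). The sine tapers follow the standard orthonormal form w_j(n) = sqrt(2/(N+1)) sin(pi j (n+1)/(N+1)).

```python
import cmath
import math
import random


def hamming_window(n_samples):
    """Conventional single taper: the Hamming window."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (n_samples - 1))
            for n in range(n_samples)]


def sine_tapers(n_samples, n_tapers):
    """Return n_tapers orthonormal sine tapers, each of length n_samples."""
    scale = math.sqrt(2.0 / (n_samples + 1))
    return [
        [scale * math.sin(math.pi * j * (n + 1) / (n_samples + 1))
         for n in range(n_samples)]
        for j in range(1, n_tapers + 1)
    ]


def power_spectrum(frame, window):
    """Power spectrum of one windowed frame for bins 0..N/2 (naive DFT)."""
    n_samples = len(frame)
    tapered = [x * w for x, w in zip(frame, window)]
    spectrum = []
    for k in range(n_samples // 2 + 1):
        acc = sum(v * cmath.exp(-2j * math.pi * k * n / n_samples)
                  for n, v in enumerate(tapered))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def multitaper_spectrum(frame, tapers, weights=None):
    """Weighted average of the power spectra obtained with each taper."""
    if weights is None:
        # Assumption: uniform weights; other weightings are possible.
        weights = [1.0 / len(tapers)] * len(tapers)
    spectra = [power_spectrum(frame, taper) for taper in tapers]
    return [sum(w * s[k] for w, s in zip(weights, spectra))
            for k in range(len(spectra[0]))]


# Illustrative input: a white-noise frame in place of a real speech frame.
random.seed(0)
frame = [random.gauss(0.0, 1.0) for _ in range(64)]

single_taper = power_spectrum(frame, hamming_window(len(frame)))
tapers = sine_tapers(len(frame), 5)  # five tapers, as in the abstract
estimate = multitaper_spectrum(frame, tapers)
```

In a PLP front end, `estimate` would take the place of the single-taper spectrum `single_taper` before the perceptual processing stages. Thomson tapers could be substituted for the sine tapers by passing a different taper set and weights to `multitaper_spectrum`.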
Acknowledgements
We would sincerely like to thank the Parkinson Disease Movement Disorder Society (PDMDS) of India for allowing us to collect speech samples of participants from various PDMDS centers.
Cite this article
Upadhya, S.S., Cheeran, A.N. & Nirmal, J.H. Multitaper perceptual linear prediction features of voice samples to discriminate healthy persons from early stage Parkinson diseased persons. Int J Speech Technol 21, 391–399 (2018). https://doi.org/10.1007/s10772-017-9473-6
DOI: https://doi.org/10.1007/s10772-017-9473-6