Fusing Various Audio Feature Sets for Detection of Parkinson’s Disease from Sustained Voice and Speech Recordings

Vaiciukynas, Evaldas; Verikas, Antanas; Gelzinis, Adas; Bacauskiene, Marija; Vaskevicius, Kestutis; Uloza, Virgilijus; Padervinskis, Evaldas; Ciceliene, Jolita

doi:10.1007/978-3-319-43958-7_39

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9811)

Included in the following conference series:

International Conference on Speech and Computer

Abstract

The aim of this study is to analyze voice and speech recordings for the task of Parkinson’s disease detection. The voice modality corresponds to sustained phonation of /a/, and the speech modality to a short sentence in the Lithuanian language. Diverse information is extracted from the recordings by 22 well-known audio feature sets. Random forest is used as a learner, both for individual feature sets and for decision-level fusion. Essentia descriptors were found to be the best individual feature set, achieving an equal error rate of 16.3% for voice and 13.3% for speech. Fusion of feature sets and modalities improved detection, achieving an equal error rate of 10.8%. Variable importance in the fusion revealed the speech modality to be more important than voice.
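The equal error rate quoted above is the operating point at which the false acceptance rate equals the false rejection rate. The sketch below is only an illustration of that metric and of the general idea of decision-level fusion; it is not the authors' pipeline. It assumes made-up toy scores (not data from the paper), a hypothetical helper named equal_error_rate, and a plain average of the two detector outputs as a stand-in for the random forest meta-learner that the abstract describes.

```python
def equal_error_rate(scores, labels):
    """Approximate EER by sweeping thresholds over the observed scores.

    labels: 1 = patient, 0 = healthy control (an assumed convention).
    Returns the mean of FAR and FRR at the threshold where they are closest.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    best_gap, eer = None, None
    for t in sorted(set(scores)):
        # A recording is flagged as positive when its score is >= t.
        frr = sum(s < t for s in pos) / len(pos)   # missed patients
        far = sum(s >= t for s in neg) / len(neg)  # false alarms
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Toy detector outputs, invented for illustration only.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
voice = [0.9, 0.8, 0.4, 0.7, 0.6, 0.3, 0.2, 0.1]
speech = [0.8, 0.3, 0.9, 0.6, 0.2, 0.4, 0.1, 0.25]

# Stand-in for decision-level fusion: average the two modality scores.
# The paper instead feeds such decisions to a random forest.
fused = [(v + s) / 2 for v, s in zip(voice, speech)]

for name, scores in [("voice", voice), ("speech", speech), ("fused", fused)]:
    print(f"{name}: EER = {equal_error_rate(scores, labels):.3f}")
```

In this toy case each modality misranks a different recording, so combining them separates the classes better than either alone, which is the intuition behind fusing feature sets and modalities.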

Acknowledgments

This research was funded by a grant (No. MIP-075/2015) from the Research Council of Lithuania.

Author information

Authors and Affiliations

Department of Electrical Power Systems, Kaunas University of Technology, Kaunas, Lithuania
Evaldas Vaiciukynas, Antanas Verikas, Adas Gelzinis, Marija Bacauskiene & Kestutis Vaskevicius
Department of Information Systems, Kaunas University of Technology, Kaunas, Lithuania
Evaldas Vaiciukynas
Intelligent Systems Laboratory, Centre for Applied Intelligent Systems Research, Halmstad University, Halmstad, Sweden
Antanas Verikas
Department of Otolaryngology, Lithuanian University of Health Sciences, Kaunas, Lithuania
Virgilijus Uloza & Evaldas Padervinskis
Department of Neurology, Lithuanian University of Health Sciences, Kaunas, Lithuania
Jolita Ciceliene

Corresponding author

Correspondence to Evaldas Vaiciukynas.

Editor information

Editors and Affiliations

SPIIRAS, Saint-Petersburg, Russia
Andrey Ronzhin
Moscow State Linguistic University, Moscow, Russia
Rodmonga Potapova
Budapest University of Technology and Economics, Budapest, Hungary
Géza Németh

About this paper

Cite this paper

Vaiciukynas, E. et al. (2016). Fusing Various Audio Feature Sets for Detection of Parkinson’s Disease from Sustained Voice and Speech Recordings. In: Ronzhin, A., Potapova, R., Németh, G. (eds) Speech and Computer. SPECOM 2016. Lecture Notes in Computer Science, vol 9811. Springer, Cham. https://doi.org/10.1007/978-3-319-43958-7_39

DOI: https://doi.org/10.1007/978-3-319-43958-7_39
Published: 13 August 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-43957-0
Online ISBN: 978-3-319-43958-7
eBook Packages: Computer Science, Computer Science (R0)
