Going Beyond the Cookie Theft Picture Test: Detecting Cognitive Impairments Using Acoustic Features

Braun, Franziska; Erzigkeit, Andreas; Lehfeld, Hartmut; Hillemacher, Thomas; Riedhammer, Korbinian; Bayerl, Sebastian P.

doi:10.1007/978-3-031-16270-1_36

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13502)

Included in the following conference series:

International Conference on Text, Speech, and Dialogue

Abstract

Standardized tests play a crucial role in the detection of cognitive impairment. Previous work demonstrated that automatic detection of cognitive impairment is possible using audio data from a standardized picture description task. The presented study goes beyond that, evaluating our methods on data taken from two standardized neuropsychological tests, namely the German SKT and a German version of the CERAD-NB, and a semi-structured clinical interview between a patient and a psychologist. For the tests, we focus on speech recordings of three sub-tests: reading numbers (SKT 3), interference (SKT 7), and verbal fluency (CERAD-NB 1). We show that acoustic features from standardized tests can be used to reliably discriminate cognitively impaired individuals from non-impaired ones. Furthermore, we provide evidence that even features extracted from random speech samples of the interview can be a discriminator of cognitive impairment. In our baseline experiments, we use OpenSMILE features and Support Vector Machine classifiers. In an improved setup, we show that using wav2vec 2.0 features instead, we can achieve an accuracy of up to 85%.
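The classification setup described in the abstract (fixed-length acoustic feature vectors per recording, fed to a Support Vector Machine) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the features are synthetic random numbers standing in for OpenSMILE or wav2vec 2.0 features, mean pooling over frames is one common way to obtain a fixed-length vector, and the classifier is a plain linear SVM trained by subgradient descent on the hinge loss. The function names `mean_pool`, `train_linear_svm`, `predict`, `accuracy`, and `synthetic_recording` are hypothetical.

```python
import random


def mean_pool(frames):
    """Average frame-level feature vectors into one utterance-level vector."""
    n = len(frames)
    return [sum(col) / n for col in zip(*frames)]


def train_linear_svm(X, y, epochs=200, lr=0.1, lam=0.01):
    """Fit a linear SVM by full-batch subgradient descent on the
    L2-regularized hinge loss. Labels must be +1 or -1."""
    dim = len(X[0])
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the L2 penalty
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # only margin violators contribute to the hinge term
                for j in range(dim):
                    grad_w[j] -= yi * xi[j] / len(X)
                grad_b -= yi / len(X)
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the separating hyperplane."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


def accuracy(w, b, X, y):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(predict(w, b, xi) == yi for xi, yi in zip(X, y)) / len(y)


def synthetic_recording(rng, label, n_frames=20, dim=4):
    """Synthetic stand-in for the frame-level acoustic features of one
    recording (NOT real data; class means are chosen arbitrarily)."""
    center = 1.0 if label == 1 else -1.0
    return [[rng.gauss(center, 0.5) for _ in range(dim)] for _ in range(n_frames)]


rng = random.Random(0)
labels = [1, -1] * 20  # +1 = impaired, -1 = non-impaired (arbitrary coding)
X = [mean_pool(synthetic_recording(rng, lab)) for lab in labels]

# Simple hold-out split: first 30 recordings for training, last 10 for testing.
w, b = train_linear_svm(X[:30], labels[:30])
test_acc = accuracy(w, b, X[30:], labels[30:])
print(f"held-out accuracy on synthetic data: {test_acc:.2f}")
```

Because the synthetic classes are well separated by construction, the accuracy printed here says nothing about real recordings; the sketch only shows the data flow from frame-level features to a binary decision.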

Notes

1. Research approved by the Ethics Committee of the Nuremberg Hospital under File No. IRB-2021-021; each subject gave informed consent prior to recording.

Author information

Authors and Affiliations

Technische Hochschule Nürnberg Georg Simon Ohm, Nuremberg, Germany
Franziska Braun, Korbinian Riedhammer & Sebastian P. Bayerl
Psychiatrische Klinik und Psychotherapie, Universitätsklinikum Erlangen, Erlangen, Germany
Andreas Erzigkeit
Klinik für Psychiatrie und Psychotherapie, Universitätsklinik der Paracelsus Medizinischen Privatuniversität, Klinikum Nürnberg, Nuremberg, Germany
Hartmut Lehfeld & Thomas Hillemacher

Corresponding author

Correspondence to Franziska Braun.

Editor information

Editors and Affiliations

Faculty of Informatics, Masaryk University, Brno, Czech Republic
Petr Sojka
Faculty of Informatics, Masaryk University, Brno, Czech Republic
Aleš Horák
Faculty of Informatics, Masaryk University, Brno, Czech Republic
Ivan Kopeček
Faculty of Informatics, Masaryk University, Brno, Czech Republic
Karel Pala

Cite this paper

Braun, F., Erzigkeit, A., Lehfeld, H., Hillemacher, T., Riedhammer, K., Bayerl, S.P. (2022). Going Beyond the Cookie Theft Picture Test: Detecting Cognitive Impairments Using Acoustic Features. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds) Text, Speech, and Dialogue. TSD 2022. Lecture Notes in Computer Science, vol 13502. Springer, Cham. https://doi.org/10.1007/978-3-031-16270-1_36

DOI: https://doi.org/10.1007/978-3-031-16270-1_36
Published: 16 September 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-16269-5
Online ISBN: 978-3-031-16270-1
eBook Packages: Computer Science, Computer Science (R0)
