
AI Hears Your Health: Computer Audition for Health Monitoring

  • Conference paper
  • First Online:
ICT for Health, Accessibility and Wellbeing (IHAW 2021)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1538)

Abstract

Acoustic sounds produced by the human body reflect changes in our mental, physiological, and pathological states. A deep analysis of such complex audio data can give insight into imminent or existing health issues. Automatic processing and understanding of such data require sophisticated machine learning approaches that can extract or learn robust features. In this paper, we introduce a set of machine learning toolkits for both supervised feature extraction and unsupervised representation learning from audio health data. We analyse the application of deep neural networks (DNNs), including end-to-end learning, recurrent autoencoders, and transfer learning, to speech- and body-acoustics-based health monitoring, and provide state-of-the-art results for each area. As showcase examples, we pick well-benchmarked tasks for speech and body acoustics from the popular annual Interspeech Computational Paralinguistics Challenge (ComParE). In particular, the speech-based health tasks are COVID-19 speech analysis, recognition of upper respiratory tract infections, and continuous sleepiness recognition. The body-acoustics health tasks are COVID-19 cough analysis, speech breath monitoring, heartbeat abnormality recognition, and snore sound classification. The results for all tasks demonstrate the suitability of deep computer audition approaches for health monitoring and automatic audio-based early diagnosis of health issues.
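The supervised feature-extraction pipeline mentioned above, as implemented in toolkits such as openSMILE, can be illustrated in highly simplified form: frame-level low-level descriptors (LLDs) are summarized by statistical functionals into one fixed-length vector per recording. The sketch below is an illustrative approximation, not the paper's actual feature set; the frame sizes, the two LLDs (log-energy and zero-crossing rate), and the choice of functionals (mean and standard deviation) are assumptions made for clarity.

```python
import math

def frame_signal(signal, frame_len=400, hop=160):
    """Split a 1-D signal into overlapping frames (25 ms / 10 ms at 16 kHz)."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]

def log_energy(frame):
    """Frame-level log-energy low-level descriptor (LLD)."""
    return math.log(sum(x * x for x in frame) + 1e-10)

def zero_crossing_rate(frame):
    """Fraction of consecutive sample pairs with a sign change."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if a * b < 0)
    return crossings / (len(frame) - 1)

def functionals(values):
    """Statistical functionals (mean, std) summarizing one LLD contour."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    return mean, math.sqrt(var)

def extract_features(signal):
    """Map a variable-length signal to a fixed-length feature vector."""
    frames = frame_signal(signal)
    llds = [(log_energy(f), zero_crossing_rate(f)) for f in frames]
    feats = []
    for contour in zip(*llds):        # one contour per LLD across all frames
        feats.extend(functionals(contour))
    return feats  # [energy_mean, energy_std, zcr_mean, zcr_std]
```

The resulting fixed-length vector can then be fed to any standard classifier, which mirrors the typical setup of the ComParE challenge baselines (where openSMILE extracts thousands of such functionals rather than four).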


Notes

  1. https://github.com/audeering/opensmile.
  2. https://github.com/end2you/end2you.
  3. https://github.com/auDeep/auDeep.
  4. https://github.com/DeepSpectrum/DeepSpectrum.
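The unsupervised representation learning mentioned in the abstract, as in auDeep's recurrent sequence-to-sequence autoencoders, can be sketched in drastically simplified form: a network is trained to reconstruct its own input, and the bottleneck activations then serve as learned features. The sketch below substitutes a plain linear autoencoder trained by gradient descent for the recurrent model, purely to illustrate the principle; the data, dimensions, and learning rate are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "spectrogram" data: 200 samples with 16 spectral bins each.
X = rng.normal(size=(200, 16))

# Linear autoencoder: 16 -> 4 -> 16, with a 4-dimensional bottleneck.
W_enc = rng.normal(scale=0.1, size=(16, 4))
W_dec = rng.normal(scale=0.1, size=(4, 16))

lr = 0.01
losses = []
for _ in range(500):
    Z = X @ W_enc                    # encode: latent representation
    X_hat = Z @ W_dec                # decode: reconstruction
    err = X_hat - X                  # reconstruction error
    losses.append(float((err ** 2).mean()))
    # Gradients of the mean squared error w.r.t. both weight matrices.
    grad_dec = Z.T @ err / len(X)
    grad_enc = X.T @ (err @ W_dec.T) / len(X)
    W_dec -= lr * grad_dec
    W_enc -= lr * grad_enc

# Learned representations: one 4-dimensional vector per input sample,
# usable as features for a downstream health classifier.
features = X @ W_enc
```

In auDeep the encoder and decoder are recurrent networks operating on mel-spectrogram sequences, so variable-length audio maps to a fixed-length representation; the training objective (reconstruction) is the same.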


Author information

Correspondence to Björn Schuller.


Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Amiriparian, S., Schuller, B. (2021). AI Hears Your Health: Computer Audition for Health Monitoring. In: Pissaloux, E., Papadopoulos, G.A., Achilleos, A., Velázquez, R. (eds) ICT for Health, Accessibility and Wellbeing. IHAW 2021. Communications in Computer and Information Science, vol 1538. Springer, Cham. https://doi.org/10.1007/978-3-030-94209-0_20


  • DOI: https://doi.org/10.1007/978-3-030-94209-0_20


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-94208-3

  • Online ISBN: 978-3-030-94209-0

  • eBook Packages: Computer Science (R0)
