
Voice Pathology Classification by Using Features from High-Speed Videos

  • Conference paper
Artificial Intelligence in Medicine (AIME 2009)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5651)


Abstract

For the diagnosis of pathological voices, it is particularly important to examine the dynamic properties of the underlying vocal fold (VF) movements, which occur at a fundamental frequency of 100–300 Hz. To this end, a patient’s laryngeal oscillation patterns are captured with state-of-the-art endoscopic high-speed (HS) camera systems capable of recording 4000 frames/second. To date, the clinical analysis of these HS videos is commonly performed subjectively via slow-motion playback. Hence, the resulting diagnoses are inherently error-prone and exhibit high inter-rater variability. This paper presents an objective method for overcoming this drawback, which employs a quantitative description and classification approach based on a novel image analysis strategy called Phonovibrography. By extracting the relevant VF movement information from HS videos, the spatio-temporal patterns of laryngeal activity are captured using a set of specialized features. As a performance reference, conventional voice analysis features are also computed. The derived features are analyzed with different machine learning (ML) algorithms on clinically meaningful classification tasks. The applicability of the approach is demonstrated using a clinical data set comprising individuals with normophonic and paralytic voices. The results indicate that the presented approach is promising for providing reliable diagnosis support in the future.
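As a rough illustration of the figures quoted in the abstract, the sketch below computes how many video frames cover one vocal fold oscillation cycle at a recording rate of 4000 frames/second. The function name `frames_per_cycle` is hypothetical and does not come from the paper; only the frame rate and the 100–300 Hz range are taken from the abstract.

```python
def frames_per_cycle(frame_rate_hz: float, fundamental_hz: float) -> float:
    """Return the number of video frames recorded during one oscillation cycle."""
    if frame_rate_hz <= 0 or fundamental_hz <= 0:
        raise ValueError("rates must be positive")
    return frame_rate_hz / fundamental_hz


FRAME_RATE = 4000.0  # frames/second, as stated in the abstract

# Lower and upper end of the fundamental frequency range from the abstract.
for f0 in (100.0, 300.0):
    print(f"{f0:.0f} Hz -> {frames_per_cycle(FRAME_RATE, f0):.1f} frames per cycle")
```

Under these numbers, each oscillation cycle is sampled by roughly 13 to 40 frames. For comparison, a conventional video rate of about 25 frames/second would give well under one frame per cycle across the same range, so it cannot resolve individual cycles.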




Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Voigt, D., Döllinger, M., Yang, A., Eysholdt, U., Lohscheller, J. (2009). Voice Pathology Classification by Using Features from High-Speed Videos. In: Combi, C., Shahar, Y., Abu-Hanna, A. (eds) Artificial Intelligence in Medicine. AIME 2009. Lecture Notes in Computer Science, vol 5651. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02976-9_44


  • DOI: https://doi.org/10.1007/978-3-642-02976-9_44

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-02975-2

  • Online ISBN: 978-3-642-02976-9

  • eBook Packages: Computer Science, Computer Science (R0)
