Pathological Voice Analysis and Classification Based on Empirical Mode Decomposition

Chapter in: Development of Multimodal Interfaces: Active Listening and Synchrony

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5967)

Abstract

Empirical mode decomposition (EMD) is a signal analysis algorithm recently introduced by Huang. It is a fully data-driven, non-linear method that decomposes a signal into AM-FM components. In this paper, two new EMD-based methods for the analysis and classification of pathological voices are presented and applied to speech signals corresponding to real and simulated sustained vowels. We first introduce a method for the robust extraction of the fundamental frequency of sustained vowels, a quantity whose determination is crucial for pathological voice analysis and diagnosis. This method is based on the ensemble empirical mode decomposition (EEMD) algorithm, and its performance is compared with that of state-of-the-art methods. As a second EMD-based tool, we explore spectral properties of the intrinsic mode functions and apply them to the classification of normal and pathological sustained vowels. We show that, even with a basic pattern classification algorithm, the selected spectral features of only three modes are enough to discriminate between normal and pathological voices.
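
For readers unfamiliar with EMD, the sifting idea behind the decomposition can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses piecewise-linear envelopes instead of the cubic splines of the standard algorithm, a fixed number of sifting iterations instead of a stopping criterion, and no added noise, so it is plain EMD rather than EEMD. All function names and the test signal are hypothetical.

```python
import math

def local_extrema(x):
    """Return indices of strict local maxima and minima of a sequence."""
    maxima, minima = [], []
    for i in range(1, len(x) - 1):
        if x[i - 1] < x[i] > x[i + 1]:
            maxima.append(i)
        elif x[i - 1] > x[i] < x[i + 1]:
            minima.append(i)
    return maxima, minima

def linear_envelope(idx, x):
    """Piecewise-linear envelope through (i, x[i]) for i in idx.

    Assumption: linear interpolation replaces the usual cubic spline,
    and the envelope is held constant outside the first/last extremum.
    """
    n = len(x)
    env = [0.0] * n
    first, last = idx[0], idx[-1]
    for t in range(0, first + 1):
        env[t] = x[first]
    for t in range(last, n):
        env[t] = x[last]
    for a, b in zip(idx, idx[1:]):
        for t in range(a, b + 1):
            w = (t - a) / (b - a)
            env[t] = (1.0 - w) * x[a] + w * x[b]
    return env

def sift(x, n_sift=10):
    """Extract one mode by repeatedly subtracting the mean envelope."""
    h = list(x)
    for _ in range(n_sift):  # fixed iteration count (simplification)
        maxima, minima = local_extrema(h)
        if len(maxima) < 2 or len(minima) < 2:
            break
        upper = linear_envelope(maxima, h)
        lower = linear_envelope(minima, h)
        h = [v - 0.5 * (u + l) for v, u, l in zip(h, upper, lower)]
    return h

def emd(x, max_modes=5):
    """Decompose x into a list of modes plus a final residue."""
    residue = list(x)
    modes = []
    for _ in range(max_modes):
        maxima, minima = local_extrema(residue)
        if len(maxima) < 2 or len(minima) < 2:
            break  # too few oscillations left to extract another mode
        mode = sift(residue)
        modes.append(mode)
        residue = [r - m for r, m in zip(residue, mode)]
    return modes, residue

# Hypothetical test signal: two tones sampled at 1 kHz for 0.5 s.
fs = 1000.0
signal = [math.sin(2 * math.pi * 10 * k / fs)
          + 0.5 * math.sin(2 * math.pi * 80 * k / fs)
          for k in range(500)]
modes, residue = emd(signal)
```

By construction, the extracted modes and the residue sum back to the original signal up to floating-point error, which is the completeness property that makes the decomposition usable for later per-mode analysis.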

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Schlotthauer, G., Torres, M.E., Rufiner, H.L. (2010). Pathological Voice Analysis and Classification Based on Empirical Mode Decomposition. In: Esposito, A., Campbell, N., Vogel, C., Hussain, A., Nijholt, A. (eds) Development of Multimodal Interfaces: Active Listening and Synchrony. Lecture Notes in Computer Science, vol 5967. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12397-9_32

  • DOI: https://doi.org/10.1007/978-3-642-12397-9_32

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-12396-2

  • Online ISBN: 978-3-642-12397-9

  • eBook Packages: Computer Science, Computer Science (R0)
