Pathological Voice Analysis and Classification Based on Empirical Mode Decomposition

Schlotthauer, Gastón; Torres, María E.; Rufiner, Hugo L.

doi:10.1007/978-3-642-12397-9_32


Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5967)

Abstract

Empirical mode decomposition (EMD) is an algorithm for signal analysis recently introduced by Huang. It is a completely data-driven, non-linear method that decomposes a signal into AM-FM components. In this paper, two new EMD-based methods for the analysis and classification of pathological voices are presented. They are applied to speech signals corresponding to real and simulated sustained vowels. We first introduce a method that allows the robust extraction of the fundamental frequency of sustained vowels, a quantity whose determination is crucial for pathological voice analysis and diagnosis. This new method is based on the ensemble empirical mode decomposition (EEMD) algorithm, and its performance is compared with that of state-of-the-art methods. As a second EMD-based tool, we explore spectral properties of the intrinsic mode functions and apply them to the classification of normal and pathological sustained vowels. We show that, using only a basic pattern classification algorithm, the selected spectral features of just three modes are enough to discriminate between normal and pathological voices.
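
To make the idea of a data-driven decomposition into oscillatory modes concrete, the following is a minimal sketch of the basic EMD sifting loop. It is not the authors' implementation and it is not EEMD. It rests on several simplifying assumptions: envelopes are piecewise linear rather than the cubic splines usually used in EMD, the number of sifting passes is fixed instead of being set by a stopping criterion, and the input is a synthetic two-tone signal rather than a recorded vowel. All function names (local_extrema, envelope, sift, emd) are hypothetical and chosen only for illustration.

```python
import math


def local_extrema(x):
    """Return indices of interior local maxima and minima of x."""
    maxima, minima = [], []
    for i in range(1, len(x) - 1):
        if x[i - 1] < x[i] >= x[i + 1]:
            maxima.append(i)
        elif x[i - 1] > x[i] <= x[i + 1]:
            minima.append(i)
    return maxima, minima


def envelope(x, idx):
    """Piecewise-linear envelope through x at idx (assumption: a cubic
    spline is the usual choice). The end values are held flat."""
    n = len(x)
    knots = [0] + idx + [n - 1]
    vals = [x[idx[0]]] + [x[i] for i in idx] + [x[idx[-1]]]
    env = [0.0] * n
    for j in range(len(knots) - 1):
        k0, k1 = knots[j], knots[j + 1]
        v0, v1 = vals[j], vals[j + 1]
        for i in range(k0, k1 + 1):
            env[i] = v0 + (v1 - v0) * (i - k0) / (k1 - k0)
    return env


def sift(x, n_sift=10):
    """Extract one mode by repeatedly subtracting the mean envelope.
    A fixed pass count is an assumption made for brevity."""
    h = list(x)
    for _ in range(n_sift):
        maxima, minima = local_extrema(h)
        if len(maxima) < 2 or len(minima) < 2:
            break
        upper = envelope(h, maxima)
        lower = envelope(h, minima)
        h = [hi - (u + l) / 2.0 for hi, u, l in zip(h, upper, lower)]
    return h


def emd(x, max_modes=6):
    """Split x into a list of modes plus a residue; modes + residue == x."""
    residue = list(x)
    modes = []
    for _ in range(max_modes):
        maxima, minima = local_extrema(residue)
        if len(maxima) < 2 or len(minima) < 2:
            break  # too few oscillations left to build envelopes
        mode = sift(residue)
        modes.append(mode)
        residue = [r - m for r, m in zip(residue, mode)]
    return modes, residue


# Synthetic example: a fast 200 Hz tone plus a slower 20 Hz tone at 1 kHz.
fs = 1000
signal = [math.sin(2 * math.pi * 200 * i / fs)
          + 0.5 * math.sin(2 * math.pi * 20 * i / fs) for i in range(500)]
modes, residue = emd(signal)
rebuilt = [sum(m[i] for m in modes) + residue[i] for i in range(len(signal))]
print("modes extracted:", len(modes))
print("max reconstruction error:",
      max(abs(a - b) for a, b in zip(rebuilt, signal)))
```

By construction, the modes and the residue add back up to the original signal, which is the property that lets each mode be analysed separately. A more faithful version would replace the linear envelopes with cubic splines and, for EEMD, average the modes obtained from many noise-perturbed copies of the signal.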

Author information

Authors and Affiliations

Lab. de Señales y Dinámicas no Lineales, Fac. de Ingeniería, Universidad Nacional de Entre Ríos, Oro Verde, Entre Ríos, Argentina
Gastón Schlotthauer & María E. Torres
Lab. de Cibernética, Fac. de Ingeniería, UNER, Oro Verde, Entre Ríos, Argentina
Hugo L. Rufiner
SINC(i), Fac. de Ing. y Cs. Hs., Univ. Nac. del Litoral, Santa Fe, Argentina
María E. Torres & Hugo L. Rufiner
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Argentina
Gastón Schlotthauer, María E. Torres & Hugo L. Rufiner

Editor information

Editors and Affiliations

Second University of Naples, and IIASS, Via Pellegrino, 84019, Vietri sul Mare, SA, Italy
Anna Esposito
Centre for Language and Communication Studies, Trinity College, The University of Dublin, Dublin 2, Ireland
Nick Campbell & Carl Vogel
Department of Computing Science & Mathematics, University of Stirling, FK9 4LA, Stirling, Scotland, UK
Amir Hussain
Faculty of Electrical Engineering, Mathematics and Computer Science, University of Twente, P.O. Box 217, 7500 AE, Enschede, The Netherlands
Anton Nijholt

About this chapter

Cite this chapter

Schlotthauer, G., Torres, M.E., Rufiner, H.L. (2010). Pathological Voice Analysis and Classification Based on Empirical Mode Decomposition. In: Esposito, A., Campbell, N., Vogel, C., Hussain, A., Nijholt, A. (eds) Development of Multimodal Interfaces: Active Listening and Synchrony. Lecture Notes in Computer Science, vol 5967. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12397-9_32

DOI: https://doi.org/10.1007/978-3-642-12397-9_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-12396-2
Online ISBN: 978-3-642-12397-9
eBook Packages: Computer Science (R0)