Abstract
Empirical mode decomposition (EMD) is a signal analysis algorithm introduced by Huang. It is a fully data-driven, nonlinear method that decomposes a signal into AM-FM components. This paper presents two new EMD-based methods for the analysis and classification of pathological voices, both applied to speech signals from real and simulated sustained vowels. The first method robustly extracts the fundamental frequency of sustained vowels, a quantity that is crucial for pathological voice analysis and diagnosis. It is based on the ensemble empirical mode decomposition (EEMD) algorithm, and its performance is compared with that of state-of-the-art methods. The second tool explores the spectral properties of the intrinsic mode functions and applies them to the classification of normal and pathological sustained vowels. We show that, even with a basic pattern classification algorithm, selected spectral features of only three modes are enough to discriminate between normal and pathological voices.
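To make the decomposition concrete, the following is a minimal sketch of plain EMD sifting, in which a signal is split into intrinsic mode functions (IMFs) plus a residue. It is not the authors' implementation and it does not include the noise-averaging step of EEMD. As a simplifying assumption, the envelopes are built by piecewise-linear interpolation through the extrema instead of the cubic splines used in standard EMD, and the sifting runs for a fixed number of passes. The function names (`local_extrema`, `envelope`, `sift`, `emd`) and the two-tone test signal are hypothetical, chosen only for illustration.

```python
import math


def local_extrema(x):
    """Return indices of interior local maxima and minima of a sequence."""
    maxima, minima = [], []
    for i in range(1, len(x) - 1):
        if x[i - 1] < x[i] >= x[i + 1]:
            maxima.append(i)
        elif x[i - 1] > x[i] <= x[i + 1]:
            minima.append(i)
    return maxima, minima


def envelope(idx, x):
    """Piecewise-linear envelope through x at idx (assumption: standard
    EMD uses cubic splines). End values are held at the nearest extremum."""
    n = len(x)
    knots = [(0, x[idx[0]])] + [(i, x[i]) for i in idx] + [(n - 1, x[idx[-1]])]
    env = [0.0] * n
    for (i0, y0), (i1, y1) in zip(knots, knots[1:]):
        for i in range(i0, i1 + 1):
            env[i] = y0 + (y1 - y0) * (i - i0) / (i1 - i0)
    return env


def sift(x, n_sift=10):
    """Extract one mode by repeatedly subtracting the mean envelope."""
    h = list(x)
    for _ in range(n_sift):
        maxima, minima = local_extrema(h)
        if len(maxima) < 2 or len(minima) < 2:
            break
        upper = envelope(maxima, h)
        lower = envelope(minima, h)
        h = [v - 0.5 * (u + l) for v, u, l in zip(h, upper, lower)]
    return h


def emd(x, max_modes=6):
    """Decompose x into a list of modes and a final residue."""
    residue = list(x)
    imfs = []
    for _ in range(max_modes):
        maxima, minima = local_extrema(residue)
        if len(maxima) < 2 or len(minima) < 2:
            break  # too few oscillations left to extract another mode
        imf = sift(residue)
        imfs.append(imf)
        residue = [r - m for r, m in zip(residue, imf)]
    return imfs, residue


# Hypothetical two-tone test signal: a fast and a slow sinusoid.
n = 1000
signal = [
    math.sin(2 * math.pi * 0.05 * k) + 0.5 * math.sin(2 * math.pi * 0.005 * k)
    for k in range(n)
]
imfs, residue = emd(signal)

# The modes and the residue add back up to the original signal.
recon_error = max(
    abs(signal[k] - (sum(m[k] for m in imfs) + residue[k])) for k in range(n)
)
print(len(imfs), recon_error)
```

Because each mode is subtracted from the running residue, the modes and the final residue always sum back to the input, whatever the quality of the envelopes. An EEMD variant would repeat this decomposition over many copies of the signal with added white noise and average the resulting modes.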
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Schlotthauer, G., Torres, M.E., Rufiner, H.L. (2010). Pathological Voice Analysis and Classification Based on Empirical Mode Decomposition. In: Esposito, A., Campbell, N., Vogel, C., Hussain, A., Nijholt, A. (eds) Development of Multimodal Interfaces: Active Listening and Synchrony. Lecture Notes in Computer Science, vol 5967. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12397-9_32
DOI: https://doi.org/10.1007/978-3-642-12397-9_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-12396-2
Online ISBN: 978-3-642-12397-9
eBook Packages: Computer Science (R0)