Abstract
Pathological voices have features that distinguish them from normophonic voices. In particular, the instability of phonation associated with some voice disorders strongly affects both the spectral envelope of the speech signal and the feasibility of reliable pitch detection. These two issues (the characteristics of the spectral envelope and pitch detection), together with the corresponding assumptions, play a key role in many current inverse filtering algorithms. Consequently, the inverse filtering of disordered or special voices is not yet a solved problem. Nevertheless, the assessment of glottal function is expected to be useful in voice function evaluation. This paper approaches the problem of inverse filtering by homomorphic prediction. Although this approach has received little attention in recent literature, it offers two potential advantages: it does not require prior pitch detection, and it does not rely on any assumptions about the spectral envelope of the glottal signal. Its performance is assessed here and compared with that of an adaptive inverse filtering method, using synthetic voices produced with a biomechanical voice production model. Results indicate that the performance of inverse filtering based on homomorphic prediction is within the range of that of adaptive inverse filtering and that, in addition, it behaves better when the spectral envelope of the glottal signal does not fit an all-pole model of predefined order.
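The general idea of homomorphic prediction can be illustrated with a minimal sketch. It is not the algorithm evaluated in the paper, whose details are not given in the abstract. It only shows the textbook sequence: compute the real cepstrum, keep the low-quefrency part as a smoothed spectral envelope, fit an all-pole model to that envelope, and apply the inverse filter. The function names, the model order, the lifter cutoff, and the synthetic test frame are all assumptions made for illustration, and lip-radiation compensation is omitted.

```python
import numpy as np

def real_cepstrum(frame, n_fft):
    """Real cepstrum: inverse FFT of the log magnitude spectrum."""
    log_mag = np.log(np.abs(np.fft.fft(frame, n_fft)) + 1e-12)
    return np.fft.ifft(log_mag).real

def lifter_low_quefrency(cepstrum, cutoff):
    """Keep quefrencies below `cutoff` (and their mirror image)."""
    n = len(cepstrum)
    window = np.zeros(n)
    window[:cutoff] = 1.0
    window[n - cutoff + 1:] = 1.0
    return cepstrum * window

def levinson(r, order):
    """Levinson-Durbin recursion: autocorrelation -> all-pole coefficients."""
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err
        prev = a.copy()
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a

def homomorphic_all_pole(frame, order=10, cutoff=30, n_fft=1024):
    """All-pole fit to the cepstrally smoothed envelope (illustrative only)."""
    c_low = lifter_low_quefrency(real_cepstrum(frame, n_fft), cutoff)
    smooth_log_mag = np.fft.fft(c_low).real      # smoothed log magnitude
    power = np.exp(2.0 * smooth_log_mag)         # smoothed power spectrum
    r = np.fft.ifft(power).real                  # its autocorrelation
    return levinson(r, order)

def inverse_filter(signal, a):
    """Apply the FIR inverse filter A(z) to the signal."""
    return np.convolve(signal, a)[:len(signal)]

# Hypothetical test frame: impulse train through a two-pole resonator.
fs, n = 8000, 512
excitation = np.zeros(n)
excitation[::80] = 1.0                           # 100 Hz pulse train
theta = 2.0 * np.pi * 700.0 / fs                 # resonance at 700 Hz
b1, b2 = -2.0 * 0.95 * np.cos(theta), 0.95 ** 2
speech = np.zeros(n)
for t in range(n):
    y1 = speech[t - 1] if t >= 1 else 0.0
    y2 = speech[t - 2] if t >= 2 else 0.0
    speech[t] = excitation[t] - b1 * y1 - b2 * y2

frame = speech * np.hanning(n)
a = homomorphic_all_pole(frame, order=10, cutoff=30)
residual = inverse_filter(frame, a)
```

Note that no pitch estimate enters the computation; the only related choice is the fixed lifter cutoff, which in this sketch is simply set below the pulse period of the synthetic signal.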
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Fraile, R., Kob, M., Gutiérrez-Arriola, J.M., Sáenz-Lechón, N., Godino-Llorente, J.I., Osma-Ruiz, V. (2011). Glottal Inverse Filtering of Speech Based on Homomorphic Prediction: A Cepstrum-Based Algorithm Not Requiring Prior Detection of Either Pitch or Glottal Closure. In: Fred, A., Filipe, J., Gamboa, H. (eds) Biomedical Engineering Systems and Technologies. BIOSTEC 2010. Communications in Computer and Information Science, vol 127. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-18472-7_19
DOI: https://doi.org/10.1007/978-3-642-18472-7_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-18471-0
Online ISBN: 978-3-642-18472-7
eBook Packages: Computer Science, Computer Science (R0)