Glottal Inverse Filtering of Speech Based on Homomorphic Prediction: A Cepstrum-Based Algorithm Not Requiring Prior Detection of Either Pitch or Glottal Closure

Fraile, Rubén; Kob, Malte; Gutiérrez-Arriola, Juana M.; Sáenz-Lechón, Nicolás; Godino-Llorente, J. Ignacio; Osma-Ruiz, Víctor

doi:10.1007/978-3-642-18472-7_19

Conference paper

Part of the book series: Communications in Computer and Information Science (CCIS, volume 127)

Abstract

Pathological voices have features that distinguish them from normophonic voices. In particular, the instability of phonation associated with some voice disorders strongly affects the spectral envelope of the speech signal and also the feasibility of reliable pitch detection. These two issues (characteristics of the spectral envelope and pitch detection), and the corresponding assumptions, play a key role in many current inverse filtering algorithms. Thus, the inverse filtering of disordered or special voices is not yet a solved problem. Nevertheless, the assessment of glottal function is expected to be useful in voice function evaluation. This paper approaches the problem of inverse filtering by homomorphic prediction. Although not much favoured by researchers in the recent literature, this approach offers two potential advantages: it does not require prior pitch detection, and it does not rely on any assumptions about the spectral envelope of the glottal signal. Its performance is assessed here on synthetic voices produced with a biomechanical voice production model and compared to that of an adaptive inverse filtering method. Results indicate that the performance of inverse filtering based on homomorphic prediction is within the range of that of adaptive inverse filtering and that, at the same time, it behaves better when the spectral envelope of the glottal signal does not suit an all-pole model of predefined order.
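The abstract names homomorphic prediction without describing its steps. As an illustration only, the following is a minimal sketch of the classical textbook formulation: low-quefrency liftering of the real cepstrum, reconstruction of a minimum-phase response, and an all-pole fit to that response. It is not the authors' algorithm, whose details are not given here. All function names, the model order, the lifter length `n_lifter`, and the FFT size are assumptions chosen for the example.

```python
import numpy as np

def real_cepstrum(frame, n_fft):
    """Real cepstrum: inverse FFT of the log-magnitude spectrum."""
    log_mag = np.log(np.abs(np.fft.fft(frame, n_fft)) + 1e-12)
    return np.fft.ifft(log_mag).real

def minimum_phase_response(cep, n_lifter):
    """Keep quefrencies below n_lifter (n_lifter <= n_fft // 2) and
    rebuild the corresponding minimum-phase impulse response."""
    folded = np.zeros(len(cep))
    folded[0] = cep[0]
    folded[1:n_lifter] = 2.0 * cep[1:n_lifter]  # fold the even cepstrum
    return np.fft.ifft(np.exp(np.fft.fft(folded))).real

def lpc_autocorrelation(signal, order):
    """All-pole fit by the autocorrelation method; returns [1, a1, ..., ap]."""
    n = len(signal)
    r = np.array([np.dot(signal[:n - k], signal[k:]) for k in range(order + 1)])
    lags = np.abs(np.subtract.outer(np.arange(order), np.arange(order)))
    a = np.linalg.solve(r[lags], -r[1:])  # Toeplitz normal equations
    return np.concatenate(([1.0], a))

def homomorphic_prediction(frame, order=10, n_lifter=30, n_fft=1024):
    """Estimate A(z) from the liftered, minimum-phase version of the frame.
    No pitch or glottal closure detection is used (assumed parameter values)."""
    cep = real_cepstrum(frame, n_fft)
    h_min = minimum_phase_response(cep, n_lifter)
    return lpc_autocorrelation(h_min, order)

def inverse_filter(frame, a):
    """Apply the FIR inverse filter A(z) to the frame."""
    return np.convolve(frame, a)[:len(frame)]
```

A caller would typically window a short speech frame, pass it to `homomorphic_prediction`, and then apply `inverse_filter` with the returned coefficients. Whether the residual is a usable glottal estimate depends on choices this sketch leaves open, such as the lifter length and how lip radiation is handled.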

Author information

Authors and Affiliations

Department of Circuits & Systems Engineering, Universidad Politécnica de Madrid, Carretera de Valencia Km 7, 28031, Madrid, Spain
Rubén Fraile, Juana M. Gutiérrez-Arriola, Nicolás Sáenz-Lechón, J. Ignacio Godino-Llorente & Víctor Osma-Ruiz
Erich Thienhaus Institute, Hochschule für Musik Detmold, Neustadt 22, D32756, Detmold, Germany
Malte Kob

Editor information

Editors and Affiliations

IST - Technical University of Lisbon, Av.Rovisco Pais, 1, 1049-001, Lisbon, Portugal
Ana Fred
Department of Systems and Informatics, Polytechnic Institute of Setúbal – INSTICC, Rua do Vale de Chaves - Estefanilha, 2910-761, Setúbal, Portugal
Joaquim Filipe
Institute of Telecommunications, Av. Rovisco Pais, 1, 1049-001, Lisboa, Portugal
Hugo Gamboa

Cite this paper

Fraile, R., Kob, M., Gutiérrez-Arriola, J.M., Sáenz-Lechón, N., Godino-Llorente, J.I., Osma-Ruiz, V. (2011). Glottal Inverse Filtering of Speech Based on Homomorphic Prediction: A Cepstrum-Based Algorithm Not Requiring Prior Detection of Either Pitch or Glottal Closure. In: Fred, A., Filipe, J., Gamboa, H. (eds) Biomedical Engineering Systems and Technologies. BIOSTEC 2010. Communications in Computer and Information Science, vol 127. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-18472-7_19

DOI: https://doi.org/10.1007/978-3-642-18472-7_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-18471-0
Online ISBN: 978-3-642-18472-7
eBook Packages: Computer Science, Computer Science (R0)
