Glottal Inverse Filtering of Speech Based on Homomorphic Prediction: A Cepstrum-Based Algorithm Not Requiring Prior Detection of Either Pitch or Glottal Closure

  • Conference paper

Part of the book series: Communications in Computer and Information Science (CCIS, volume 127)

Abstract

Pathological voices have features that distinguish them from normophonic voices. In particular, the instability of phonation associated with some voice disorders strongly affects both the spectral envelope of the speech signal and the feasibility of reliable pitch detection. These two issues (the characteristics of the spectral envelope and pitch detection), together with the assumptions made about them, play a key role in many current inverse filtering algorithms. Thus, the inverse filtering of disordered or special voices is not yet a solved problem. Nevertheless, the assessment of glottal function is expected to be useful in voice function evaluation. This paper approaches the problem of inverse filtering by homomorphic prediction. Although not much favoured in the recent literature, this approach offers two potential advantages: it does not require prior pitch detection, and it does not rely on any assumptions about the spectral envelope of the glottal signal. Its performance is assessed here and compared with that of an adaptive inverse filtering method, using synthetic voices produced with a biomechanical voice production model. Results indicate that the performance of inverse filtering based on homomorphic prediction is within the range of that of adaptive inverse filtering and that, at the same time, it behaves better when the spectral envelope of the glottal signal does not fit an all-pole model of predefined order.
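The abstract does not spell out the algorithm, so the following is only a minimal, generic sketch of the homomorphic prediction idea (cepstral smoothing followed by linear prediction), not the authors' implementation. It assumes a low-quefrency lifter whose cutoff lies below the pitch period, an all-pole fit by the Levinson-Durbin recursion, and a synthetic two-pole test signal. All function names and parameter values (`homomorphic_prediction`, `cutoff`, `synth_vowel`, and so on) are hypothetical and chosen for illustration; no pitch or glottal closure detection is involved.

```python
import numpy as np


def real_cepstrum(frame, n_fft):
    """Real cepstrum: inverse FFT of the log-magnitude spectrum."""
    log_mag = np.log(np.abs(np.fft.fft(frame, n_fft)) + 1e-12)
    return np.fft.ifft(log_mag).real


def cepstral_envelope(cepstrum, cutoff):
    """Log-magnitude envelope kept by a symmetric low-quefrency lifter."""
    n = len(cepstrum)
    lifter = np.zeros(n)
    lifter[:cutoff] = 1.0
    lifter[n - cutoff + 1:] = 1.0  # mirror of quefrencies 1..cutoff-1
    return np.fft.fft(cepstrum * lifter).real


def levinson_durbin(r, order):
    """All-pole coefficients a[0..order] (a[0] = 1) from autocorrelation r."""
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err  # reflection coefficient
        prev = a.copy()
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a


def homomorphic_prediction(frame, order, cutoff, n_fft=1024):
    """Fit an all-pole model to the cepstrally smoothed spectrum of a frame."""
    windowed = frame * np.hamming(len(frame))
    log_env = cepstral_envelope(real_cepstrum(windowed, n_fft), cutoff)
    # Autocorrelation is the inverse FFT of the smoothed power spectrum.
    autocorr = np.fft.ifft(np.exp(2.0 * log_env)).real
    return levinson_durbin(autocorr, order)


def inverse_filter(signal, a):
    """Apply the FIR inverse filter A(z) to the signal."""
    return np.convolve(signal, a)[:len(signal)]


def synth_vowel(n, period, freq, radius, fs):
    """Hypothetical test signal: impulse train through a two-pole resonator."""
    theta = 2.0 * np.pi * freq / fs
    a1, a2 = -2.0 * radius * np.cos(theta), radius ** 2
    x = np.zeros(n)
    x[::period] = 1.0
    y = np.zeros(n)
    for i in range(n):
        y1 = y[i - 1] if i >= 1 else 0.0
        y2 = y[i - 2] if i >= 2 else 0.0
        y[i] = x[i] - a1 * y1 - a2 * y2
    return y


if __name__ == "__main__":
    fs = 8000
    speech = synth_vowel(512, 80, 1000.0, 0.95, fs)
    a = homomorphic_prediction(speech, order=2, cutoff=30)
    residual = inverse_filter(speech, a)
    print("estimated A(z) coefficients:", a)
```

In this sketch the lifter cutoff (30 samples) is simply set below the pitch period of the synthetic signal (80 samples); a real glottal inverse filter would also have to account for lip radiation and for the glottal contribution to the smoothed envelope, which this example does not attempt.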



Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Fraile, R., Kob, M., Gutiérrez-Arriola, J.M., Sáenz-Lechón, N., Godino-Llorente, J.I., Osma-Ruiz, V. (2011). Glottal Inverse Filtering of Speech Based on Homomorphic Prediction: A Cepstrum-Based Algorithm Not Requiring Prior Detection of Either Pitch or Glottal Closure. In: Fred, A., Filipe, J., Gamboa, H. (eds) Biomedical Engineering Systems and Technologies. BIOSTEC 2010. Communications in Computer and Information Science, vol 127. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-18472-7_19

  • DOI: https://doi.org/10.1007/978-3-642-18472-7_19

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-18471-0

  • Online ISBN: 978-3-642-18472-7

  • eBook Packages: Computer Science, Computer Science (R0)
