MMSE Feature Reconstruction Based on an Occlusion Model for Robust ASR

  • Conference paper
Advances in Speech and Language Technologies for Iberian Languages

Abstract

This paper proposes a novel compensation technique developed in the log-spectral domain. The proposal consists of a minimum mean square error (MMSE) estimator derived from an occlusion model [1]. Under this model, the effect of noise on speech is simplified to binary masking: the noise is completely masked by the speech when the speech power dominates, and the speech is masked by the noise when the noise dominates. As in many MMSE-based techniques, a statistical model of clean speech is required; a Gaussian mixture model is employed here. The resulting technique closely resembles missing-data imputation techniques but, unlike them, employs an explicit noise model. The experimental results show that the proposed MMSE estimator outperforms missing-data imputation with both binary and soft masks.
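
The occlusion model, and the general shape of an MMSE estimator built on it, can be illustrated with a short Python sketch. This is not the authors' implementation. It assumes a diagonal-covariance GMM for clean speech, a single Gaussian per channel for the noise, and statistically independent channels. All function and parameter names are hypothetical.

```python
import math

def norm_pdf(v, mu, sd):
    """Density of a univariate Gaussian N(mu, sd^2) at v."""
    z = (v - mu) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def norm_cdf(v, mu, sd):
    """CDF of N(mu, sd^2) at v; erfc keeps precision in the lower tail."""
    return 0.5 * math.erfc(-(v - mu) / (sd * math.sqrt(2.0)))

def occlude(x, n):
    """Occlusion model: each log-spectral channel keeps the dominant source."""
    return [max(xi, ni) for xi, ni in zip(x, n)]

def mmse_reconstruct(y, weights, means, stds, noise_mean, noise_std):
    """Sketch of an MMSE estimate of clean speech x given y = max(x, n).

    weights[k], means[k][i], stds[k][i]: assumed diagonal GMM for speech.
    noise_mean[i], noise_std[i]: assumed per-channel Gaussian noise model.
    """
    tiny = 1e-300  # guards against division by zero and log(0)
    log_liks, estimates = [], []
    for pi_k, mu_k, sd_k in zip(weights, means, stds):
        log_lik, est_k = math.log(pi_k), []
        for yi, mu, sd, mn, sn in zip(y, mu_k, sd_k, noise_mean, noise_std):
            speech_pdf = norm_pdf(yi, mu, sd)
            speech_cdf = norm_cdf(yi, mu, sd)
            # Speech dominates: x = y and n <= y.
            speech_wins = speech_pdf * norm_cdf(yi, mn, sn)
            # Noise dominates: n = y and x <= y.
            noise_wins = norm_pdf(yi, mn, sn) * speech_cdf
            total = speech_wins + noise_wins + tiny
            w = speech_wins / total  # soft "speech is reliable" weight
            # Mean of the speech Gaussian truncated to x <= y.
            masked_mean = mu - sd * sd * speech_pdf / max(speech_cdf, tiny)
            masked_mean = min(masked_mean, yi)
            est_k.append(w * yi + (1.0 - w) * masked_mean)
            log_lik += math.log(total)
        log_liks.append(log_lik)
        estimates.append(est_k)
    # Posterior over mixture components, normalised in the log domain.
    top = max(log_liks)
    post = [math.exp(ll - top) for ll in log_liks]
    norm = sum(post)
    return [sum(p / norm * est[i] for p, est in zip(post, estimates))
            for i in range(len(y))]
```

In this sketch the weight `w` plays the role that a soft mask plays in missing-data imputation, except that it is computed from the explicit noise model rather than supplied externally. When the noise model places the noise far below the observation, the estimate stays at the observed value; when the noise clearly dominates, the estimate falls back on the speech prior, bounded above by the observation.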

References

  1. Varga, A.P., Moore, R.K.: Hidden Markov model decomposition of speech and noise. In: Proc. ICASSP, pp. 845–848 (April 1990)

  2. Huang, X., Acero, A., Hon, H.: Spoken language processing: A guide to theory, algorithm, and system development. Prentice Hall (2001)

  3. Deng, L., Droppo, J., Acero, A.: Estimating cepstrum of speech under the presence of noise using a joint prior of static and dynamic features. IEEE Trans. Speech Audio Process. 12(3), 218–233 (2004)

  4. Reddy, A.M., Raj, B.: Soft Mask Methods for Single-Channel Speaker Separation. IEEE Trans. Audio Speech and Language Process. 15(6), 1766–1776 (2007)

  5. Cooke, M., Green, P., Josifovski, L., Vizinho, A.: Robust automatic speech recognition with missing and unreliable data. Speech Comm. 34(3), 267–285 (2001)

  6. Raj, B., Seltzer, M.L., Stern, R.M.: Reconstruction of missing features for robust speech recognition. Speech Comm. 48(4), 275–296 (2004)

  7. González, J.A., Peinado, A.M., Gómez, A.M., Ma, N., Barker, J.: Combining missing-data reconstruction and uncertainty decoding for robust speech recognition. In: Proc. ICASSP, pp. 4693–4696 (March 2012)

  8. Raj, B., Singh, R.: Reconstructing spectral vectors with uncertain spectrographic masks for robust speech recognition. In: Proc. ASRU, pp. 65–70 (2005)

  9. Faubel, F., Raja, H., McDonough, J., Klakow, D.: Particle filter based soft-mask estimation for missing-feature reconstruction. In: Proc. IWAENC (2008)

  10. Hirsch, H.G., Pearce, D.: The Aurora experimental framework for the performance evaluations of the speech recognition systems under noisy conditions. In: ISCA ITRW ASR 2000, Paris, France (2000)

  11. Hirsch, H.G.: Experimental framework for the performance evaluation of speech recognition front-ends of large vocabulary task. Tech. Rep., STQ AURORA DSR Working Group (2002)

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

González, J.A., Peinado, A.M., Gómez, Á.M. (2012). MMSE Feature Reconstruction Based on an Occlusion Model for Robust ASR. In: Torre Toledano, D., et al. Advances in Speech and Language Technologies for Iberian Languages. Communications in Computer and Information Science, vol 328. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35292-8_23

  • DOI: https://doi.org/10.1007/978-3-642-35292-8_23

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-35291-1

  • Online ISBN: 978-3-642-35292-8

  • eBook Packages: Computer Science, Computer Science (R0)
