Environment Compensation Based on Maximum a Posteriori Estimation for Improved Speech Recognition

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3789)

Abstract

In this paper, we describe an environment compensation approach based on maximum a posteriori (MAP) estimation, assuming that the noise can be modeled as a single Gaussian distribution. The approach uses prior information about the noise to deal with environmental variability. The acoustically distorted environment model in the cepstral domain is approximated by a truncated first-order vector Taylor series (VTS) expansion, and the clean speech model is trained with a Self-Organizing Map (SOM) neural network under the assumption that the speech can be well represented by a multivariate Gaussian mixture model (GMM) with diagonal covariances. With this environment model approximation and effective clustering for the clean model, the noise estimate is refined using a batch EM algorithm under the MAP criterion. Experiments on large-vocabulary, speaker-independent continuous speech recognition show that this approach achieves a considerable improvement in recognition performance.
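
The two ingredients named in the abstract, a first-order VTS linearization and a MAP estimate of a Gaussian noise mean, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the common additive-noise mismatch function y = x + log(1 + exp(n - x)) applied per dimension in the log-spectral domain (the paper works in the cepstral domain, which would add a DCT), and a conjugate Gaussian prior on the noise mean with a pseudo-count weight. All function and parameter names (`distorted`, `distorted_vts1`, `map_mean`, `tau`) are hypothetical.

```python
import math


def distorted(x, n):
    """Assumed additive-noise model (log-spectral, one dimension):
    y = x + log(1 + exp(n - x)), with x clean speech and n noise."""
    return x + math.log1p(math.exp(n - x))


def distorted_vts1(x, n, x0, n0):
    """First-order Taylor expansion of distorted() around (x0, n0)."""
    # Partial derivatives at the expansion point:
    # dy/dx = 1 / (1 + exp(n0 - x0)) and dy/dn = 1 - dy/dx.
    g = 1.0 / (1.0 + math.exp(n0 - x0))
    return distorted(x0, n0) + g * (x - x0) + (1.0 - g) * (n - n0)


def map_mean(samples, prior_mean, tau):
    """MAP estimate of a Gaussian mean under a conjugate Gaussian prior.

    tau is an assumed pseudo-count: the prior mean is weighted as if it
    had been observed tau times. tau = 0 gives the sample mean.
    """
    return (tau * prior_mean + sum(samples)) / (tau + len(samples))
```

With few noise frames the MAP estimate stays close to the prior mean, and it approaches the sample mean as more frames are observed. In an EM setting the plain sum would be replaced by posterior-weighted statistics.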

This research was sponsored by the NSFC (National Natural Science Foundation of China) under Grant No. 60475007, the Foundation of China Education Ministry for Century Spanning Talent, and the BUPT Education Foundation.

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Shen, H., Guo, J., Liu, G., Huang, P., Li, Q. (2005). Environment Compensation Based on Maximum a Posteriori Estimation for Improved Speech Recognition. In: Gelbukh, A., de Albornoz, Á., Terashima-Marín, H. (eds) MICAI 2005: Advances in Artificial Intelligence. MICAI 2005. Lecture Notes in Computer Science, vol 3789. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11579427_87

  • DOI: https://doi.org/10.1007/11579427_87

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-29896-0

  • Online ISBN: 978-3-540-31653-4

  • eBook Packages: Computer Science, Computer Science (R0)
