A robust front-end for telephone speech recognition

  • Image Retrieval and Speech Recognition
  • Conference paper
PRICAI’98: Topics in Artificial Intelligence (PRICAI 1998)

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 1531))

Abstract

In this study, we propose an effective front-end technique for improving the performance of telephone speech recognition. Much prior work has concentrated on compensating, at the front-end stage of speech recognition, for the noise and channel distortions contained in telephone speech. Starting from RASTA processing, which is well known for its channel-robust feature parameters, we try to improve the method further by using the channel estimation power of cepstral mean subtraction and the maximum likelihood method. As a hybrid of channel estimation and RASTA processing, the proposed method proved effective in experiments performed on real telephone speech data.
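
As a rough illustration of the two building blocks named in the abstract, the sketch below shows plain cepstral mean subtraction and a RASTA-like band-pass filter applied to a feature trajectory. It is not the hybrid method proposed in the paper, whose details are not given here. The function names, the filter coefficients (a commonly cited form of the RASTA filter), and the pole value are assumptions chosen for illustration.

```python
def cepstral_mean_subtraction(frames):
    """Subtract the per-coefficient mean over all frames of one utterance.

    `frames` is a list of cepstral vectors (one list of floats per frame).
    A fixed channel adds a constant offset to every cepstral vector, so
    removing the utterance mean removes that offset.
    """
    n_frames = len(frames)
    n_coeffs = len(frames[0])
    means = [sum(f[k] for f in frames) / n_frames for k in range(n_coeffs)]
    return [[f[k] - means[k] for k in range(n_coeffs)] for f in frames]


def rasta_like_filter(trajectory, pole=0.94):
    """Causal band-pass filter over the time trajectory of one coefficient.

    Assumed difference equation (illustrative, not taken from the paper):
        y[t] = 0.2*x[t] + 0.1*x[t-1] - 0.1*x[t-3] - 0.2*x[t-4] + pole*y[t-1]
    The numerator taps sum to zero, so a constant offset in the input
    decays toward zero in the output.
    """
    taps = (0.2, 0.1, 0.0, -0.1, -0.2)
    out = []
    prev = 0.0
    for t in range(len(trajectory)):
        acc = 0.0
        for k, b in enumerate(taps):
            if t - k >= 0:  # samples before the start are treated as zero
                acc += b * trajectory[t - k]
        prev = acc + pole * prev
        out.append(prev)
    return out
```

Both operations target slowly varying or constant components of the feature trajectories, which is where a fixed telephone channel shows up in the log-spectral and cepstral domains. Mean subtraction needs the whole utterance before it can be applied, while the recursive filter runs frame by frame at the cost of an initial transient.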

Editor information

Hing-Yan Lee Hiroshi Motoda

Copyright information

© 1998 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Cho, HY., Chi, SM., Oh, YH. (1998). A robust front-end for telephone speech recognition. In: Lee, HY., Motoda, H. (eds) PRICAI’98: Topics in Artificial Intelligence. PRICAI 1998. Lecture Notes in Computer Science, vol 1531. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0095307

  • DOI: https://doi.org/10.1007/BFb0095307

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-65271-7

  • Online ISBN: 978-3-540-49461-4

  • eBook Packages: Springer Book Archive
