A robust front-end for telephone speech recognition

Cho, Hoon-Young; Chi, Sang-Mun; Oh, Yung-Hwan

doi:10.1007/BFb0095307

Hoon-Young Cho¹, Sang-Mun Chi¹ & Yung-Hwan Oh¹

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1531)

Included in the following conference series:

Pacific Rim International Conference on Artificial Intelligence

Abstract

In this study, we propose an effective front-end technique for improving the performance of telephone speech recognition. Much previous work has concentrated on compensating, at the front-end stage of speech recognition, for the noise and channel distortions contained in telephone speech. Starting from RASTA processing, which is well known for its channel-robust feature parameters, we sought to improve this method further by using the channel estimation capability of cepstral mean subtraction and the maximum likelihood method. As a hybrid of channel estimation and RASTA processing, the proposed method proved effective in experiments on real telephone speech data.
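
The abstract names two standard building blocks, cepstral mean subtraction (CMS) and RASTA filtering. The sketch below illustrates only those two blocks in their textbook form; it is not the authors' hybrid method, and it does not attempt the maximum likelihood channel estimation, whose details are not given on this page. The function names are hypothetical, and the filter coefficients (numerator 0.1 × [2, 1, 0, -1, -2], pole 0.98) are the commonly cited RASTA values, assumed here rather than taken from this paper.

```python
from typing import List

# frames[t][k]: k-th log-domain (e.g. cepstral) coefficient at frame t
Frames = List[List[float]]


def cepstral_mean_subtraction(frames: Frames) -> Frames:
    """Subtract the per-coefficient utterance mean from every frame."""
    n = len(frames)
    dims = len(frames[0])
    means = [sum(f[k] for f in frames) / n for k in range(dims)]
    return [[f[k] - means[k] for k in range(dims)] for f in frames]


def rasta_filter(frames: Frames, pole: float = 0.98) -> Frames:
    """Band-pass filter each coefficient trajectory along the time axis.

    Causal form of the commonly cited RASTA filter (assumed values):
        y[t] = pole * y[t-1] + 0.1 * (2x[t] + x[t-1] - x[t-3] - 2x[t-4])
    Samples before the start of the utterance are treated as zero.
    """
    numer = [0.2, 0.1, 0.0, -0.1, -0.2]  # sums to zero: no DC gain
    dims = len(frames[0])
    out: Frames = []
    for t in range(len(frames)):
        row = []
        for k in range(dims):
            acc = sum(b * frames[t - i][k]
                      for i, b in enumerate(numer) if t - i >= 0)
            if t > 0:
                acc += pole * out[t - 1][k]
            row.append(acc)
        out.append(row)
    return out


if __name__ == "__main__":
    # Toy features; a fixed channel is modeled as a constant additive offset.
    clean = [[0.5 * t, -1.0 + 0.1 * t] for t in range(6)]
    distorted = [[v + 3.0 for v in f] for f in clean]
    print(cepstral_mean_subtraction(clean)[0])
    print(cepstral_mean_subtraction(distorted)[0])
```

A fixed linear channel is usually modeled as a constant offset in the log-spectral or cepstral domain. CMS removes such an offset over the whole utterance, so the two printed frames agree up to rounding. The RASTA filter suppresses the same offset frame by frame, but only after a start-up transient that decays at the rate set by the pole.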

Author information

Authors and Affiliations

Department of Computer Science, Korea Advanced Institute of Science and Technology, Korea
Hoon-Young Cho, Sang-Mun Chi & Yung-Hwan Oh

Editor information

Hing-Yan Lee, Hiroshi Motoda

About this paper

Cite this paper

Cho, HY., Chi, SM., Oh, YH. (1998). A robust front-end for telephone speech recognition. In: Lee, HY., Motoda, H. (eds) PRICAI’98: Topics in Artificial Intelligence. PRICAI 1998. Lecture Notes in Computer Science, vol 1531. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0095307

DOI: https://doi.org/10.1007/BFb0095307
Published: 20 October 2006
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-65271-7
Online ISBN: 978-3-540-49461-4
eBook Packages: Springer Book Archive
