Abstract
Discriminatively trained HMMs are investigated in both clean and noisy environments in this study. First, recognition error is defined at different levels, including the string, word, phone, and acoustic levels. A high-resolution error measure based on minimum divergence (MD) is specifically proposed and investigated along with other error measures. Using two speaker-independent continuous digit databases, Aurora2 (English) and CNDigits (Mandarin Chinese), the recognition performance of recognizers trained with different error measures and different training modes is evaluated under various noise and SNR conditions. Experimental results show that the discriminatively trained models performed better than the maximum likelihood baseline systems. Specifically, for the MD-trained systems, relative error reductions of 17.62% and 18.52% were obtained with multi-condition training on Aurora2 and CNDigits, respectively.
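The acoustic-level error measure mentioned above, minimum divergence (MD), rests on the Kullback-Leibler divergence between acoustic models. As a minimal illustrative sketch only, not the authors' implementation (the paper itself describes how the divergence is approximated for HMM states with Gaussian mixture densities), the following shows the closed-form KL divergence between two diagonal-covariance Gaussians, the basic building block underlying such a measure; the function name and toy values are assumptions for illustration.

```python
import numpy as np

def kl_divergence_diag_gaussians(mu0, var0, mu1, var1):
    """Closed-form KL(N0 || N1) between two diagonal-covariance Gaussians,
    a common building block when approximating the divergence between
    HMM state output distributions (illustrative sketch only)."""
    mu0, var0 = np.asarray(mu0, dtype=float), np.asarray(var0, dtype=float)
    mu1, var1 = np.asarray(mu1, dtype=float), np.asarray(var1, dtype=float)
    k = mu0.size
    return 0.5 * (np.sum(var0 / var1)                     # trace term
                  + np.sum((mu1 - mu0) ** 2 / var1)       # mean-shift term
                  - k                                     # dimensionality
                  + np.sum(np.log(var1) - np.log(var0)))  # log-determinant ratio

# Toy usage: divergence between two 2-dimensional Gaussian "states".
print(kl_divergence_diag_gaussians([0.0, 0.0], [1.0, 1.0],
                                   [1.0, 0.5], [2.0, 1.5]))
```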
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Du, J., Liu, P., Soong, F.K., Zhou, J.-L., Wang, R.H. (2006). Noisy Speech Recognition Performance of Discriminative HMMs. In: Huo, Q., Ma, B., Chng, E.S., Li, H. (eds) Chinese Spoken Language Processing. ISCSLP 2006. Lecture Notes in Computer Science, vol 4274. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11939993_39
DOI: https://doi.org/10.1007/11939993_39
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-49665-6
Online ISBN: 978-3-540-49666-3