Abstract
This paper presents a novel bilingual model modification approach that improves nonnative speech recognition accuracy in the presence of accented pronunciation variation. Each state of the baseline nonnative acoustic model is modified with several candidate states from an auxiliary acoustic model trained on the speakers’ mother language. The state mapping criterion and the number of n-best candidates are investigated, and auxiliary acoustic models with different numbers of Gaussian mixtures are compared on a grammar-constrained speech recognition system. Compared with a nonnative acoustic model that has already been well trained with MAP adaptation, the approach achieves a further 5.83% relative reduction in Phrase Error Rate, with only a small relative increase in Real Time Factor.
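The state-level modification described above can be illustrated with a minimal sketch. The abstract does not state the mapping criterion or how candidate states are combined, so everything below is an assumption made for illustration: states are diagonal-covariance Gaussian mixtures, candidates are ranked by a symmetric Kullback-Leibler divergence between moment-matched single Gaussians, and the n-best auxiliary states are merged into the baseline state with a hypothetical interpolation weight `aux_weight`. The names `State`, `symmetric_kl`, and `modify_state` are likewise hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class State:
    """One HMM state modeled as a diagonal-covariance Gaussian mixture."""
    weights: List[float]          # mixture weights, summing to 1
    means: List[List[float]]      # one mean vector per component
    variances: List[List[float]]  # one diagonal variance vector per component


def moment_match(state: State):
    """Collapse a mixture into a single diagonal Gaussian (mean, variance)."""
    dim = len(state.means[0])
    mean = [sum(w * m[d] for w, m in zip(state.weights, state.means))
            for d in range(dim)]
    var = [sum(w * (v[d] + m[d] ** 2)
               for w, m, v in zip(state.weights, state.means, state.variances))
           - mean[d] ** 2
           for d in range(dim)]
    return mean, var


def symmetric_kl(a: State, b: State) -> float:
    """Assumed mapping criterion: KL(a||b) + KL(b||a) on moment-matched Gaussians."""
    m1, v1 = moment_match(a)
    m2, v2 = moment_match(b)
    return 0.5 * sum(x / y + y / x - 2 + (p - q) ** 2 * (1 / x + 1 / y)
                     for p, q, x, y in zip(m1, m2, v1, v2))


def modify_state(base: State, aux_states: List[State],
                 n_best: int = 2, aux_weight: float = 0.3) -> State:
    """Append the Gaussians of the n-best auxiliary states to the baseline state.

    Baseline weights are scaled by (1 - aux_weight); the auxiliary share is split
    evenly across the selected candidates, so the weights still sum to 1.
    """
    ranked = sorted(aux_states, key=lambda s: symmetric_kl(base, s))[:n_best]
    if not ranked:
        return base  # nothing to merge
    weights = [(1 - aux_weight) * w for w in base.weights]
    means = list(base.means)
    variances = list(base.variances)
    for cand in ranked:
        weights += [aux_weight / len(ranked) * w for w in cand.weights]
        means += cand.means
        variances += cand.variances
    return State(weights, means, variances)
```

In this sketch the modified state has more Gaussians than the baseline state, which is consistent with the small increase in Real Time Factor reported above: more components must be evaluated per frame. The choice of `n_best` and of the auxiliary mixture size therefore trades accuracy against decoding cost.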
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Zhang, Q., Pan, J., Chan, Sd., Yan, Y. (2009). Nonnative Speech Recognition Based on Bilingual Model Modification at State Level. In: Wang, H., Shen, Y., Huang, T., Zeng, Z. (eds) The Sixth International Symposium on Neural Networks (ISNN 2009). Advances in Intelligent and Soft Computing, vol 56. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01216-7_32
DOI: https://doi.org/10.1007/978-3-642-01216-7_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-01215-0
Online ISBN: 978-3-642-01216-7
eBook Packages: Engineering, Engineering (R0)