Abstract
This work describes the classification of speech from native and non-native speakers, enabling accent-dependent automatic speech recognition. In addition to the acoustic signal, lexical features from transcripts of the speech data can provide significant evidence of a speaker’s accent type. Subsets of the Fisher corpus, spanning diverse accents, were used for these experiments. Relative to human-audited judgments, accent classifiers that exploited acoustic and lexical features achieved up to 84.5% classification accuracy. Compared to a system trained only on native speakers, using this classifier in a recognizer with accent-specific acoustic and language models yielded a 16.5% improvement for non-native speakers and a 7.2% improvement overall.
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Faria, A. (2006). Accent Classification for Speech Recognition. In: Renals, S., Bengio, S. (eds) Machine Learning for Multimodal Interaction. MLMI 2005. Lecture Notes in Computer Science, vol 3869. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11677482_25
DOI: https://doi.org/10.1007/11677482_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-32549-9
Online ISBN: 978-3-540-32550-5
eBook Packages: Computer Science, Computer Science (R0)