Accent Issues in Large Vocabulary Continuous Speech Recognition

Huang, Chao; Chen, Tao; Chang, Eric

doi:10.1023/B:IJST.0000017014.52972.1d

Published: April 2004

International Journal of Speech Technology, Volume 7, pages 141–153 (2004)

Abstract

This paper addresses accent issues in large vocabulary continuous speech recognition. Cross-accent experiments show that accent is a dominant problem in speech recognition. Analysis based on multivariate statistical tools (principal component analysis and independent component analysis) confirms that accent is one of the key factors in speaker variability. We propose two methods for accent adaptation, each suited to a different application scenario. When a certain amount of adaptation data is available, pronunciation dictionary modeling is adopted to reduce recognition errors caused by pronunciation mistakes. When a large corpus has been collected for each accent type, accent-dependent models are trained and a Gaussian mixture model-based accent identification system is developed for model selection. We report experimental results for the two schemes and verify their efficiency in each situation.
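
The second scheme in the abstract (identify the accent with Gaussian mixture models, then select the matching accent-dependent model) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the accent labels, the toy two-dimensional GMM parameters, and the model names are hypothetical placeholders, and the GMMs are assumed to be already trained. Only the scoring and selection step is shown.

```python
import math

def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_log_likelihood(frames, gmm):
    """Average per-frame log-likelihood of the frames under one GMM.

    gmm is a list of (weight, mean, var) components.
    """
    total = 0.0
    for x in frames:
        # Log-sum-exp over components for numerical stability.
        terms = [math.log(w) + log_gaussian_diag(x, m, v) for w, m, v in gmm]
        peak = max(terms)
        total += peak + math.log(sum(math.exp(t - peak) for t in terms))
    return total / len(frames)

def identify_accent(frames, accent_gmms):
    """Return the best-scoring accent label and all per-accent scores."""
    scores = {a: gmm_log_likelihood(frames, g) for a, g in accent_gmms.items()}
    return max(scores, key=scores.get), scores

# Hypothetical, hand-set parameters: one small GMM per accent type.
ACCENT_GMMS = {
    "accent_A": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [1.0, 1.0], [1.0, 1.0])],
    "accent_B": [(0.5, [4.0, 4.0], [1.0, 1.0]), (0.5, [5.0, 5.0], [1.0, 1.0])],
}

# Hypothetical names standing in for accent-dependent acoustic models.
ACOUSTIC_MODELS = {"accent_A": "model_A", "accent_B": "model_B"}

def select_model(frames):
    """Pick the accent-dependent model for an utterance's feature frames."""
    accent, _ = identify_accent(frames, ACCENT_GMMS)
    return ACOUSTIC_MODELS[accent]
```

Calling `select_model` on the feature frames of an utterance returns the name of the model whose accent GMM assigns those frames the highest average log-likelihood. In a real system the GMM parameters would be estimated from accent-labelled speech, for example with the EM algorithm.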

Author information

Authors and Affiliations

Microsoft Research Asia, 5F, Sigma Center, No. 49, Zhichun Road, Beijing, 100080, China
Chao Huang, Tao Chen & Eric Chang

About this article

Cite this article

Huang, C., Chen, T. & Chang, E. Accent Issues in Large Vocabulary Continuous Speech Recognition. International Journal of Speech Technology 7, 141–153 (2004). https://doi.org/10.1023/B:IJST.0000017014.52972.1d

Issue Date: April 2004
DOI: https://doi.org/10.1023/B:IJST.0000017014.52972.1d
