Abstract
This paper presents an approach for estimating the pronunciation similarity between two speakers using cepstral distance. General speech recognition systems find the words that best match a speaker's utterance by combining the acoustic score of the speech signal with the grammatical score of the word sequence. When a speaker with impaired hearing is learning a language, however, it is not easy to estimate pronunciation similarity with automatic speech recognition systems, because the task requires more information about pronunciation characteristics than about word matching. This poses a new challenge for computer-aided pronunciation learning. The dynamic time warping algorithm is used to compute the cepstral distance between two speech samples, with a codebook distance subtracted to account for the characteristics of each speaker. Experiments on the Korean fundamental vowel set show that the similarity of two speakers' pronunciation can be computed efficiently by computer.
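As a rough illustration of the dynamic time warping step mentioned above, the following minimal sketch aligns two sequences of cepstral vectors and returns a length-normalized distance. The Euclidean frame distance, the symmetric step pattern, the normalization by total frame count, and the function names are all assumptions made for illustration; the abstract does not specify them. The codebook distance subtraction is omitted because its definition is not given here.

```python
import math


def frame_distance(a, b):
    """Euclidean distance between two cepstral vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(seq_a, seq_b):
    """Length-normalized DTW distance between two sequences of frames.

    Each sequence is a list of equal-length cepstral vectors. The result
    is the minimal accumulated frame distance over all monotonic
    alignments, divided by the total number of frames (hypothetical
    normalization, chosen only for this sketch).
    """
    n, m = len(seq_a), len(seq_b)
    if n == 0 or m == 0:
        raise ValueError("both sequences must contain at least one frame")

    inf = float("inf")
    # cost[i][j]: minimal accumulated distance aligning the first i frames
    # of seq_a with the first j frames of seq_b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(seq_a[i - 1], seq_b[j - 1])
            # Allowed steps: advance in seq_a, in seq_b, or in both.
            cost[i][j] = d + min(cost[i - 1][j],
                                 cost[i][j - 1],
                                 cost[i - 1][j - 1])

    return cost[n][m] / (n + m)
```

With this sketch, two utterances that differ only in timing (for example, one repeats a frame) yield a distance of zero, which is the property that makes time warping useful for comparing pronunciations spoken at different rates.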
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kim, D., Yook, D. (2006). Pronunciation Similarity Estimation for Spoken Language Learning. In: Matsumoto, Y., Sproat, R.W., Wong, KF., Zhang, M. (eds) Computer Processing of Oriental Languages. Beyond the Orient: The Research Challenges Ahead. ICCPOL 2006. Lecture Notes in Computer Science, vol 4285. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11940098_46
DOI: https://doi.org/10.1007/11940098_46
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-49667-0
Online ISBN: 978-3-540-49668-7
eBook Packages: Computer Science, Computer Science (R0)