Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4285)

Abstract

This paper presents an approach for estimating the pronunciation similarity between two speakers using the cepstral distance. Conventional speech recognition systems find the words that best match a speaker's utterance by combining the acoustic score of the speech signal with the grammatical score of the word sequence. For a speaker with impaired hearing who is learning a language, it is not easy to estimate pronunciation similarity with automatic speech recognition systems, because the task requires information about pronunciation characteristics rather than about word matching. This is a new challenge for computer-aided pronunciation learning. The dynamic time warping algorithm is used to compute the cepstral distance between two speech samples, and a codebook distance is subtracted to account for the characteristics of each speaker. Experiments on the Korean fundamental vowel set show that the similarity of two speakers' pronunciation can be computed efficiently by computer.
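
As a rough illustration of the distance computation described in the abstract, the following is a minimal sketch of dynamic time warping over two sequences of cepstral feature vectors. It is not the authors' implementation. The Euclidean frame distance, the normalization by the combined sequence length, and the names `frame_distance`, `dtw_distance`, and `adjusted_distance` are all assumptions made for illustration. The abstract does not say how the codebook distance is obtained, so the sketch simply accepts it as a number supplied by the caller.

```python
import math


def frame_distance(a, b):
    """Euclidean distance between two cepstral frames (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(seq_a, seq_b):
    """Length-normalized DTW distance between two frame sequences.

    Each sequence is a list of equal-length feature vectors. The division
    by len(seq_a) + len(seq_b) is an assumed normalization, not taken
    from the paper.
    """
    n, m = len(seq_a), len(seq_b)
    if n == 0 or m == 0:
        raise ValueError("both sequences must contain at least one frame")

    inf = float("inf")
    # cost[i][j] = best accumulated distance aligning the first i frames
    # of seq_a with the first j frames of seq_b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # advance in seq_a only
                cost[i][j - 1],      # advance in seq_b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m] / (n + m)


def adjusted_distance(seq_a, seq_b, codebook_distance=0.0):
    """DTW distance with a caller-supplied codebook distance subtracted.

    How the codebook distance is computed is left open here (hypothetical
    parameter); the abstract only states that it is subtracted.
    """
    return dtw_distance(seq_a, seq_b) - codebook_distance
```

For example, `adjusted_distance(frames_a, frames_b, codebook_distance=c)` would return a score in which a speaker-dependent offset `c` has been removed, where `frames_a`, `frames_b`, and `c` are placeholders for data prepared elsewhere.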

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kim, D., Yook, D. (2006). Pronunciation Similarity Estimation for Spoken Language Learning. In: Matsumoto, Y., Sproat, R.W., Wong, KF., Zhang, M. (eds) Computer Processing of Oriental Languages. Beyond the Orient: The Research Challenges Ahead. ICCPOL 2006. Lecture Notes in Computer Science, vol 4285. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11940098_46

  • DOI: https://doi.org/10.1007/11940098_46

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-49667-0

  • Online ISBN: 978-3-540-49668-7

  • eBook Packages: Computer Science, Computer Science (R0)
