A Universal Human Machine Speech Interaction Language for Robust Speech Recognition Applications

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3206)

Abstract

Automatic speech recognition systems are prone to errors when the dictionary contains confusable words. In this paper, we propose a new approach to this problem: a human machine speech interaction language (HUMSIL) built from acoustically orthogonal words. To minimize pronunciation variation across nationalities, we selected a subset of phonemes common to the world's major languages and generated a vocabulary using the algorithm described in this paper. We performed two experiments comparing English, Turkish and HUMSIL on digit recognition, using microphone recordings from multinational speakers. In both experiments, the proposed vocabulary yielded a significantly lower error rate.
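
The abstract does not reproduce the vocabulary-generation algorithm, so the following is only a minimal sketch of the general idea of picking mutually dissimilar words from a small shared phoneme set. It is not the authors' method. The phoneme inventory, the consonant-vowel word shape, the use of Levenshtein distance as a stand-in for acoustic distance, and the names `edit_distance`, `candidate_words` and `select_vocabulary` are all assumptions made for illustration.

```python
from itertools import product

# Hypothetical phoneme subset; the paper's actual inventory is not given here.
CONSONANTS = ["p", "t", "k", "m", "s"]
VOWELS = ["a", "i", "u"]


def edit_distance(a, b):
    """Levenshtein distance between two phoneme sequences.

    Assumption: used only as a crude proxy for acoustic dissimilarity.
    """
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, start=1):
        curr = [i]
        for j, pb in enumerate(b, start=1):
            cost = 0 if pa == pb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def candidate_words():
    """All consonant-vowel-consonant-vowel sequences over the inventory."""
    return [tuple(w) for w in product(CONSONANTS, VOWELS, CONSONANTS, VOWELS)]


def select_vocabulary(candidates, size):
    """Greedy farthest-point selection.

    Starts from the first candidate, then repeatedly adds the candidate
    whose minimum distance to the words chosen so far is largest.
    """
    size = min(size, len(candidates))
    chosen = [candidates[0]]
    while len(chosen) < size:
        best = max(
            (c for c in candidates if c not in chosen),
            key=lambda c: min(edit_distance(c, w) for w in chosen),
        )
        chosen.append(best)
    return chosen


if __name__ == "__main__":
    # Ten words, e.g. one per digit.
    for word in select_vocabulary(candidate_words(), size=10):
        print("".join(word))
```

A real system would presumably replace the edit distance with a measure derived from acoustic models or phoneme confusion data; the greedy loop itself would stay the same.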

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Arısoy, E., Arslan, L.M. (2004). A Universal Human Machine Speech Interaction Language for Robust Speech Recognition Applications. In: Sojka, P., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2004. Lecture Notes in Computer Science, vol 3206. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30120-2_33

  • DOI: https://doi.org/10.1007/978-3-540-30120-2_33

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-23049-6

  • Online ISBN: 978-3-540-30120-2

  • eBook Packages: Springer Book Archive