Real-Time Recognition of Isolated Vowels

  • Conference paper
Perception in Multimodal Dialogue Systems (PIT 2008)

Abstract

In this paper we present a new approach to real-time recognition of isolated vowels. Language learning and speech therapy are examples of application areas that require real-time biofeedback of acoustic features. Because the performance of known approaches usually drops for child speakers, we evaluated several alternative feature extraction and pattern recognition techniques, including principal component analysis (PCA), linear discriminant analysis (LDA), artificial neural networks (ANN) and Bayesian classification. In addition, we studied the explicit inclusion of pitch as a main parameter, both in simulation and in the real-time feature extraction process. On our dataset, the best results were obtained when mel-frequency cepstral coefficients (MFCCs) were mapped by LDA to a 4-dimensional subspace and then classified with a Bayesian classifier. An interactive game was designed that implements the selected real-time vowel recognition technique.
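The final stage described above, Bayesian classification of a 4-dimensional feature vector, can be illustrated with a minimal sketch. This is not the authors' implementation: the diagonal-covariance Gaussian class model, the five vowel labels, the synthetic cluster centers and all function names are assumptions made for illustration, and the MFCC extraction and LDA projection that would produce the real features are not shown.

```python
import math
import random


def fit_gaussian_bayes(features, labels):
    """Estimate a prior, mean and per-dimension variance for each class.

    Assumption: class-conditional densities are Gaussian with diagonal
    covariance; the paper does not state which covariance model it uses.
    """
    by_class = {}
    for x, y in zip(features, labels):
        by_class.setdefault(y, []).append(x)
    total = len(features)
    model = {}
    for y, rows in by_class.items():
        n, dim = len(rows), len(rows[0])
        mean = [sum(r[d] for r in rows) / n for d in range(dim)]
        # Small floor keeps the variance strictly positive.
        var = [sum((r[d] - mean[d]) ** 2 for r in rows) / n + 1e-6
               for d in range(dim)]
        model[y] = (n / total, mean, var)
    return model


def log_posteriors(model, x):
    """Return unnormalized log posteriors: log prior + log likelihood."""
    scores = {}
    for y, (prior, mean, var) in model.items():
        log_lik = sum(
            -0.5 * math.log(2.0 * math.pi * v) - (xi - m) ** 2 / (2.0 * v)
            for xi, m, v in zip(x, mean, var)
        )
        scores[y] = math.log(prior) + log_lik
    return scores


def classify(model, x):
    """Pick the class with the highest posterior (Bayes decision rule)."""
    scores = log_posteriors(model, x)
    return max(scores, key=scores.get)


# Hypothetical, well-separated cluster centers standing in for
# LDA-projected MFCC vectors of five vowels (synthetic data only).
CENTERS = {
    "a": [2.0, 0.0, 0.0, 0.0],
    "e": [0.0, 2.0, 0.0, 0.0],
    "i": [0.0, 0.0, 2.0, 0.0],
    "o": [0.0, 0.0, 0.0, 2.0],
    "u": [-2.0, -2.0, -2.0, -2.0],
}

rng = random.Random(0)  # fixed seed so the sketch is reproducible
train_x, train_y = [], []
for vowel, center in CENTERS.items():
    for _ in range(50):
        train_x.append([c + rng.gauss(0.0, 0.3) for c in center])
        train_y.append(vowel)

model = fit_gaussian_bayes(train_x, train_y)
predicted = classify(model, [1.9, 0.1, -0.1, 0.0])
```

In a real-time setting each incoming frame would be reduced to its 4-dimensional vector and passed to `classify`, which costs only a handful of arithmetic operations per class.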

Editor information

Elisabeth André Laila Dybkjær Wolfgang Minker Heiko Neumann Roberto Pieraccini Michael Weber

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Carvalho, M., Ferreira, A. (2008). Real-Time Recognition of Isolated Vowels. In: André, E., Dybkjær, L., Minker, W., Neumann, H., Pieraccini, R., Weber, M. (eds) Perception in Multimodal Dialogue Systems. PIT 2008. Lecture Notes in Computer Science, vol 5078. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69369-7_18

  • DOI: https://doi.org/10.1007/978-3-540-69369-7_18

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-69368-0

  • Online ISBN: 978-3-540-69369-7

  • eBook Packages: Computer Science, Computer Science (R0)
