Abstract
In this paper we present a new approach to real-time recognition of isolated vowels. Language learning and speech therapy are examples of application areas that require real-time biofeedback of acoustic features. Because the performance of known approaches usually drops for child speakers, we evaluated several alternative feature extraction and pattern recognition techniques, including PCA, LDA, ANN and Bayesian classification. In addition, we studied the explicit inclusion of pitch as a main parameter in both simulation and the real-time feature extraction process. On our dataset, the best results were obtained when MFCCs were mapped by LDA onto a 4-dimensional subspace and then classified with a Bayesian classifier. We also designed an interactive game that implements the selected real-time vowel recognition technique.
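The best-performing pipeline named in the abstract (features projected by LDA onto a 4-dimensional subspace, then Bayesian classification) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes five vowel classes (LDA yields at most one fewer discriminant direction than there are classes, so five classes would match a 4-dimensional subspace), 13 features per frame standing in for MFCCs, synthetic Gaussian data in place of real speech, and Gaussian class-conditional densities for the Bayesian step. The function names `fit_lda`, `fit_gaussian_bayes` and `predict` are hypothetical.

```python
import numpy as np


def fit_lda(X, y, n_components):
    """Return a projection matrix from Fisher's linear discriminant analysis."""
    mean_all = X.mean(axis=0)
    d = X.shape[1]
    Sw = np.zeros((d, d))  # within-class scatter
    Sb = np.zeros((d, d))  # between-class scatter
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = (mc - mean_all)[:, None]
        Sb += len(Xc) * (diff @ diff.T)
    # Directions maximising between-class over within-class scatter.
    evals, evecs = np.linalg.eig(np.linalg.pinv(Sw) @ Sb)
    order = np.argsort(evals.real)[::-1][:n_components]
    return evecs.real[:, order]


def fit_gaussian_bayes(Z, y):
    """Estimate one Gaussian (mean, covariance) and a prior per class."""
    model = {}
    for c in np.unique(y):
        Zc = Z[y == c]
        cov = np.cov(Zc, rowvar=False) + 1e-6 * np.eye(Z.shape[1])
        model[c] = (
            Zc.mean(axis=0),
            np.linalg.inv(cov),
            np.linalg.slogdet(cov)[1],
            np.log(len(Zc) / len(Z)),
        )
    return model


def predict(model, Z):
    """Pick the class with the highest log posterior (up to a constant)."""
    labels = list(model)
    scores = []
    for c in labels:
        mean, inv_cov, logdet, log_prior = model[c]
        diff = Z - mean
        maha = np.einsum("ij,jk,ik->i", diff, inv_cov, diff)
        scores.append(-0.5 * (maha + logdet) + log_prior)
    return np.array(labels)[np.argmax(scores, axis=0)]


# Synthetic stand-in data (assumption): 5 classes, 13 features per frame.
rng = np.random.default_rng(0)
n_classes, n_feats, n_per = 5, 13, 60
centers = rng.normal(scale=4.0, size=(n_classes, n_feats))
X = np.vstack([c + rng.normal(size=(n_per, n_feats)) for c in centers])
y = np.repeat(np.arange(n_classes), n_per)

W = fit_lda(X, y, n_components=4)   # 13-D features -> 4-D subspace
Z = X @ W
model = fit_gaussian_bayes(Z, y)
accuracy = float(np.mean(predict(model, Z) == y))
print(f"training accuracy on synthetic data: {accuracy:.3f}")
```

In a real-time setting the projection matrix and the per-class Gaussians would be estimated offline, so each incoming frame would need only one matrix-vector product and a handful of quadratic forms.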
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Carvalho, M., Ferreira, A. (2008). Real-Time Recognition of Isolated Vowels. In: André, E., Dybkjær, L., Minker, W., Neumann, H., Pieraccini, R., Weber, M. (eds) Perception in Multimodal Dialogue Systems. PIT 2008. Lecture Notes in Computer Science, vol 5078. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69369-7_18
DOI: https://doi.org/10.1007/978-3-540-69369-7_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-69368-0
Online ISBN: 978-3-540-69369-7
eBook Packages: Computer Science (R0)