Abstract
Prosody recognition experiments were carried out in the Laboratory of Speech Acoustics, in which, among other things, we investigated the possibility of recognizing sentence modalities. Because the results of sentence modality recognition were promising, we adapted the method to children's speech and examined how it could be used as automatic feedback in an audio-visual pronunciation teaching and training system. Our goal was to develop a sentence intonation teaching and training system for speech-impaired children that helps them learn the correct prosodic pronunciation of sentences. HMM models of the modality types were built by training the recognizer on a database of children with correct speech. In the present work, a large database was collected from speech-impaired children. Subjective tests were carried out with this database to examine how well human listeners can categorize the sentence modalities of the recordings they hear. Automatic sentence modality recognition experiments were then run with the previously trained HMM models. Based on the results of the subjective tests, the acceptance probability of the sentence modality recognizer can be adjusted. A comparison of the subjective test results with the automatic recognition results on the database of speech-impaired children shows that the automatic recognizer classified the recordings more strictly, but not worse. The method introduced here could be implemented as part of a speech teaching system.
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Sztaho, D., Nagy, K., Vicsi, K. (2010). Subjective Tests and Automatic Sentence Modality Recognition with Recordings of Speech Impaired Children. In: Esposito, A., Campbell, N., Vogel, C., Hussain, A., Nijholt, A. (eds) Development of Multimodal Interfaces: Active Listening and Synchrony. Lecture Notes in Computer Science, vol 5967. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12397-9_34
DOI: https://doi.org/10.1007/978-3-642-12397-9_34
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-12396-2
Online ISBN: 978-3-642-12397-9
eBook Packages: Computer Science (R0)