Abstract
As speech recognition is deployed in real working systems, spontaneous speech recognition is becoming increasingly important. Developing such applications clearly requires spontaneous speech databases, both for general design and for training and testing of these systems. This paper describes a collection of Czech spontaneous speech recorded during technical lectures. It is intended as material for analysing particular phenomena that appear in spontaneous speech, and also as supplementary material for training spontaneous speech recognizers. For training purposes, the most important contribution of the collected database, compared with using only the available read speech databases, should be the presence of spontaneous speech phenomena such as a higher rate of non-speech events, changes in pronunciation, and sentence irregularities. Speech signals are captured in two different channels of slightly different quality; about 14 hours of speech from 15 different speakers have been collected and annotated so far. Initial analyses of spontaneous-speech-related effects in the collected data have been performed, and a comparison with read speech databases is presented.
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Rajnoha, J., Pollák, P. (2009). Czech Spontaneous Speech Collection and Annotation: The Database of Technical Lectures. In: Esposito, A., Vích, R. (eds) Cross-Modal Analysis of Speech, Gestures, Gaze and Facial Expressions. Lecture Notes in Computer Science, vol 5641. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03320-9_35
DOI: https://doi.org/10.1007/978-3-642-03320-9_35
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03319-3
Online ISBN: 978-3-642-03320-9
eBook Packages: Computer Science (R0)