Skip to main content

Czech Spontaneous Speech Collection and Annotation: The Database of Technical Lectures

  • Conference paper
Cross-Modal Analysis of Speech, Gestures, Gaze and Facial Expressions

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 5641))

Abstract

Applying speech recognition into real working systems, spontaneous speech recognition has increasing importance. For the development purposes of such applications, the need of spontaneous speech database is evident both for general design or training and testing of such systems. This paper describes the collection of Czech spontaneous data recorded within technical lectures. It is supposed to be used as a material for the analysis of particular phenomena which appear within spontaneous speech but also as an extension material for training of spontaneous speech recognizers. Mainly the presence of spontaneous speech phenomena such as higher rate of non-speech events, changes in pronunciation, or sentence irregularities, should be the most important contribution of the collected database for the training purposes in comparison to the usage of available read speech databases only. Speech signals are captured in two different channels with slightly different quality and about 14 hours of speech from 15 different speakers are currently collected and annotated. The first analyses of spontaneous speech related effects in the collected data have been performed and the comparison with read speech databases is presented.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

Similar content being viewed by others

References

  1. Shriberg, E.: Spontaneous speech: How people really talk, and why engineers should care. In: Proc. Eurospeech 2005, Lisbon, Portugal, pp. 1781–1784 (2005)

    Google Scholar 

  2. Trancoso, I., Nunes, R., Neves, L., Viana, C., Moniz, H., Caseiro, D., Mata, A.I.: Recognition of classroom lectures in european Portuguese. In: Proc. Interspeech 2006, Pittsburgh, USA (2006)

    Google Scholar 

  3. Psutka, J., Radová, V., Müller, L., Matoušek, J., Ircing, P., Graff, D.: Large broadcast news and read speech corpora of spoken Czech. In: Proc. Eurpospeech 2001, Ålborg, Denmark, pp. 2067–2070 (2001)

    Google Scholar 

  4. Rajnoha, J., Pollák, P.: Modelling of speaker non-speech events in robust speech recognition. In: Proceedings of the 16th Czech-German Workshop on Speech Processing, Academy of Sciences of the Czech Republic, Institute of Radioengineering and Electronics, Prague, pp. 149–155 (2006)

    Google Scholar 

  5. Barras, C., Geoffrois, E., Wu, Z., Liberman, M.: Transcriber: A free tool for segmenting, labeling and transcribing speech. In: Proc. of the First international conference on language resources & evaluation (LREC), Granada, Spain, pp. 1373–1376 (1998)

    Google Scholar 

  6. Pollák, P., Černocký, J.: Czech SPEECON adult database (November 2003), http://www.speechdat.org/speecon

  7. Pollák, P., Hanžl, V.: Tool for Czech pronunciation generation combining fixed rules with pronunciation lexicon and lexicon management tool. In: Proc. of LREC 2002, Third International Conference on Language Resources and Evaluation, Las Palmas, Spain (May 2002)

    Google Scholar 

  8. LC-STAR II project site, http://www.lc-star.org/

  9. Gajić, B., Markhus, V., Pettersen, S.G., Johnsen, M.H.: Automatic recognition of spontaneously dictated medical records for Norwegian. In: COST 278 and ISCA Tutorial and Research Workshop - ROBUST 2004 (2004)

    Google Scholar 

  10. Rajnoha, J.: Speaker non-speech event recognition with standard speech datasets. Acta Polytechnica 47(4-5), 107–111 (2008)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Rajnoha, J., Pollák, P. (2009). Czech Spontaneous Speech Collection and Annotation: The Database of Technical Lectures. In: Esposito, A., Vích, R. (eds) Cross-Modal Analysis of Speech, Gestures, Gaze and Facial Expressions. Lecture Notes in Computer Science(), vol 5641. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03320-9_35

Download citation

  • DOI: https://doi.org/10.1007/978-3-642-03320-9_35

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-03319-3

  • Online ISBN: 978-3-642-03320-9

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics