Preparing Audio Recordings of Everyday Speech for Prosody Research: The Case of the ORD Corpus

Sherstinova, Tatiana

doi:10.1007/978-3-319-66429-3_62

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10458)

Included in the following conference series:

International Conference on Speech and Computer

Abstract

Studying prosody is important for understanding many linguistic, pragmatic, and discourse phenomena, as well as for solving many applied tasks, particularly in speech technologies. The prosody of everyday speech is extremely diverse and shows high interpersonal and intrapersonal variation. Furthermore, natural everyday speech produces a multitude of effects that are hard to obtain in speech laboratories. For this reason, it is very important to create resources containing representative collections of everyday speech data. The ORD corpus is a large resource aimed at studying everyday Russian speech. The paper describes the main stages of speech processing in the ORD corpus, from segmentation of the original files into macroepisodes to compiling prosody information into a database. This prosody database will later be used for building empirical prosody models.

Acknowledgements

The creation of the ORD speech corpus was supported by several grants: Russian Foundation for Humanities projects No. 07–04–94515e/Ya (Speech Corpus of Russian Everyday Communication “One Speaker’s Day”) and No. 12–04–12017 (Information System of Communication Scenarios of Russian Spontaneous Speech), and the Russian Ministry of Education project “Sound Form of Russian Grammar System in Communicative and Informational Approach”. Significant extension of the corpus and the software development were achieved within the framework of the project “Everyday Russian Language in Different Social Groups”, supported by the Russian Science Foundation, project No. 14–18–02070.

Author information

Authors and Affiliations

Saint Petersburg State University, Universitetskaya nab. 11, St. Petersburg, 199034, Russia
Tatiana Sherstinova

Corresponding author

Correspondence to Tatiana Sherstinova.

Editor information

Editors and Affiliations

SPIIRAS, Saint Petersburg, Russia
Alexey Karpov
Moscow State Linguistic University, Moscow, Russia
Rodmonga Potapova
University of Hertfordshire, Hatfield, United Kingdom
Iosif Mporas

About this paper

Cite this paper

Sherstinova, T. (2017). Preparing Audio Recordings of Everyday Speech for Prosody Research: The Case of the ORD Corpus. In: Karpov, A., Potapova, R., Mporas, I. (eds) Speech and Computer. SPECOM 2017. Lecture Notes in Computer Science, vol 10458. Springer, Cham. https://doi.org/10.1007/978-3-319-66429-3_62

DOI: https://doi.org/10.1007/978-3-319-66429-3_62
Published: 13 August 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-66428-6
Online ISBN: 978-3-319-66429-3
eBook Packages: Computer Science, Computer Science (R0)