Preparing Audio Recordings of Everyday Speech for Prosody Research: The Case of the ORD Corpus

  • Conference paper
Speech and Computer (SPECOM 2017)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10458)

Abstract

Studying prosody is important for understanding many linguistic, pragmatic, and discourse phenomena, as well as for solving many applied tasks, in particular in speech technologies. The prosody of everyday speech is extremely diverse and shows high interpersonal and intrapersonal variation. Furthermore, natural everyday speech produces a multitude of effects that can hardly be obtained in speech laboratories. For this reason, it is very important to create resources containing representative collections of everyday speech data. The ORD corpus is a large resource aimed at studying everyday Russian speech. The paper describes the main stages of speech processing in the ORD corpus, from the segmentation of original files into macroepisodes to the compilation of prosody information into a database. This prosody database will be further used for building empirical prosody models.

Acknowledgements

The creation of the ORD speech corpus was supported by several grants: Russian Foundation for Humanities projects No. 07–04–94515e/Ya (Speech Corpus of Russian Everyday Communication “One Speaker’s Day”) and No. 12–04–12017 (Information System of Communication Scenarios of Russian Spontaneous Speech), and the Russian Ministry of Education project “Sound Form of Russian Grammar System in Communicative and Informational Approach”. Significant extension of the corpus and the software development were achieved in the framework of the project “Everyday Russian Language in Different Social Groups” supported by the Russian Science Foundation, project No. 14–18–02070.

Author information

Corresponding author

Correspondence to Tatiana Sherstinova.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Sherstinova, T. (2017). Preparing Audio Recordings of Everyday Speech for Prosody Research: The Case of the ORD Corpus. In: Karpov, A., Potapova, R., Mporas, I. (eds) Speech and Computer. SPECOM 2017. Lecture Notes in Computer Science, vol 10458. Springer, Cham. https://doi.org/10.1007/978-3-319-66429-3_62

  • DOI: https://doi.org/10.1007/978-3-319-66429-3_62

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-66428-6

  • Online ISBN: 978-3-319-66429-3

  • eBook Packages: Computer Science, Computer Science (R0)
