Automatic Detection of Speech Disfluencies in the Spontaneous Russian Speech

Verkhodanova, Vasilisa; Shapranov, Vladimir

doi:10.1007/978-3-319-01931-4_10

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8113)

Included in the following conference series:

International Conference on Speech and Computer

Abstract

Owing to human nature, spontaneous speech is rarely fluent. Among its characteristic features are speech variation and the presence of speech disfluencies such as hesitations, fillers, and artefacts. Such elements are an obstacle both for automatic speech processing and for the processing of speech transcriptions. For the automatic detection of these elements, a corpus of spontaneous Russian speech was collected using a task-based methodology. The corpus was annotated for the following types of disfluencies: hesitations, repairs, and sound lengthening, as well as artefacts. Hesitations and artefacts were detected using parameters such as duration, energy, fundamental frequency, and other spectral characteristics.
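To illustrate how parameters such as duration, energy, and fundamental frequency could be combined, the following is a minimal sketch and not the authors' detector. It assumes that a hesitation candidate is a sufficiently long stretch of frames whose energy exceeds a threshold and whose fundamental frequency stays nearly constant. All function names, thresholds, and the synthetic test signal are hypothetical and chosen only for illustration.

```python
import math

def frame_signal(samples, frame_len, hop):
    """Split a sample list into overlapping frames of frame_len samples."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)

def estimate_f0(frame, sample_rate, f0_min=75.0, f0_max=400.0):
    """Estimate F0 in Hz from the autocorrelation peak (0.0 if none found)."""
    lag_min = int(sample_rate / f0_max)
    lag_max = min(int(sample_rate / f0_min), len(frame) - 1)
    best_lag, best_score = 0, 0.0
    for lag in range(lag_min, lag_max + 1):
        score = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else 0.0

def candidate_hesitations(samples, sample_rate, frame_s=0.04, hop_s=0.02,
                          energy_thr=0.01, f0_tol=0.1, min_dur_s=0.2):
    """Return (start_s, end_s) spans with high energy and stable F0.

    The thresholds are illustrative assumptions, not values from the paper.
    """
    frame_len = round(frame_s * sample_rate)
    hop = round(hop_s * sample_rate)
    frames = frame_signal(samples, frame_len, hop)
    segments, run_start, prev_f0 = [], None, 0.0
    for idx, frame in enumerate(frames + [None]):  # None flushes the last run
        f0 = 0.0
        if frame is not None and short_time_energy(frame) >= energy_thr:
            f0 = estimate_f0(frame, sample_rate)
        stable = f0 > 0 and prev_f0 > 0 and abs(f0 - prev_f0) <= f0_tol * prev_f0
        if run_start is not None and not stable:
            # Close the current run at the previous frame and apply the duration test.
            start_s = run_start * hop / sample_rate
            end_s = ((idx - 1) * hop + frame_len) / sample_rate
            if end_s - start_s >= min_dur_s:
                segments.append((start_s, end_s))
            run_start = None
        if f0 > 0 and run_start is None:
            run_start = idx  # a voiced frame opens a new run
        prev_f0 = f0
    return segments

# Synthetic example: 0.2 s silence, 0.5 s of a steady 120 Hz tone, 0.3 s silence.
RATE = 8000
signal = ([0.0] * 1600
          + [0.5 * math.sin(2 * math.pi * 120 * n / RATE) for n in range(4000)]
          + [0.0] * 2400)
segments = candidate_hesitations(signal, RATE)
print(segments)
```

A real filled pause is not a pure tone, so a practical detector would likely add spectral features and tune the thresholds on annotated data; the sketch only shows how the three parameters interact.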

Author information

Authors and Affiliations

SPIIRAS, 39, 14th line, St. Petersburg, Russia
Vasilisa Verkhodanova
Betria Systems, Inc, 50, Building 11, Ligovskii Prospekt, St. Petersburg, Russia
Vladimir Shapranov

Editor information

Editors and Affiliations

Faculty of Applied Sciences, Department of Cybernetics, University of West Bohemia, Univerzitní 8, 306 14, Plzeň, Czech Republic
Miloš Železný
University of West Bohemia, 306 14, Pilsen, Czech Republic
Ivan Habernal
Speech and Multimodal Interfaces Laboratory, St. Petersburg Institute of Informatics and Automation for the Russian Academy of Sciences, 14-th line, 39, 199178, St. Petersburg, Russia
Andrey Ronzhin

Cite this paper

Verkhodanova, V., Shapranov, V. (2013). Automatic Detection of Speech Disfluencies in the Spontaneous Russian Speech. In: Železný, M., Habernal, I., Ronzhin, A. (eds) Speech and Computer. SPECOM 2013. Lecture Notes in Computer Science, vol 8113. Springer, Cham. https://doi.org/10.1007/978-3-319-01931-4_10

DOI: https://doi.org/10.1007/978-3-319-01931-4_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-01930-7
Online ISBN: 978-3-319-01931-4
eBook Packages: Computer Science, Computer Science (R0)
