Development of a Large Spontaneous Speech Database of Agglutinative Hungarian Language

Neuberger, Tilda; Gyarmathy, Dorottya; Gráczi, Tekla Etelka; Horváth, Viktória; Gósy, Mária; Beke, András

doi:10.1007/978-3-319-10816-2_51

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8655)

Included in the following conference series:

International Conference on Text, Speech, and Dialogue

Abstract

This paper introduces BEA, a large Hungarian spoken language database. This phonetically based, multi-purpose database contains various types of spontaneous and read speech from 333 monolingual speakers (about 50 minutes of speech per speaker). The paper presents the background and motivation for developing the BEA Hungarian database, describes its recording protocol and transcription procedure, and surveys existing and proposed research that uses the database. Owing to its recording protocol and transcription, the database provides challenging material for comparing the segmental structures of speech, including across languages.

Author information

Authors and Affiliations

Department of Phonetics, Research Institute for Linguistics of the Hungarian Academy of Sciences, Benczúr 33, 1068, Budapest, Hungary
Tilda Neuberger, Dorottya Gyarmathy, Tekla Etelka Gráczi, Viktória Horváth, Mária Gósy & András Beke

Editor information

Editors and Affiliations

Faculty of Informatics, Masaryk University, Botanická 6a, 60200, Brno, Czech Republic
Petr Sojka
Faculty of Informatics, Department of Information Technologies, Masaryk University, 602 00, Brno, Czech Republic
Aleš Horák, Ivan Kopeček & Karel Pala

Cite this paper

Neuberger, T., Gyarmathy, D., Gráczi, T.E., Horváth, V., Gósy, M., Beke, A. (2014). Development of a Large Spontaneous Speech Database of Agglutinative Hungarian Language. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2014. Lecture Notes in Computer Science, vol 8655. Springer, Cham. https://doi.org/10.1007/978-3-319-10816-2_51

DOI: https://doi.org/10.1007/978-3-319-10816-2_51
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10815-5
Online ISBN: 978-3-319-10816-2
eBook Packages: Computer Science, Computer Science (R0)
