Abstract
This paper presents the idea and first results of a sentence modality classifier for Czech based purely on intonational information. This contrasts with other studies, which usually use additional features (including lexical ones) for this type of classification. Because sentence melody (intonation) is the most important feature, all experiments were carried out on an annotated sample from a library of Czech audiobooks recorded by leading Czech actors. A non-linear model implemented as an artificial neural network (ANN) was chosen for the classification. Two types of ANN are considered for temporal pattern classification: the classical multi-layer perceptron (MLP) and Elman’s network; results are presented for the MLP. Pre-processing of temporal intonational patterns for use as ANN inputs is discussed. The results show that questions are very often misclassified as statements and that exclamation marks are not detectable in the current data set.
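The abstract does not say how the intonational patterns are pre-processed, so the following is only a minimal sketch of one common way to turn a variable-length pitch contour into a fixed-length MLP input vector. The function name `contour_to_fixed_vector`, the parameter `n_points`, the convention that unvoiced frames are stored as 0 Hz, and the normalisation to semitones around the median are all assumptions made for illustration, not the authors' method.

```python
import numpy as np

def contour_to_fixed_vector(f0_hz, n_points=32):
    """Map a variable-length F0 contour to a fixed-length vector.

    f0_hz    : 1-D sequence of pitch values in Hz; 0 marks an unvoiced
               frame (an assumed convention, not taken from the paper).
    n_points : length of the output vector (hypothetical default).
    """
    f0 = np.asarray(f0_hz, dtype=float)
    voiced = f0 > 0
    if voiced.sum() < 2:
        raise ValueError("need at least two voiced frames")

    idx = np.arange(len(f0))
    # Fill unvoiced gaps by linear interpolation between voiced frames;
    # leading and trailing gaps take the nearest voiced value.
    filled = np.interp(idx, idx[voiced], f0[voiced])

    # Express pitch in semitones relative to the median voiced pitch,
    # which removes differences in speaker register.
    semitones = 12.0 * np.log2(filled / np.median(f0[voiced]))

    # Resample the contour to a fixed number of equally spaced points.
    positions = np.linspace(0, len(f0) - 1, n_points)
    return np.interp(positions, idx, semitones)
```

A vector of this kind could be fed directly to an MLP, which needs a fixed input size. A recurrent model such as Elman's network can instead consume the contour frame by frame, so the resampling step would not be required there.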
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bartošek, J., Hanžl, V. (2011). Intonation Based Sentence Modality Classifier for Czech Using Artificial Neural Network. In: Travieso-González, C.M., Alonso-Hernández, J.B. (eds) Advances in Nonlinear Speech Processing. NOLISP 2011. Lecture Notes in Computer Science, vol 7015. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25020-0_21
DOI: https://doi.org/10.1007/978-3-642-25020-0_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-25019-4
Online ISBN: 978-3-642-25020-0
eBook Packages: Computer Science, Computer Science (R0)