Abstract
The paper presents the development of a dependency treebank for Serbian, intended for applications in natural language processing, primarily natural language understanding in human-machine dialogue. The treebank is built by adding syntactic annotation to the part-of-speech (POS) tagged AlfaNum Text Corpus of Serbian. The annotation follows the standards set by the Prague Dependency Treebank, which has also served as a starting point for treebanks of other related languages in the region. Initial dependency parsing experiments on the currently annotated portion of the corpus, comprising 1,148 sentences (7,117 words), yielded relatively low parsing accuracy, as expected for a preliminary experiment on a treebank of this size.
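The abstract does not say which accuracy measure was used. As a point of orientation only, dependency parsing accuracy is commonly reported as unlabeled and labeled attachment scores (UAS and LAS): the share of tokens whose predicted head, or head and dependency label, match the gold annotation. The sketch below is a minimal illustration under that assumption. The function name `attachment_scores`, the `(head, label)` token representation, and the toy sentence are hypothetical and not taken from the paper.

```python
from typing import List, Tuple

# A token is (head_index, dependency_label); head 0 stands for the artificial root.
# This representation is an assumption made for illustration.
Token = Tuple[int, str]
Sentence = List[Token]


def attachment_scores(gold: List[Sentence], pred: List[Sentence]) -> Tuple[float, float]:
    """Return (UAS, LAS) computed over all tokens of all sentences."""
    if len(gold) != len(pred):
        raise ValueError("gold and predicted data differ in number of sentences")

    total = correct_head = correct_head_and_label = 0
    for gold_sent, pred_sent in zip(gold, pred):
        if len(gold_sent) != len(pred_sent):
            raise ValueError("gold and predicted sentence differ in length")
        for (gold_head, gold_label), (pred_head, pred_label) in zip(gold_sent, pred_sent):
            total += 1
            if gold_head == pred_head:
                correct_head += 1  # counts toward UAS
                if gold_label == pred_label:
                    correct_head_and_label += 1  # counts toward LAS

    if total == 0:
        raise ValueError("no tokens to evaluate")
    return correct_head / total, correct_head_and_label / total


# Toy three-token sentence with illustrative labels (not data from the paper).
gold = [[(2, "Sb"), (0, "Pred"), (2, "Obj")]]
pred = [[(2, "Sb"), (0, "Pred"), (2, "Adv")]]
uas, las = attachment_scores(gold, pred)
print(f"UAS = {uas:.3f}, LAS = {las:.3f}")
```

In the toy sentence every head is predicted correctly but one label is wrong, so the labeled score is lower than the unlabeled one.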
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Jakovljević, B., Kovačević, A., Sečujski, M., Marković, M. (2014). A Dependency Treebank for Serbian: Initial Experiments. In: Ronzhin, A., Potapova, R., Delic, V. (eds) Speech and Computer. SPECOM 2014. Lecture Notes in Computer Science, vol 8773. Springer, Cham. https://doi.org/10.1007/978-3-319-11581-8_5
DOI: https://doi.org/10.1007/978-3-319-11581-8_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11580-1
Online ISBN: 978-3-319-11581-8
eBook Packages: Computer Science, Computer Science (R0)