Abstract
In this paper we compare automatic methods for verb sense disambiguation. In particular, we investigate a Naïve Bayes classifier, decision trees, and a rule-based method. We propose several types of features: morphological, syntax-based, idiomatic, animacy, and WordNet-based. We evaluate the methods, together with the individual feature types, on two substantially different Czech corpora, VALEVAL and the Prague Dependency Treebank, and discuss the best-performing methods and features.
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Semecký, J., Podveský, P. (2006). Extensive Study on Automatic Verb Sense Disambiguation in Czech. In: Sojka, P., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2006. Lecture Notes in Computer Science, vol 4188. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11846406_30
DOI: https://doi.org/10.1007/11846406_30
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-39090-9
Online ISBN: 978-3-540-39091-6
eBook Packages: Computer Science, Computer Science (R0)