Abstract
Learning Bayesian Belief Networks from corpora has been applied to the automatic acquisition of verb subcategorization frames for Modern Greek (MG). We incorporate minimal linguistic resources, i.e. morphological tagging and phrase chunking, since a general-purpose syntactic parser for MG is currently unavailable. Experimental results are compared against Naive Bayes classification, which is based on the conditional independence assumption, and against two widely used methods, Log-Likelihood Ratio (LLR) and Relative Frequencies Threshold (RFT). We experimented with a balanced corpus in order to ensure unbiased behavior of the training model. The results show that capturing the inferential dependencies of the training data can improve precision by about 4% compared to Naive Bayes and 7% compared to LLR and RFT. Moreover, we achieved a precision exceeding 87% on the identification of subcategorization frames that are not known beforehand, while limited training data proved sufficient to yield satisfactory results.
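To make the two baseline filters named above concrete, the following is a minimal sketch of how a candidate frame could be scored for a verb with a log-likelihood ratio over a 2x2 contingency table and with a relative frequency threshold. It is an illustration under assumptions, not the authors' implementation: the function names, the layout of the counts, and the threshold value of 0.1 are hypothetical and do not come from the paper.

```python
import math


def log_likelihood_ratio(k11, k12, k21, k22):
    """G^2 statistic for a 2x2 contingency table of co-occurrence counts.

    Assumed (hypothetical) layout:
      k11: target verb seen with the candidate frame
      k12: target verb seen without it
      k21: other verbs seen with the candidate frame
      k22: other verbs seen without it
    """
    total = k11 + k12 + k21 + k22
    row_sums = (k11 + k12, k21 + k22)
    col_sums = (k11 + k21, k12 + k22)
    observed = ((k11, k12), (k21, k22))
    g2 = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # empty cells contribute nothing to the sum
                expected = row_sums[i] * col_sums[j] / total
                g2 += o * math.log(o / expected)
    return 2.0 * g2


def accept_frame_rft(frame_count, verb_count, threshold=0.1):
    """Accept a frame if its relative frequency with the verb reaches
    the threshold. The default of 0.1 is an arbitrary placeholder."""
    if verb_count == 0:
        return False
    return frame_count / verb_count >= threshold
```

A frame whose counts are exactly what independence predicts scores 0 under the first function, and larger scores indicate a stronger verb-frame association. In practice both cut-offs would have to be tuned on held-out data.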
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kermanidis, K., Maragoudakis, M., Fakotakis, N., Kokkinakis, G. (2001). Influence of Conditional Independence Assumption on Verb Subcategorization Detection. In: Matoušek, V., Mautner, P., Mouček, R., Taušer, K. (eds) Text, Speech and Dialogue. TSD 2001. Lecture Notes in Computer Science, vol 2166. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44805-5_8
DOI: https://doi.org/10.1007/3-540-44805-5_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42557-1
Online ISBN: 978-3-540-44805-1
eBook Packages: Springer Book Archive