Influence of Conditional Independence Assumption on Verb Subcategorization Detection

Kermanidis, K.; Maragoudakis, M.; Fakotakis, N.; Kokkinakis, G.

doi:10.1007/3-540-44805-5_8

Conference paper
First Online: 01 January 2001

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2166)

Abstract

Learning Bayesian Belief Networks from corpora has been applied to the automatic acquisition of verb subcategorization frames for Modern Greek (MG). We use only minimal linguistic resources, i.e. morphological tagging and phrase chunking, since a general-purpose syntactic parser for MG is currently unavailable. The experimental results are compared against Naive Bayes classification, which relies on the conditional independence assumption, and against two widely used methods, the Log-Likelihood Ratio (LLR) and the Relative Frequencies Threshold (RFT). We experimented with a balanced corpus in order to ensure unbiased behavior of the training model. The results show that capturing the inferential dependencies in the training data can improve precision by about 4% over Naive Bayes and by about 7% over LLR and RFT. Moreover, we achieved a precision exceeding 87% on the identification of subcategorization frames that are not known beforehand, and even limited training data proved sufficient to yield satisfactory results.
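The two baseline filters named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formulas, so the code assumes the common formulations, namely a relative-frequency cutoff on how often a verb co-occurs with a candidate frame, and Dunning's log-likelihood ratio statistic over a 2x2 contingency table of verb/frame co-occurrence counts. The function names, the count layout, and the threshold value of 0.05 are all hypothetical.

```python
import math


def relative_frequency(frame_count, verb_count):
    """Share of a verb's occurrences that appear with the candidate frame."""
    return frame_count / verb_count if verb_count else 0.0


def passes_rft(frame_count, verb_count, threshold=0.05):
    """Accept the frame if its relative frequency reaches the cutoff.

    The default threshold is an illustrative assumption, not a value
    taken from the paper.
    """
    return relative_frequency(frame_count, verb_count) >= threshold


def _xlogx(k):
    """k * ln(k), with the usual convention that 0 * ln(0) = 0."""
    return k * math.log(k) if k > 0 else 0.0


def log_likelihood_ratio(k11, k12, k21, k22):
    """Log-likelihood ratio statistic (G^2) for a 2x2 contingency table.

    k11: target verb with the frame     k12: target verb without it
    k21: other verbs with the frame     k22: other verbs without it
    """
    n = k11 + k12 + k21 + k22
    cells = _xlogx(k11) + _xlogx(k12) + _xlogx(k21) + _xlogx(k22)
    rows = _xlogx(k11 + k12) + _xlogx(k21 + k22)
    cols = _xlogx(k11 + k21) + _xlogx(k12 + k22)
    # Equivalent to 2 * sum(observed * ln(observed / expected)).
    return 2.0 * (cells + _xlogx(n) - rows - cols)


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(passes_rft(frame_count=12, verb_count=100))
    print(log_likelihood_ratio(30, 10, 10, 30))
```

Both filters score each verb/frame pair in isolation from its context. Naive Bayes combines several features but treats them as conditionally independent given the class, whereas a learned Bayesian Belief Network can model dependencies among them, which is the contrast the abstract draws.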

Author information

Authors and Affiliations

Dept. of Electrical & Computer Engineering, University of Patras, 26500, Patras, Greece
K. Kermanidis, M. Maragoudakis, N. Fakotakis & G. Kokkinakis

Editor information

Editors and Affiliations

Dept. of Computer Science and Engineering, University of West Bohemia in Plzeň, Faculty of Applied Sciences, Univerzitní 22, 306-14, Plzeň, Czech Republic
Václav Matoušek, Pavel Mautner, Roman Mouček & Karel Taušer

About this paper

Cite this paper

Kermanidis, K., Maragoudakis, M., Fakotakis, N., Kokkinakis, G. (2001). Influence of Conditional Independence Assumption on Verb Subcategorization Detection. In: Matoušek, V., Mautner, P., Mouček, R., Taušer, K. (eds) Text, Speech and Dialogue. TSD 2001. Lecture Notes in Computer Science, vol 2166. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44805-5_8

DOI: https://doi.org/10.1007/3-540-44805-5_8
Published: 24 August 2001
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42557-1
Online ISBN: 978-3-540-44805-1
eBook Packages: Springer Book Archive