Influence of Conditional Independence Assumption on Verb Subcategorization Detection

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2166)

Abstract

Learning Bayesian Belief Networks from corpora has been applied to the automatic acquisition of verb subcategorization frames for Modern Greek (MG). We incorporate minimal linguistic resources, i.e. morphological tagging and phrase chunking, since a general-purpose syntactic parser for MG is currently unavailable. Experimental results have been compared against Naive Bayes classification, which is based on the conditional independence assumption, and against two widely used methods, Log-Likelihood Ratio (LLR) and Relative Frequencies Threshold (RFT). We have experimented with a balanced corpus in order to ensure unbiased behavior of the training model. Results show that capturing the inferential dependencies of the training data could lead to a precision improvement of about 4% compared to Naive Bayes and 7% compared to LLR and RFT. Moreover, we have been able to achieve a precision exceeding 87% on the identification of subcategorization frames which are not known beforehand, while limited training data prove sufficient to yield satisfactory results.
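
The two baseline filters named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the 2x2 count layout, and both cutoff values (`rft_cutoff=0.1`, `llr_cutoff=3.84`) are assumptions chosen for illustration, since the abstract does not state the thresholds used. The LLR part follows the general form of the G² statistic for a contingency table of verb/frame co-occurrence counts.

```python
import math


def relative_frequency(frame_with_verb: int, verb_total: int) -> float:
    """Share of a verb's occurrences that co-occur with the candidate frame."""
    return frame_with_verb / verb_total if verb_total else 0.0


def log_likelihood_ratio(k11: int, k12: int, k21: int, k22: int) -> float:
    """G^2 statistic for a 2x2 contingency table.

    k11: target verb with the frame     k12: target verb without the frame
    k21: other verbs with the frame     k22: other verbs without the frame
    """
    def xlogx_sum(*counts: int) -> float:
        # Sum of c * ln(c), treating 0 * ln(0) as 0.
        return sum(c * math.log(c) for c in counts if c > 0)

    n = k11 + k12 + k21 + k22
    row_totals = (k11 + k12, k21 + k22)
    col_totals = (k11 + k21, k12 + k22)
    return 2.0 * (
        xlogx_sum(k11, k12, k21, k22)
        - xlogx_sum(*row_totals)
        - xlogx_sum(*col_totals)
        + xlogx_sum(n)
    )


def filter_frame(k11: int, k12: int, k21: int, k22: int,
                 rft_cutoff: float = 0.1, llr_cutoff: float = 3.84) -> dict:
    """Apply both baseline filters to one (verb, frame) pair.

    Both cutoffs are hypothetical defaults; 3.84 is the chi-square critical
    value for one degree of freedom at p = 0.05.
    """
    rf = relative_frequency(k11, k11 + k12)
    llr = log_likelihood_ratio(k11, k12, k21, k22)
    return {
        "relative_frequency": rf,
        "llr": llr,
        "accepted_by_rft": rf >= rft_cutoff,
        "accepted_by_llr": llr >= llr_cutoff,
    }
```

Both filters score each verb/frame pair in isolation from any other feature of the context. A high G² value only signals that verb and frame are not independent, so a practical filter would also check the direction of the association.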

Copyright information

© 2001 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kermanidis, K., Maragoudakis, M., Fakotakis, N., Kokkinakis, G. (2001). Influence of Conditional Independence Assumption on Verb Subcategorization Detection. In: Matoušek, V., Mautner, P., Mouček, R., Taušer, K. (eds) Text, Speech and Dialogue. TSD 2001. Lecture Notes in Computer Science, vol 2166. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44805-5_8

  • DOI: https://doi.org/10.1007/3-540-44805-5_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-42557-1

  • Online ISBN: 978-3-540-44805-1

  • eBook Packages: Springer Book Archive
