Comparative results on using inductive logic programming for corpus-based parser construction

Zelle, John M.; Mooney, Raymond J.

doi:10.1007/3-540-60925-3_59

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1040)

Included in the following conference series:

International Joint Conference on Artificial Intelligence

Abstract

This paper presents results from recent experiments with Chill, a corpus-based parser acquisition system. Chill treats language acquisition as the learning of search-control rules within a logic program. Unlike many current corpus-based approaches that use statistical learning algorithms, Chill uses techniques from inductive logic programming (ILP) to learn relational representations. Chill is a very flexible system and has been used to learn parsers that produce syntactic parse trees, case-role analyses, and executable database queries. The reported experiments compare Chill's performance to that of a more naive application of ILP to parser acquisition. The results show that ILP techniques, as employed in Chill, are a viable alternative to statistical methods and that the control-rule framework is fundamental to Chill's success.

Author information

Authors and Affiliations

Department of Mathematics and Computer Science, Drake University, 50311, Des Moines, IA, USA
John M. Zelle
Department of Computer Sciences, University of Texas, 78712, Austin, TX, USA
Raymond J. Mooney

Editor information

Stefan Wermter, Ellen Riloff, Gabriele Scheler

About this paper

Cite this paper

Zelle, J.M., Mooney, R.J. (1996). Comparative results on using inductive logic programming for corpus-based parser construction. In: Wermter, S., Riloff, E., Scheler, G. (eds) Connectionist, Statistical and Symbolic Approaches to Learning for Natural Language Processing. IJCAI 1995. Lecture Notes in Computer Science, vol 1040. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-60925-3_59

DOI: https://doi.org/10.1007/3-540-60925-3_59
Published: 07 June 2005
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-60925-4
Online ISBN: 978-3-540-49738-7
eBook Packages: Springer Book Archive
