Abstract
This paper presents results from recent experiments with Chill, a corpus-based parser acquisition system. Chill treats language acquisition as the learning of search-control rules within a logic program. Unlike many current corpus-based approaches that use statistical learning algorithms, Chill uses techniques from inductive logic programming (ILP) to learn relational representations. Chill is a very flexible system and has been used to learn parsers that produce syntactic parse trees, case-role analyses, and executable database queries. The reported experiments compare Chill's performance to that of a more naive application of ILP to parser acquisition. The results show that ILP techniques, as employed in Chill, are a viable alternative to statistical methods and that the control-rule framework is fundamental to Chill's success.
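As an illustration of what "search-control rules within a logic program" can mean, the sketch below guards each operator of an overly general shift-reduce parser with a rule that says when the operator may fire. Everything in it is an assumption made for illustration: the shift-reduce framing, the operator names, the toy determiner lexicon, and the hand-written rule conditions. Chill induces such rules from a corpus with ILP and represents them as logic-program clauses; here they are written by hand as Python predicates, so this is only a minimal sketch of the control-rule idea, not Chill's implementation.

```python
from typing import Callable, List, Optional, Tuple, Union

Tree = Union[str, tuple]  # a word, or (label, left_child, right_child)

DETERMINERS = {"the", "a"}  # toy lexicon, assumed for this example


def is_word(item: Tree) -> bool:
    return isinstance(item, str)


def has_label(item: Tree, label: str) -> bool:
    return isinstance(item, tuple) and item[0] == label


def shift(stack: List[Tree], buf: List[str]) -> Tuple[List[Tree], List[str]]:
    """Move the next input word onto the stack."""
    return stack + [buf[0]], buf[1:]


def make_reduce(label: str) -> Callable:
    """Build an operator that combines the top two stack items under `label`."""
    def reduce_op(stack, buf):
        return stack[:-2] + [(label, stack[-2], stack[-1])], buf
    return reduce_op


# Each entry pairs an operator with a control rule: a test on the parse state
# that says when the operator may be applied. These rules are hand-written
# stand-ins for clauses that a learner would induce from example parses.
CONTROLLED_OPERATORS = [
    (make_reduce("np"),
     lambda s, b: len(s) >= 2 and is_word(s[-2]) and s[-2] in DETERMINERS
     and is_word(s[-1])),
    (make_reduce("vp"),
     lambda s, b: len(s) >= 2 and is_word(s[-2]) and s[-2] not in DETERMINERS
     and has_label(s[-1], "np") and not b),
    (make_reduce("s"),
     lambda s, b: len(s) >= 2 and has_label(s[-2], "np")
     and has_label(s[-1], "vp") and not b),
    (shift, lambda s, b: bool(b)),
]


def parse(words: List[str]) -> Optional[Tree]:
    """Deterministically apply the first operator whose control rule accepts."""
    stack: List[Tree] = []
    buf = list(words)
    while True:
        for operator, control_rule in CONTROLLED_OPERATORS:
            if control_rule(stack, buf):
                stack, buf = operator(stack, buf)
                break
        else:
            break  # no operator is licensed in this state
    if len(stack) == 1 and not buf and has_label(stack[0], "s"):
        return stack[0]
    return None


tree = parse("the man ate the pasta".split())
```

Without the control rules, every operator would be applicable in almost every state and the parser would have to search among many spurious analyses; with them, the parser is deterministic. In this toy setting the rules are trivial, whereas a learned system has to find conditions that generalize across a corpus.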
Copyright information
© 1996 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zelle, J.M., Mooney, R.J. (1996). Comparative results on using inductive logic programming for corpus-based parser construction. In: Wermter, S., Riloff, E., Scheler, G. (eds) Connectionist, Statistical and Symbolic Approaches to Learning for Natural Language Processing. IJCAI 1995. Lecture Notes in Computer Science, vol 1040. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-60925-3_59
DOI: https://doi.org/10.1007/3-540-60925-3_59
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-60925-4
Online ISBN: 978-3-540-49738-7
eBook Packages: Springer Book Archive