Improving Learning by Choosing Examples Intelligently in Two Natural Language Tasks

Thompson, Cynthia A.; Califf, Mary Elaine

doi:10.1007/3-540-40030-3_18

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1925)

Included in the following conference series:

International Conference on Learning Language in Logic

Abstract

In this chapter, we present relational learning algorithms for two natural language processing tasks: semantic parsing and information extraction. We describe the algorithms and present experimental results showing their effectiveness. We also describe our application of active learning techniques to these learning systems. We applied certainty-based selective sampling to each system, using fairly simple notions of certainty. We show that these selective sampling techniques greatly reduce the number of annotated examples required for the systems to achieve good generalization performance.
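
The general shape of certainty-based selective sampling can be sketched as follows. This is a minimal illustration and not the chapter's implementation: the function names (`selective_sampling`, `train_threshold`), the one-dimensional threshold learner, and the distance-to-threshold certainty score are all assumptions made for the example. The certainty measures the chapter actually uses for its semantic parsing and information extraction systems are not given on this page.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

X = TypeVar("X")  # an unlabeled example
Y = TypeVar("Y")  # its annotation
M = TypeVar("M")  # a trained model


def selective_sampling(
    pool: Sequence[X],
    seed: Sequence[Tuple[X, Y]],
    train: Callable[[List[Tuple[X, Y]]], M],
    certainty_of: Callable[[M, X], float],
    annotate: Callable[[X], Y],
    batch_size: int,
    rounds: int,
) -> Tuple[M, List[Tuple[X, Y]]]:
    """Repeatedly annotate the pool examples the current model is least certain about."""
    labeled = list(seed)
    remaining = list(pool)
    model = train(labeled)
    for _ in range(rounds):
        if not remaining:
            break
        # Rank pool positions by ascending certainty under the current model.
        order = sorted(
            range(len(remaining)),
            key=lambda i: certainty_of(model, remaining[i]),
        )
        chosen = order[:batch_size]
        # Ask the annotator (hypothetical oracle) only for the chosen examples.
        labeled.extend((remaining[i], annotate(remaining[i])) for i in chosen)
        chosen_set = set(chosen)
        remaining = [x for i, x in enumerate(remaining) if i not in chosen_set]
        model = train(labeled)
    return model, labeled


def train_threshold(labeled: List[Tuple[int, bool]]) -> float:
    """Toy learner (assumption): a threshold midway between the classes seen so far."""
    negatives = [x for x, y in labeled if not y]
    positives = [x for x, y in labeled if y]
    return (max(negatives) + min(positives)) / 2


# Toy run: integers 1..8, true concept x >= 5, seeded with one example per class.
model, labeled = selective_sampling(
    pool=range(1, 9),
    seed=[(0, False), (9, True)],
    train=train_threshold,
    certainty_of=lambda threshold, x: abs(x - threshold),  # far from boundary = certain
    annotate=lambda x: x >= 5,
    batch_size=2,
    rounds=1,
)
print(labeled)
```

In this toy run the two examples nearest the current threshold are annotated first, which is the intuition behind the reduction in annotation effort the abstract reports: examples the model already handles confidently are not sent for labeling.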

Author information

Authors and Affiliations

CSLI, Ventura Hall, Stanford University, Stanford, CA 94305, USA
Cynthia A. Thompson
Department of Applied Computer Science, Illinois State University, Normal, IL 61790, USA
Mary Elaine Califf

Editor information

Editors and Affiliations

Department of Computer Science, University of York, Heslington, York, YO10 5DD, UK
James Cussens
Jožef Stefan Institute, Jamova 39, 1000 Ljubljana, Slovenia
Sašo Džeroski

About this chapter

Cite this chapter

Thompson, C.A., Califf, M.E. (2000). Improving Learning by Choosing Examples Intelligently in Two Natural Language Tasks. In: Cussens, J., Džeroski, S. (eds) Learning Language in Logic. LLL 1999. Lecture Notes in Computer Science, vol 1925. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-40030-3_18

DOI: https://doi.org/10.1007/3-540-40030-3_18
Published: 01 February 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-41145-1
Online ISBN: 978-3-540-40030-1
eBook Packages: Springer Book Archive
