Using learned extraction patterns for text classification

Riloff, Ellen

doi:10.1007/3-540-60925-3_53

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1040)

Included in the following conference series:

International Joint Conference on Artificial Intelligence

Abstract

A major knowledge-engineering bottleneck for information extraction systems is the process of constructing an appropriate dictionary of extraction patterns. AutoSlog is a dictionary construction system that has been shown to substantially reduce the time required for knowledge engineering by learning extraction patterns automatically. However, an open question was whether these extraction patterns were useful for tasks other than information extraction. We describe a series of experiments that show how the extraction patterns learned by AutoSlog can be used for text classification. Three dictionaries produced by AutoSlog for different domains performed well in our text classification experiments.
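
One way to picture how extraction patterns might serve as classification evidence is the minimal Python sketch below. It is an illustration under stated assumptions, not the method used in the experiments: the abstract does not describe the classification algorithm. The regular expressions are hypothetical stand-ins for learned extraction patterns, and the names `fired`, `train`, and `classify`, along with the thresholds `min_rate` and `min_count`, are invented for this example. The sketch estimates how often each pattern appears in relevant training texts and labels a new text relevant when a sufficiently reliable pattern matches it.

```python
import re
from collections import Counter

# Hypothetical stand-ins for learned extraction patterns: plain regexes,
# not the linguistic patterns a system such as AutoSlog actually produces.
PATTERNS = {
    "was_murdered": re.compile(r"\bwas murdered\b"),
    "bombing_of": re.compile(r"\bbombing of\b"),
    "was_located": re.compile(r"\bwas located\b"),
}


def fired(text):
    """Return the names of all patterns that match the text."""
    lowered = text.lower()
    return {name for name, rx in PATTERNS.items() if rx.search(lowered)}


def train(labeled_texts):
    """Map each pattern to (relevance rate, number of training texts it matched)."""
    total, relevant = Counter(), Counter()
    for text, is_relevant in labeled_texts:
        for name in fired(text):
            total[name] += 1
            if is_relevant:
                relevant[name] += 1
    return {name: (relevant[name] / total[name], total[name]) for name in total}


def classify(text, stats, min_rate=0.8, min_count=2):
    """Label a text relevant if any sufficiently reliable pattern matches it."""
    for name in fired(text):
        rate, count = stats.get(name, (0.0, 0))
        if rate >= min_rate and count >= min_count:
            return True
    return False


# Toy training data invented for this example (assumption, not from the paper).
TRAINING = [
    ("The mayor was murdered near the plaza.", True),
    ("A judge was murdered; the office was located downtown.", True),
    ("The bombing of the embassy injured three.", True),
    ("The bakery was located on Main Street.", False),
    ("The warehouse was located by the river.", False),
]

stats = train(TRAINING)
print(classify("A journalist was murdered yesterday.", stats))
print(classify("The museum was located in the old town.", stats))
```

In this toy setup the frequency threshold `min_count` keeps a pattern seen only once in training from deciding a classification on its own, which is a design choice of the sketch rather than a claim about the paper.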

Author information

Authors and Affiliations

Department of Computer Science, University of Utah, Salt Lake City, UT 84112, USA
Ellen Riloff

Editor information

Stefan Wermter, Ellen Riloff, Gabriele Scheler

Cite this paper

Riloff, E. (1996). Using learned extraction patterns for text classification. In: Wermter, S., Riloff, E., Scheler, G. (eds) Connectionist, Statistical and Symbolic Approaches to Learning for Natural Language Processing. IJCAI 1995. Lecture Notes in Computer Science, vol 1040. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-60925-3_53

DOI: https://doi.org/10.1007/3-540-60925-3_53
Published: 07 June 2005
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-60925-4
Online ISBN: 978-3-540-49738-7
eBook Packages: Springer Book Archive
