Using learned extraction patterns for text classification

  • Conference paper
Connectionist, Statistical and Symbolic Approaches to Learning for Natural Language Processing (IJCAI 1995)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1040)

Abstract

A major knowledge-engineering bottleneck for information extraction systems is the process of constructing an appropriate dictionary of extraction patterns. AutoSlog is a dictionary construction system that has been shown to substantially reduce the time required for knowledge engineering by learning extraction patterns automatically. However, an open question was whether these extraction patterns were useful for tasks other than information extraction. We describe a series of experiments that show how the extraction patterns learned by AutoSlog can be used for text classification. Three dictionaries produced by AutoSlog for different domains performed well in our text classification experiments.
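
The abstract does not say how the patterns are applied to classification, so the following is only a minimal sketch of one plausible approach, not the paper's method: treat each extraction pattern as a feature, estimate from labeled training texts how often a text is relevant when the pattern fires, and classify a new text as relevant if any sufficiently reliable pattern fires in it. The pattern names, the regex stand-ins for them, the thresholds `min_rate` and `min_count`, and the toy texts are all assumptions made for illustration; real AutoSlog patterns are matched by a sentence analyzer, not by regular expressions.

```python
import re
from collections import Counter

# Hypothetical extraction patterns, approximated by regexes for illustration only.
PATTERNS = {
    "<subj> was bombed": re.compile(r"\bwas bombed\b"),
    "murder of <np>": re.compile(r"\bmurder of\b"),
    "<subj> said": re.compile(r"\bsaid\b"),
}

def fired(text):
    """Return the names of all patterns that match somewhere in the text."""
    lowered = text.lower()
    return {name for name, rx in PATTERNS.items() if rx.search(lowered)}

def learn_reliable_patterns(labeled_texts, min_rate=0.8, min_count=2):
    """Keep patterns that fire in at least min_count training texts and
    whose share of relevant texts among those is at least min_rate."""
    total, relevant = Counter(), Counter()
    for text, is_relevant in labeled_texts:
        for name in fired(text):
            total[name] += 1
            if is_relevant:
                relevant[name] += 1
    return {
        name for name in total
        if total[name] >= min_count and relevant[name] / total[name] >= min_rate
    }

def classify(text, reliable):
    """Label a text relevant if any reliable pattern fires in it."""
    return bool(fired(text) & reliable)

# Invented toy training set: (text, is_relevant).
TRAIN = [
    ("The embassy was bombed on Tuesday.", True),
    ("A bus was bombed near the market, police said.", True),
    ("Officials condemned the murder of a judge.", True),
    ("The minister said prices would fall.", False),
    ("Farmers said the harvest was good.", False),
]

reliable = learn_reliable_patterns(TRAIN)
print(reliable)
print(classify("The bank was bombed last night.", reliable))
print(classify("The mayor said nothing.", reliable))
```

In this toy setup a pattern such as "<subj> said" fires often but mostly in irrelevant texts, so it is filtered out, while a pattern seen only once is dropped for lack of evidence. Both thresholds trade recall for precision and would need tuning per domain.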

Editor information

Stefan Wermter, Ellen Riloff, Gabriele Scheler

Copyright information

© 1996 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Riloff, E. (1996). Using learned extraction patterns for text classification. In: Wermter, S., Riloff, E., Scheler, G. (eds) Connectionist, Statistical and Symbolic Approaches to Learning for Natural Language Processing. IJCAI 1995. Lecture Notes in Computer Science, vol 1040. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-60925-3_53

  • DOI: https://doi.org/10.1007/3-540-60925-3_53

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-60925-4

  • Online ISBN: 978-3-540-49738-7

  • eBook Packages: Springer Book Archive
