Twenty-five years of information extraction

Published online by Cambridge University Press: 20 September 2019

Ralph Grishman*
Affiliation:
Computer Science Dept., New York University, 60 Fifth Avenue, Room 300, New York NY 10011, USA
*Corresponding author. Email: grishman@cs.nyu.edu

Abstract

Information extraction is the process of converting unstructured text into a structured database containing selected information from the text. It is an essential step in making the information content of the text usable for further processing. In this paper, we describe how information extraction has changed over the past 25 years, moving from hand-coded rules to neural networks, with a few stops on the way. We connect these changes to research advances in NLP and to the evaluations organized by the US Government.

Type
Article
Copyright
© Cambridge University Press 2019 
