Twenty-five years of information extraction

Ralph Grishman

doi:10.1017/S1351324919000512

Published online by Cambridge University Press: 20 September 2019

Ralph Grishman*
Affiliation: Computer Science Dept., New York University, 60 Fifth Avenue, Room 300, New York NY 10011, USA
*Corresponding author. Email: grishman@cs.nyu.edu

Abstract

Information extraction is the process of converting unstructured text into a structured data base containing selected information from the text. It is an essential step in making the information content of the text usable for further processing. In this paper, we describe how information extraction has changed over the past 25 years, moving from hand-coded rules to neural networks, with a few stops on the way. We connect these changes to research advances in NLP and to the evaluations organized by the US Government.

Keywords

Information extraction; Message understanding

Type: Article
Information: Natural Language Engineering, Volume 25, Issue 6, November 2019, pp. 677–692

DOI: https://doi.org/10.1017/S1351324919000512
Copyright: © Cambridge University Press 2019
