Information Extraction

Ji, Heng

doi:10.1007/978-1-4614-8265-9_204

Definition

Information Extraction (IE) is the task of extracting pre-specified types of facts from written texts or speech transcripts and converting them into structured representations (e.g., databases).

The following example illustrates IE terminology.

Input Sentence:

Media tycoon Barry Diller on Wednesday quit as chief of Vivendi Universal Entertainment, the entertainment unit of French giant Vivendi Universal whose future appears up for grabs.

IE output:
- Entities:
  - Person Entity: {Media tycoon, Barry Diller}
  - Organization Entity: {Vivendi Universal Entertainment, the entertainment unit}
  - Organization Entity: {French giant, Vivendi Universal}
- “Part-Whole” relation:
  - {Vivendi Universal Entertainment, the entertainment unit} is part of {French giant, Vivendi Universal}.
- “End-Position” event.

The sentence above contains a “Personnel_End-Position” event mention, with the trigger word (the word that most clearly expresses the event occurrence), the position, the person who quit the position,...
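As an illustration of what a "structured representation" of the output above might look like, the following is a minimal Python sketch. The class names (`Entity`, `Relation`, `EventMention`), their fields, the role labels, and the helper `to_rows` are hypothetical and chosen only for this example; they are not part of any standard IE schema. The event's trigger and arguments are read off the input sentence as an assumption, since the explanation above is cut off.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Entity:
    """A group of mentions that refer to the same real-world object."""
    entity_type: str
    mentions: List[str]


@dataclass
class Relation:
    """A typed link between two entities (here: part and whole)."""
    relation_type: str
    part: Entity
    whole: Entity


@dataclass
class EventMention:
    """An event occurrence with its trigger word and role-labeled arguments."""
    event_type: str
    trigger: str
    arguments: Dict[str, object] = field(default_factory=dict)


# Entities, copied from the IE output listed above.
person = Entity("Person", ["Media tycoon", "Barry Diller"])
unit = Entity("Organization",
              ["Vivendi Universal Entertainment", "the entertainment unit"])
parent = Entity("Organization", ["French giant", "Vivendi Universal"])

# "Part-Whole" relation between the two organization entities.
part_whole = Relation("Part-Whole", part=unit, whole=parent)

# Assumption: trigger and role values are this sketch's reading of the
# input sentence, not values stated in the entry itself.
event = EventMention(
    "Personnel_End-Position",
    trigger="quit",
    arguments={"Person": person, "Position": "chief"},
)


def to_rows(entities: List[Entity]) -> List[Tuple[int, str, str]]:
    """Flatten entities into database-style rows: (entity_id, type, mention)."""
    rows = []
    for entity_id, entity in enumerate(entities):
        for mention in entity.mentions:
            rows.append((entity_id, entity.entity_type, mention))
    return rows


rows = to_rows([person, unit, parent])
```

Each row in `rows` pairs one mention with the identifier and type of its entity, which is one simple way such output could be loaded into a relational table.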

Author information

Authors and Affiliations

New York University, New York, NY, USA
Heng Ji

Editor information

Editors and Affiliations

Georgia Institute of Technology College of Computing, Atlanta, GA, USA
Ling Liu
University of Waterloo School of Computer Science, Waterloo, ON, Canada
M. Tamer Özsu

Section Editor information

Microsoft Research Asia, Microsoft Corporation, Beijing, Haidian, China
Zheng Chen

About this entry

Cite this entry

Ji, H. (2018). Information Extraction. In: Liu, L., Özsu, M.T. (eds) Encyclopedia of Database Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8265-9_204

DOI: https://doi.org/10.1007/978-1-4614-8265-9_204
Published: 07 December 2018
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-8266-6
Online ISBN: 978-1-4614-8265-9
eBook Packages: Computer Science; Reference Module Computer Science and Engineering
