
Information Extraction

  • Reference work entry
Encyclopedia of Database Systems

Definition

Information Extraction (IE) is the task of extracting pre-specified types of facts from written texts or speech transcripts and converting them into structured representations (e.g., databases).

The following example illustrates IE terminology.

  • Input Sentence:

Media tycoon Barry Diller on Wednesday quit as chief of Vivendi Universal Entertainment, the entertainment unit of French giant Vivendi Universal whose future appears up for grabs.

  • IE output:

    • Entities:

      • Person Entity: {Media tycoon, Barry Diller}

      • Organization Entity: {Vivendi Universal Entertainment, the entertainment unit}

      • Organization Entity: {French giant, Vivendi Universal}

    • “Part-Whole” relation:

      • {Vivendi Universal Entertainment, the entertainment unit} is part of {French giant, Vivendi Universal}.

    • “End-Position” event.

The above sentence includes a “Personnel_End-Position” event mention, with the trigger word which most clearly expresses the event occurrence, the position, the person who quit the position,...
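The IE output in the example above can be sketched as a small structured representation. This is a minimal illustration, not the entry's own data model: the class names `Entity`, `Relation`, and `Event`, their fields, and the helper `mention_rows` are hypothetical. The entity mentions and the “Part-Whole” relation are copied from the example; the event trigger and arguments are an assumed reading of the input sentence, since the preview text breaks off before listing them.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Entity:
    """A set of mentions that refer to the same real-world object."""
    entity_type: str
    mentions: Tuple[str, ...]


@dataclass
class Relation:
    """A typed link between two entities (arg1 relates to arg2)."""
    relation_type: str
    arg1: Entity
    arg2: Entity


@dataclass
class Event:
    """An event mention: a type, a trigger word, and role -> filler arguments."""
    event_type: str
    trigger: str
    arguments: Dict[str, str]


# Entities and their mentions, as listed in the example.
diller = Entity("Person", ("Media tycoon", "Barry Diller"))
vue = Entity("Organization",
             ("Vivendi Universal Entertainment", "the entertainment unit"))
vivendi = Entity("Organization", ("French giant", "Vivendi Universal"))

# "Part-Whole" relation from the example: arg1 is part of arg2.
part_whole = Relation("Part-Whole", arg1=vue, arg2=vivendi)

# ASSUMPTION: trigger and argument fillers are our reading of the input
# sentence; the source text is truncated before it names them.
end_position = Event(
    "Personnel_End-Position",
    trigger="quit",
    arguments={"Person": "Barry Diller", "Position": "chief"},
)


def mention_rows(entities: List[Entity]) -> List[Tuple[int, str, str]]:
    """Flatten entities into (entity_id, entity_type, mention) rows,
    the kind of tuples one might load into a database table."""
    return [
        (entity_id, entity.entity_type, mention)
        for entity_id, entity in enumerate(entities)
        for mention in entity.mentions
    ]


rows = mention_rows([diller, vue, vivendi])
```

Grouping mentions under one `Entity` reflects the example's notation, where each brace-delimited set lists the mentions of a single entity; flattening them into rows is one possible way to reach the database form the definition mentions.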


Recommended Reading

  1. Bikel DM, Miller S, Schwartz R, Weischedel R. Nymble: a high-performance learning name-finder. In: Proceedings of the 5th Conference on Applied Natural Language Processing; 1997. p. 194–201.

  2. Boschee E, Weischedel R, Zamanian A. Automatic evidence extraction. In: Proceedings of the International Conference on Intelligence Analysis; 2005.

  3. Florian R, Jing H, Kambhatla N, Zitouni I. Factorizing complex models: a case study in mention detection. In: Proceedings of the 26th International Conference on Computational Linguistics; 2006. p. 473–80.

  4. Grishman R, Sundheim B. Message understanding conference – 6: a brief history. In: Proceedings of the 16th International Conference on Computational Linguistics; 1996. p. 466–71.

  5. Grishman R, Westbrook D, Meyers A. NYU’s English ACE 2005 system description. In: Proceedings of the ACE 2005 Evaluation/PI Workshop; 2005.

  6. Ji H, Grishman R. Refining event extraction through unsupervised cross-document inference. In: Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics; 2008. p. 254–62.

  7. Ji H, Westbrook D, Grishman R. Using semantic relations to refine coreference decisions. In: Proceedings of the Conference on Human Language Technology and Empirical Methods in Natural Language Processing; 2005. p. 17–24.

  8. Muslea I. Extraction patterns for information extraction tasks: a survey. In: Proceedings of the National Conference on Artificial Intelligence (AAAI-99) Workshop on Machine Learning for Information Extraction; 1999.

  9. Ng V, Cardie C. Improving machine learning approaches to coreference resolution. In: Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics; 2002. p. 104–11.

  10. Riloff E. Automatically generating extraction patterns from untagged text. In: Proceedings of the 10th National Conference on AI; 1996. p. 1044–49.

  11. Sager N. Natural language information processing: a computer grammar of English and its applications. Reading: Addison Wesley; 1981.

  12. Sudo K, Sekine S, Grishman R. An improved extraction pattern representation model for automatic IE pattern acquisition. In: Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics; 2003. p. 224–31.

  13. Yangarber R, Grishman R, Tapanainen P, Huttunen S. Automatic acquisition of domain knowledge for information extraction. In: Proceedings of the 20th International Conference on Computational Linguistics; 2000. p. 940–46.

  14. Zhou G, Su J, Zhang J, Zhang M. Exploring various knowledge in relation extraction. In: Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics; 2005. p. 427–34.

Author information

Correspondence to Heng Ji.

Copyright information

© 2018 Springer Science+Business Media, LLC, part of Springer Nature

About this entry

Cite this entry

Ji, H. (2018). Information Extraction. In: Liu, L., Özsu, M.T. (eds) Encyclopedia of Database Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8265-9_204
