A Framework for Generating Task Specific Information Extraction Systems

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2366)

Abstract

Information extraction helps in building advanced tools for text processing applications such as electronic publishing and information retrieval. The applications differ in their requirements on the input text, output information, and available resources for information extraction. Since existing IE technologies are application specific, extensive expert work is required to meet the needs of each new application. However, today, most users do not have this expertise and thus need a tool to easily create IE systems tailored to their needs. We introduce a framework that consists of (1) an extensible set of advanced IE technologies together with a description of the properties of their input, output, and resources, and (2) a generator that selects the relevant technologies for a specific application and integrates these into an IE system. A prototype of the framework is presented, and the generation of IE systems for two example applications is illustrated. The results are presented as guidelines for further development of the framework.
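The two parts of the framework described in the abstract can be illustrated with a minimal sketch. It is not the authors' prototype: the `Technology` record, the property names (`raw_text`, `tokens`, `entities`, `pos_tags`), the `gazetteer` resource, and the forward-chaining selection strategy in `generate` are all assumptions made for illustration. Each technology is described by the properties of its input, output, and resources; the generator picks the technologies relevant to a requested output and chains them into one system.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List


@dataclass(frozen=True)
class Technology:
    """Hypothetical description of one IE technology."""
    name: str
    input_props: FrozenSet[str]    # properties the input must already have
    output_props: FrozenSet[str]   # properties the technology adds
    resources: FrozenSet[str]      # resources it needs (lexicons, corpora, ...)
    run: Callable[[dict], dict]


def generate(library, available_input, wanted_output, available_resources) -> List[Technology]:
    """Select the technologies relevant to an application (assumed strategy)."""
    have, wanted = set(available_input), set(wanted_output)
    usable = [t for t in library if t.resources <= set(available_resources)]
    chain = []  # (technology, properties it newly contributes)
    progress = True
    # Forward pass: keep adding technologies whose input needs are met.
    while not wanted <= have and progress:
        progress = False
        for tech in usable:
            new = tech.output_props - have
            if new and tech.input_props <= have:
                chain.append((tech, new))
                have |= new
                progress = True
    if not wanted <= have:
        raise ValueError(f"cannot produce {sorted(wanted - have)}")
    # Backward pass: drop technologies that do not contribute to the wanted output.
    needed, selected = set(wanted), []
    for tech, new in reversed(chain):
        if new & needed:
            selected.append(tech)
            needed |= tech.input_props
    selected.reverse()
    return selected


def build_system(pipeline):
    """Integrate the selected technologies into a single callable IE system."""
    def system(doc: dict) -> dict:
        for tech in pipeline:
            doc = tech.run(doc)
        return doc
    return system


# Toy library; the implementations are placeholders, not real IE components.
LIBRARY = [
    Technology("tokenizer", frozenset({"raw_text"}), frozenset({"tokens"}), frozenset(),
               lambda d: {**d, "tokens": d["raw_text"].split()}),
    Technology("pos_tagger", frozenset({"tokens"}), frozenset({"pos_tags"}), frozenset(),
               lambda d: {**d, "pos_tags": ["X"] * len(d["tokens"])}),
    Technology("entity_tagger", frozenset({"tokens"}), frozenset({"entities"}),
               frozenset({"gazetteer"}),
               lambda d: {**d, "entities": [t for t in d["tokens"] if t[:1].isupper()]}),
]

pipeline = generate(LIBRARY, {"raw_text"}, {"entities"}, {"gazetteer"})
extract = build_system(pipeline)
```

In this sketch, an application that supplies raw text and a gazetteer and asks for entities gets a system built from the tokenizer and the entity tagger only; the part-of-speech tagger is left out because nothing in the requested output depends on it. If the gazetteer is unavailable, `generate` raises an error rather than returning an incomplete system.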

Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Aberg, C., Shahmehri, N. (2002). A Framework for Generating Task Specific Information Extraction Systems. In: Hacid, MS., Raś, Z.W., Zighed, D.A., Kodratoff, Y. (eds) Foundations of Intelligent Systems. ISMIS 2002. Lecture Notes in Computer Science, vol 2366. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48050-1_53

  • DOI: https://doi.org/10.1007/3-540-48050-1_53

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-43785-7

  • Online ISBN: 978-3-540-48050-1

  • eBook Packages: Springer Book Archive