Using Information Extraction to Build a Directory of Conference Announcements

  • Conference paper
Computational Linguistics and Intelligent Text Processing (CICLing 2004)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 2945))

Abstract

We describe an application of information extraction to building a directory of scientific conference announcements. We employ a cascaded finite-state transducer to identify candidate conference names, titles, dates, locations, and URLs in a conference announcement. To cope with the ungrammatical text that is typical of conference announcements, our system uses orthographic features of the text and a domain-specific tag set rather than general-purpose part-of-speech tags. Extraction accuracy is improved by recognizing other entities in the text that are not extracted but could be confused with slot values. A scoring scheme based on simple heuristics selects among multiple extraction candidates. We also present an evaluation of our system.
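
The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the tag set (`M`, `D`, `Y`, `C`, `o`), the date pattern, the scoring weights, and the sample announcement are all assumptions made up for this example. The first stage maps each token to a coarse tag using orthographic cues only, the second stage matches a regular expression over the tag string to find date candidates, and a simple heuristic score picks one candidate, penalizing dates that follow the word "deadline" (an entity that is not extracted but could be confused with the conference date).

```python
import re

MONTHS = {"january", "february", "march", "april", "may", "june", "july",
          "august", "september", "october", "november", "december"}

def tokenize(text):
    # Words, digit runs, and single punctuation characters become tokens.
    return re.findall(r"[A-Za-z]+|\d+|[^\sA-Za-z\d]", text)

def tag(token):
    # Stage 1: one-character, hypothetical domain-specific tags
    # derived from orthographic features (no part-of-speech tagging).
    if token.lower() in MONTHS:
        return "M"                      # month name
    if re.fullmatch(r"(?:19|20)\d\d", token):
        return "Y"                      # four-digit year
    if re.fullmatch(r"\d{1,2}", token):
        return "D"                      # day-like number
    if token in {",", "-"}:
        return token                    # punctuation kept as its own tag
    if token[0].isupper():
        return "C"                      # capitalized word
    return "o"                          # anything else

# Stage 2: a pattern over the tag string, e.g. "June 3-5, 2004".
DATE = re.compile(r"MD(?:-D)?,?Y")

def date_candidates(tokens):
    tags = "".join(tag(t) for t in tokens)   # one tag per token
    return [(m.start(), m.end()) for m in DATE.finditer(tags)]

def score(tokens, start, end):
    # Assumed heuristics: date ranges look like conference dates;
    # a nearby "deadline" or "due" suggests a submission date instead.
    s = 0
    if "-" in tokens[start:end]:
        s += 1
    context = {t.lower() for t in tokens[max(0, start - 3):start]}
    if context & {"deadline", "due"}:
        s -= 2
    return s

def best_date(text):
    tokens = tokenize(text)
    spans = date_candidates(tokens)
    if not spans:
        return None
    start, end = max(spans, key=lambda span: score(tokens, *span))
    return " ".join(tokens[start:end])

# Invented announcement text, used only to exercise the sketch.
ANNOUNCEMENT = ("The Example Workshop (EXW) will be held in Springfield, "
                "June 3-5, 2004. Submission deadline: January 10, 2004.")
print(best_date(ANNOUNCEMENT))
```

On the invented announcement, both dates match the pattern, but the range scores higher and the date after "deadline" is penalized, so the sketch selects the conference date rather than the submission date.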

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Schneider, KM. (2004). Using Information Extraction to Build a Directory of Conference Announcements. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2004. Lecture Notes in Computer Science, vol 2945. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24630-5_65

  • DOI: https://doi.org/10.1007/978-3-540-24630-5_65

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-21006-1

  • Online ISBN: 978-3-540-24630-5

  • eBook Packages: Springer Book Archive
