A Bootstrapping Approach for Geographic Named Entity Annotation

  • Conference paper
Information Retrieval Technology (AIRS 2004)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3411)

Abstract

Geographic named entities can be classified into many sub-types that are useful for applications such as information extraction and question answering. In this paper, we present a bootstrapping algorithm for geographic named entity annotation. In the initial stage, we annotate a raw corpus using seeds. Boundary patterns are learned from this initial annotation and applied to the corpus again to annotate new candidates. Type verification is applied to reduce over-generation. The one-sense-per-discourse principle increases the number of positive instances and also corrects mistaken annotations. As the bootstrapping loop proceeds, the set of annotated instances grows and the learned boundary patterns become richer.
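The loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not specify how patterns are represented or how types are verified, so several details here are assumptions. Entities are single tokens, a boundary pattern is just the pair (previous word, next word), and `verify_type` is a hypothetical stand-in that only checks capitalization. All function names and the toy corpus are invented for illustration.

```python
from collections import Counter, defaultdict


def annotate_with_seeds(docs, seeds):
    """Initial stage: label every token found in the seed dictionary."""
    return [{i: seeds[tok] for i, tok in enumerate(doc) if tok in seeds}
            for doc in docs]


def context(doc, i):
    """Assumed boundary pattern: the words immediately left and right."""
    left = doc[i - 1] if i > 0 else "<s>"
    right = doc[i + 1] if i + 1 < len(doc) else "</s>"
    return left, right


def learn_patterns(docs, labels):
    """Map each observed context to the type it most often surrounds."""
    counts = defaultdict(Counter)
    for doc, lab in zip(docs, labels):
        for i, typ in lab.items():
            counts[context(doc, i)][typ] += 1
    return {ctx: c.most_common(1)[0][0] for ctx, c in counts.items()}


def verify_type(token):
    """Hypothetical type verification: accept capitalized tokens only."""
    return token[:1].isupper()


def apply_patterns(docs, labels, patterns):
    """Annotate unlabeled tokens whose context matches a learned pattern."""
    added = 0
    for doc, lab in zip(docs, labels):
        for i, tok in enumerate(doc):
            typ = patterns.get(context(doc, i))
            if i not in lab and typ is not None and verify_type(tok):
                lab[i] = typ
                added += 1
    return added


def one_sense_per_discourse(docs, labels):
    """Within each document, give every occurrence of a string its majority type."""
    for doc, lab in zip(docs, labels):
        votes = defaultdict(Counter)
        for i, typ in lab.items():
            votes[doc[i]][typ] += 1
        for i, tok in enumerate(doc):
            if tok in votes:
                lab[i] = votes[tok].most_common(1)[0][0]


def bootstrap(docs, seeds, max_iters=5):
    labels = annotate_with_seeds(docs, seeds)
    patterns = {}
    for _ in range(max_iters):
        patterns = learn_patterns(docs, labels)
        added = apply_patterns(docs, labels, patterns)
        one_sense_per_discourse(docs, labels)
        if added == 0:  # no new candidates: the loop has converged
            break
    return labels, patterns


# Toy corpus and seeds (invented for illustration).
docs = [
    "the city of Seoul is large and the city of Busan is busy but Busan sleeps early".split(),
    "we crossed the river Han near Busan today".split(),
]
seeds = {"Seoul": "CITY", "Han": "RIVER"}
labels, patterns = bootstrap(docs, seeds)
```

In this toy run, the seed "Seoul" yields the context ("of", "is"), which then annotates the first "Busan"; the one-sense-per-discourse step carries that label to the second "Busan" in the same document. A realistic system would need multi-token entities, pattern scoring, and a much stronger verification step.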



Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lee, S., Lee, G.G. (2005). A Bootstrapping Approach for Geographic Named Entity Annotation. In: Myaeng, S.H., Zhou, M., Wong, KF., Zhang, HJ. (eds) Information Retrieval Technology. AIRS 2004. Lecture Notes in Computer Science, vol 3411. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-31871-2_16

  • DOI: https://doi.org/10.1007/978-3-540-31871-2_16

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-25065-4

  • Online ISBN: 978-3-540-31871-2

  • eBook Packages: Computer Science, Computer Science (R0)
