TAME II: A Modern Geographic Text Annotation Tool

  • Conference paper
Web and Wireless Geographical Information Systems (W2GIS 2024)

Abstract

Toponym resolution can be defined as the process of mapping each toponym in a document or corpus unambiguously to an associated spatial footprint corresponding to the location intended by the writer of the document. Various toponym resolution methods have been proposed that use the latitude and longitude of a centroid as the spatial representation, but large cities, regions, or countries are arguably poorly represented by a single point.

Two decades ago, Leidner presented TAME, the first geo-annotation tool described in the literature, but it (a) lacked extensibility (such as support for multiple users and multiple gazetteers), (b) is not publicly available, and (c) did not support polygon footprints. As a consequence, in all annotated text collections to date, place names are linked exclusively to gazetteer entries whose geographic footprints are centroids.

In this paper, we present TAME II, a more flexible system for creating labeled corpora for the training and evaluation of toponym resolvers, which supports multiple kinds of geographic footprint types (polygons, centroids, and bounding rectangles). It has been implemented as a modern Web-based application that is available to the public on the Web. To the best of our knowledge, TAME II is the first text annotation tool that supports gazetteers with polygon footprints, and the first Web-based tool generally available.

The authors gratefully acknowledge the funding provided by the Free State of Bavaria under its “Hitech Agenda”. All views are the authors’ and do not necessarily reflect the views of any funding agencies or affiliated institutions. We would like to thank Tim Menzner for helping with the cloud deployment of our system, and our three anonymous reviewers for feedback that improved the presentation of our paper.

Notes

  1. A gazetteer is a geographic database consisting of triples (t, f, g), where t is a toponym (place name), f is a feature type (such as dwelling, mountain, lake, human artifact, etc.), and g is a geographic footprint, such as a centroid, bounding box, or polygon representing the place in geographic space [9].

  2. Not an acronym.

  3. CoNLL is the Conference on Computational Natural Language Learning; the CoNLL dataset was released in its 2003 edition as part of a shared task on named entity tagging. MUC4 is the dataset released for the 4th Message Understanding Conference. The versions prefixed with “TR-” were additionally annotated with centroid footprint information as part of the first author’s Ph.D. thesis: in addition to markup showing where a location name begins and ends, the annotation records which places with that name exist on Earth, which latitude/longitude each corresponds to, and which one is most likely intended by the author of the story.

  4. See https://timeml.github.io/site/publications/specs.html for the specification (accessed 2024-01-08).

  5. See https://flask.palletsprojects.com/en/3.0.x/ (accessed 2024-01-10) and https://github.com/pallets/flask (accessed 2024-01-10).

  6. See https://github.com/pytries/marisa-trie (accessed 2024-01-10).
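The gazetteer triples (t, f, g) described in Note 1, together with the three footprint types named in the abstract, can be pictured with a minimal Python sketch. This is an illustration only, not the TAME II implementation: every class and function name, the toponyms, and all coordinates are hypothetical, and the bounding-box computation assumes that no footprint crosses the antimeridian.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union

# (latitude, longitude) in decimal degrees; hypothetical convention.
Point = Tuple[float, float]


@dataclass(frozen=True)
class Centroid:
    point: Point


@dataclass(frozen=True)
class BoundingBox:
    south_west: Point
    north_east: Point


@dataclass(frozen=True)
class Polygon:
    ring: Tuple[Point, ...]  # outer boundary vertices, in order


Footprint = Union[Centroid, BoundingBox, Polygon]


@dataclass(frozen=True)
class GazetteerEntry:
    """One (t, f, g) triple: toponym, feature type, geographic footprint."""
    toponym: str
    feature_type: str
    footprint: Footprint


def bounding_box(polygon: Polygon) -> BoundingBox:
    """Smallest axis-aligned rectangle containing the polygon's vertices."""
    lats = [lat for lat, _ in polygon.ring]
    lons = [lon for _, lon in polygon.ring]
    return BoundingBox((min(lats), min(lons)), (max(lats), max(lons)))


def box_centre(box: BoundingBox) -> Centroid:
    """Centre of the rectangle; a crude single-point stand-in for a region."""
    (south, west), (north, east) = box.south_west, box.north_east
    return Centroid(((south + north) / 2, (west + east) / 2))


def candidates(gazetteer: List[GazetteerEntry], toponym: str) -> List[GazetteerEntry]:
    """All entries sharing a name; a resolver or annotator picks one of them."""
    return [entry for entry in gazetteer if entry.toponym == toponym]


# Made-up entries with made-up coordinates, for illustration only.
GAZETTEER = [
    GazetteerEntry("Exampleville", "dwelling",
                   Polygon(((0.0, 0.0), (0.0, 2.0), (2.0, 2.0), (2.0, 0.0)))),
    GazetteerEntry("Exampleville", "lake", Centroid((10.0, 20.0))),
    GazetteerEntry("Otherton", "mountain", BoundingBox((5.0, 5.0), (6.0, 6.0))),
]
```

Calling `candidates(GAZETTEER, "Exampleville")` returns both entries with that name, which is the ambiguity that toponym resolution has to settle. Reducing the polygon to `box_centre(bounding_box(...))` discards the extent of the region, which is the limitation of centroid-only gazetteers that the abstract points out.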

References

  1. Cadorel, L., Overal, D., Tettamanzi, A.G.B.: Fuzzy representation of vague spatial descriptions in real estate advertisements. In: Proceedings of Workshop on Location-Based Recommendations, Geosocial Networks and Geoadvertising Held at the 6th ACM SIGSPATIAL International. LocalRec ’22, Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3557992.3565994

  2. Cardoso, A.B., Martins, B., Estima, J.: A novel deep learning approach using contextual embeddings for toponym resolution. ISPRS Int. J. Geo-Inf. 11(1), 28 (2022). https://doi.org/10.3390/ijgi11010028

  3. Cunningham, H., Humphreys, K., Gaizauskas, R., Wilks, Y.: GATE - a general architecture for text engineering. In: Fifth Conference on Applied Natural Language Processing: Descriptions of System Demonstrations and Videos, pp. 29–30. Association for Computational Linguistics, Washington, DC, USA (1997). https://doi.org/10.3115/974281.974299

  4. DeLozier, G., Baldridge, J., London, L.: Gazetteer-independent toponym resolution using geographic word profiles. In: Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, pp. 2382–2388. AAAI ’15, AAAI Press (2015)

  5. Fredkin, E.: Trie memory. Commun. ACM 3(9), 490–499 (1960). https://doi.org/10.1145/367390.367400

  6. Goldberg, D.W.: Geocoding. In: Castree, N., Goodchild, M.F., Kobayashi, A., Liu, W., Marston, R.A. (eds.) International Encyclopedia of Geography. People, the Earth, Environment and Technology, vol. 15, pp. 1–12. Wiley, New York, NY, USA, 1st edn. (2017). https://doi.org/10.1002/9781118786352.wbieg1051

  7. Grover, C., et al.: Use of the Edinburgh geoparser for georeferencing digitized historical collections. Philos. Trans. R. Soc. A: Math. Phys. Eng. Sci. 368(1925), 3875–3889 (2010)

  8. Haltermann, A.: Mordecai: Full text geoparsing and event geocoding. J. Open Source Softw. 2(9), 91 (2017). https://doi.org/10.21105/joss.000911

  9. Hill, L.L.: Georeferencing: The Geographic Associations of Information. MIT Press, Cambridge, MA, USA (2006)

  10. Kamalloo, E., Rafiei, D.: A coherent unsupervised model for toponym resolution. In: Proceedings of the 2018 World Wide Web Conference, pp. 1287–1296. WWW ’18, International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, CHE (2018). https://doi.org/10.1145/3178876.3186027

  11. Leidner, J.L.: An evaluation dataset for the toponym resolution task. Comput. Environ. Urban Syst. 30(4), 400–417 (2006). https://doi.org/10.1016/j.compenvurbsys.2005.07.003, geographic Information Retrieval (GIR)

  12. Leidner, J.L.: Toponym Resolution in Text: Annotation, Evaluation and Applications of Spatial Grounding. Ph.D. thesis, School of Informatics, University of Edinburgh, Edinburgh, Scotland, UK (2007)

  13. Leidner, J.L.: Toponym Resolution in Text: Annotation, Evaluation and Applications of Spatial Grounding. Universal Press, Boca Raton, FL, USA (2008)

  14. Leidner, J.L.: Georeferencing: From texts to maps. In: Castree, N., Goodchild, M.F., Kobayashi, A., Liu, W., Marston, R.A. (eds.) International Encyclopedia of Geography. People, the Earth, Environment and Technology, vol. 15, pp. 1–10. Wiley, 1st edn. (2017). https://doi.org/10.1002/9781118786352.wbieg0160

  15. Leidner, J.L.: A survey of textual data & geospatial technology. In: Werner, M., Chiang, Y.-Y. (eds.) Handbook of Big Geospatial Data, pp. 429–457. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-55462-0_16

  16. Leidner, J.L., Sinclair, G., Webber, B.: Grounding spatial named entities for information extraction and question answering. In: Proceedings of the Workshop on Analysis of Geographic References Held at HLT-NAACL 2003, pp. 31–38. ACL, Edmonton, Alberta, Canada (2003). https://aclanthology.org/W03-0105

  17. Overell, S., Rüger, S.: Using co-occurrence models for placename disambiguation. Int. J. Geogr. Inf. Sci. 22(3), 265–287 (2008). https://doi.org/10.1080/13658810701626236

  18. Pustejovsky, J., Lee, K., Bunt, H., Romary, L.: ISO-TimeML: an international standard for semantic annotation. In: Calzolari, N., (eds.) Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC’10). European Language Resources Association (ELRA), Valletta, Malta (2010)

  19. Speriosu, M., Baldridge, J.: Text-driven toponym resolution using indirect supervision. In: Schütze, H., Fung, P., Poesio, M. (eds.) Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, pp. 1466–1476. Association for Computational Linguistics, Sofia, Bulgaria (2013)

  20. Stenetorp, P., Pyysalo, S., Topić, G., Ohta, T., Ananiadou, S., Tsujii, J.: BRAT: a web-based tool for NLP-assisted text annotation. In: Segond, F. (ed.) Proceedings of the Demonstrations at the 13th Conference of the European Chapter of the Association for Computational Linguistics, pp. 102–107. Association for Computational Linguistics, Avignon, France (2012)

  21. Verhagen, M., et al.: Automating temporal annotation with TARSQI. In: Nagata, M., Pedersen, T. (eds.) Proceedings of the ACL, pp. 81–84. Association for Computational Linguistics, Ann Arbor, MI, USA (2005). https://doi.org/10.3115/1225753.1225774

Author information

Correspondence to Jochen L. Leidner.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

Cite this paper

Leidner, J.L., Jung, L. (2024). TAME II: A Modern Geographic Text Annotation Tool. In: Lotfian, M., Starace, L.L.L. (eds) Web and Wireless Geographical Information Systems. W2GIS 2024. Lecture Notes in Computer Science, vol 14673. Springer, Cham. https://doi.org/10.1007/978-3-031-60796-7_7

  • DOI: https://doi.org/10.1007/978-3-031-60796-7_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-60795-0

  • Online ISBN: 978-3-031-60796-7

  • eBook Packages: Computer Science (R0)
