Abstract
Toponym resolution can be defined as the process of mapping each toponym in a document or corpus unambiguously to an associated spatial footprint corresponding to the location intended by the writer of the document. Various toponym resolution methods have been proposed that use the latitude and longitude of a centroid as the spatial representation, but large cities, regions or countries are arguably poorly represented by a single point.
Two decades ago, Leidner presented TAME, the first geo-annotation tool described in the literature, but it (a) lacked extensibility (such as support for multiple users and multiple gazetteers), (b) is not publicly available, and (c) did not support polygon footprints. As a consequence, in all annotated text collections to date, place names have been linked exclusively to gazetteers that use centroid geographic footprint representations.
In this paper, we present TAME II, a more flexible system for creating labeled corpora for training and evaluating toponym resolvers, which supports multiple geographic footprint types (polygons, centroids and bounding rectangles). It has been implemented as a modern Web-based application that is publicly available on the Web. To the best of our knowledge, TAME II is the first text annotation tool that supports gazetteers with polygon footprints, and the first Web-based tool that is generally available.
The authors gratefully acknowledge the funding provided by the Free State of Bavaria under its “Hitech Agenda”. All views are the authors’ and do not necessarily reflect the views of any funding agencies or affiliated institutions. We would like to thank Tim Menzner for helping with the cloud deployment of our system, and our three anonymous reviewers for feedback that improved the presentation of our paper.
Notes
1. A gazetteer is a geographic database consisting of triples (t; f; g), where t is a toponym (place name), f is a feature type (such as dwelling, mountain, lake, human artifact etc.), and g is a geographic footprint, such as a centroid, bounding box or polygon representing the place in geographic space [9].
2. Not an acronym.
3. CoNLL is the Conference on Computational Natural Language Learning; in its 2003 edition, the CoNLL dataset was released as part of a shared task on named entity tagging. MUC4 is the dataset released for the 4th Message Understanding Conference. The versions prefixed with “TR-” were additionally annotated with centroid footprint information as part of the first author’s Ph.D. thesis: in addition to markup showing where a location name begins and ends, they record which places with that name exist on earth, which latitude/longitude each corresponds to, and which one is most likely intended by the author of the story.
4. See https://timeml.github.io/site/publications/specs.html for the specification (accessed 2024-01-08).
5. See https://flask.palletsprojects.com/en/3.0.x/ (accessed 2024-01-10) and https://github.com/pallets/flask (accessed 2024-01-10).
6. See https://github.com/pytries/marisa-trie (accessed 2024-01-10).
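The gazetteer definition in Note 1, a set of triples (t; f; g) whose footprint g may be a centroid, a bounding box or a polygon, can be illustrated with a small Python sketch. All class names, function names and coordinates below are hypothetical and chosen for illustration only; they are not taken from TAME II, and the toy footprints do not describe real places.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union

# A point as (latitude, longitude) in decimal degrees.
Point = Tuple[float, float]


@dataclass(frozen=True)
class Centroid:
    point: Point


@dataclass(frozen=True)
class BoundingBox:
    south_west: Point
    north_east: Point


@dataclass(frozen=True)
class Polygon:
    vertices: Tuple[Point, ...]


# The footprint g is one of the three representations named in Note 1.
Footprint = Union[Centroid, BoundingBox, Polygon]


@dataclass(frozen=True)
class GazetteerEntry:
    toponym: str          # t: the place name
    feature_type: str     # f: e.g. "dwelling", "mountain", "lake"
    footprint: Footprint  # g: the geographic footprint


def bounding_box(polygon: Polygon) -> BoundingBox:
    """Derive the axis-aligned bounding box of a polygon.

    Simplifying assumption: the polygon does not cross the antimeridian.
    """
    lats = [lat for lat, _ in polygon.vertices]
    lons = [lon for _, lon in polygon.vertices]
    return BoundingBox((min(lats), min(lons)), (max(lats), max(lons)))


def candidates(gazetteer: List[GazetteerEntry], toponym: str) -> List[GazetteerEntry]:
    """Return every entry sharing the given name (the ambiguity to be resolved)."""
    return [entry for entry in gazetteer if entry.toponym == toponym]


# Toy gazetteer with invented coordinates: two places share one name.
toy_polygon = Polygon(((0.0, 0.0), (0.0, 4.0), (2.0, 4.0), (2.0, 0.0)))
gazetteer = [
    GazetteerEntry("Exampleton", "dwelling", Centroid((10.0, 20.0))),
    GazetteerEntry("Exampleton", "dwelling", toy_polygon),
    GazetteerEntry("Lake Sample", "lake", BoundingBox((5.0, 5.0), (6.0, 7.0))),
]
```

Here `candidates(gazetteer, "Exampleton")` returns two entries with different footprint types, which is the situation an annotator resolves when choosing the intended referent. The `bounding_box` helper shows that a polygon can be reduced to a coarser footprint, whereas a centroid alone cannot be expanded back into an extent.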
References
Cadorel, L., Overal, D., Tettamanzi, A.G.B.: Fuzzy representation of vague spatial descriptions in real estate advertisements. In: Proceedings of Workshop on Location-Based Recommendations, Geosocial Networks and Geoadvertising Held at the 6th ACM SIGSPATIAL International. LocalRec ’22, Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3557992.3565994
Cardoso, A.B., Martins, B., Estima, J.: A novel deep learning approach using contextual embeddings for toponym resolution. ISPRS Int. J. Geo-Inf. 11(1), 28 (2022). https://doi.org/10.3390/ijgi11010028
Cunningham, H., Humphreys, K., Gaizauskas, R., Wilks, Y.: GATE - a general architecture for text engineering. In: Fifth Conference on Applied Natural Language Processing: Descriptions of System Demonstrations and Videos, pp. 29–30. Association for Computational Linguistics, Washington, DC, USA (1997). https://doi.org/10.3115/974281.974299
DeLozier, G., Baldridge, J., London, L.: Gazetteer-independent toponym resolution using geographic word profiles. In: Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, pp. 2382–2388. AAAI ’15, AAAI Press (2015)
Fredkin, E.: Trie memory. Commun. ACM 3(9), 490–499 (1960). https://doi.org/10.1145/367390.367400
Goldberg, D.W.: Geocoding. In: Castree, N., Goodchild, M.F., Kobayashi, A., Liu, W., Marston, R.A. (eds.) International Encyclopedia of Geography. People, the Earth, Environment and Technology, vol. 15, pp. 1–12. Wiley, New York, NY, USA, 1st edn. (2017). https://doi.org/10.1002/9781118786352.wbieg1051
Grover, C., et al.: Use of the Edinburgh geoparser for georeferencing digitized historical collections. Philos. Trans. R. Soc. A: Math. Phys. Eng. Sci. 368(1925), 3875–3889 (2010)
Halterman, A.: Mordecai: Full text geoparsing and event geocoding. J. Open Source Softw. 2(9), 91 (2017). https://doi.org/10.21105/joss.00091
Hill, L.L.: Georeferencing: The Geographic Associations of Information. MIT Press, Cambridge, MA, USA (2006)
Kamalloo, E., Rafiei, D.: A coherent unsupervised model for toponym resolution. In: Proceedings of the 2018 World Wide Web Conference, pp. 1287–1296. WWW ’18, International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, CHE (2018). https://doi.org/10.1145/3178876.3186027
Leidner, J.L.: An evaluation dataset for the toponym resolution task. Comput. Environ. Urban Syst. 30(4), 400–417 (2006). Geographic Information Retrieval (GIR). https://doi.org/10.1016/j.compenvurbsys.2005.07.003
Leidner, J.L.: Toponym Resolution in Text: Annotation, Evaluation and Applications of Spatial Grounding. Ph.D. thesis, School of Informatics, University of Edinburgh, Edinburgh, Scotland, UK (2007)
Leidner, J.L.: Toponym Resolution in Text: Annotation, Evaluation and Applications of Spatial Grounding. Universal Press, Boca Raton, FL, USA (2008)
Leidner, J.L.: Georeferencing: From texts to maps. In: Castree, N., Goodchild, M.F., Kobayashi, A., Liu, W., Marston, R.A. (eds.) International Encyclopedia of Geography. People, the Earth, Environment and Technology, vol. 15, pp. 1–10. Wiley, 1st edn. (2017). https://doi.org/10.1002/9781118786352.wbieg0160
Leidner, J.L.: A survey of textual data & geospatial technology. In: Werner, M., Chiang, Y.-Y. (eds.) Handbook of Big Geospatial Data, pp. 429–457. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-55462-0_16
Leidner, J.L., Sinclair, G., Webber, B.: Grounding spatial named entities for information extraction and question answering. In: Proceedings of the Workshop on Analysis of Geographic References Held at HLT-NAACL 2003, pp. 31–38. ACL, Edmonton, Alberta, Canada (2003). https://aclanthology.org/W03-0105
Overell, S., Rüger, S.: Using co-occurrence models for placename disambiguation. Int. J. Geogr. Inf. Sci. 22(3), 265–287 (2008). https://doi.org/10.1080/13658810701626236
Pustejovsky, J., Lee, K., Bunt, H., Romary, L.: ISO-TimeML: an international standard for semantic annotation. In: Calzolari, N., et al. (eds.) Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC’10). European Language Resources Association (ELRA), Valletta, Malta (2010)
Speriosu, M., Baldridge, J.: Text-driven toponym resolution using indirect supervision. In: Schütze, H., Fung, P., Poesio, M. (eds.) Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, pp. 1466–1476. Association for Computational Linguistics, Sofia, Bulgaria (2013)
Stenetorp, P., Pyysalo, S., Topić, G., Ohta, T., Ananiadou, S., Tsujii, J.: BRAT: a web-based tool for NLP-assisted text annotation. In: Segond, F. (ed.) Proceedings of the Demonstrations at the 13th Conference of the European Chapter of the Association for Computational Linguistics, pp. 102–107. Association for Computational Linguistics, Avignon, France (2012)
Verhagen, M., et al.: Automating temporal annotation with TARSQI. In: Nagata, M., Pedersen, T. (eds.) Proceedings of the ACL, pp. 81–84. Association for Computational Linguistics, Ann Arbor, MI, USA (2005). https://doi.org/10.3115/1225753.1225774
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Leidner, J.L., Jung, L. (2024). TAME II: A Modern Geographic Text Annotation Tool. In: Lotfian, M., Starace, L.L.L. (eds) Web and Wireless Geographical Information Systems. W2GIS 2024. Lecture Notes in Computer Science, vol 14673. Springer, Cham. https://doi.org/10.1007/978-3-031-60796-7_7
DOI: https://doi.org/10.1007/978-3-031-60796-7_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-60795-0
Online ISBN: 978-3-031-60796-7
eBook Packages: Computer Science, Computer Science (R0)