TAME II: A Modern Geographic Text Annotation Tool

Leidner, Jochen L.; Jung, Luca

doi:10.1007/978-3-031-60796-7_7

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14673)

Included in the following conference series:

International Symposium on Web and Wireless Geographical Information Systems


Abstract

Toponym resolution can be defined as the process of mapping each toponym in a document or corpus unambiguously to an associated spatial footprint corresponding to the location intended by the writer of the document. Various toponym resolution methods have been proposed that use the latitude and longitude of a centroid as the spatial representation, but large cities, regions or countries are arguably poorly represented by a single point.

Two decades ago, Leidner presented TAME, the first geo-annotation tool described in the literature, but it (a) lacked extensibility (such as support for multiple users and multiple gazetteers), (b) is not publicly available, and (c) did not support polygon footprints. As a consequence, in all annotated text collections to date, place names have been annotated exclusively using gazetteers with centroid footprint representations.

In this paper, we present TAME II, a more flexible system for creating labeled corpora for the training and evaluation of toponym resolvers, which supports multiple geographic footprint types (polygons, centroids and bounding rectangles). It has been implemented as a modern Web-based application that is publicly available on the Web. To the best of our knowledge, TAME II is the first text annotation tool that supports gazetteers with polygon footprints, and the first Web-based geo-annotation tool that is generally available.
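To make the three footprint types concrete, the following minimal Python sketch derives a bounding rectangle and a single representative point from a polygon. It is an illustration only and not code from TAME II: the names `bounding_rectangle`, `vertex_mean` and `region` are hypothetical, the polygon is a made-up rectangle rather than a real place, and the vertex mean is a crude stand-in for a true area centroid. Polygons that cross the antimeridian are not handled.

```python
from typing import List, Tuple

# A point is (latitude, longitude) in decimal degrees.
Point = Tuple[float, float]


def bounding_rectangle(polygon: List[Point]) -> Tuple[Point, Point]:
    """Return the (south-west, north-east) corners enclosing the polygon."""
    lats = [lat for lat, _ in polygon]
    lons = [lon for _, lon in polygon]
    return (min(lats), min(lons)), (max(lats), max(lons))


def vertex_mean(polygon: List[Point]) -> Point:
    """Average the vertices; a simple approximation of a centroid."""
    n = len(polygon)
    mean_lat = sum(lat for lat, _ in polygon) / n
    mean_lon = sum(lon for _, lon in polygon) / n
    return (mean_lat, mean_lon)


# Hypothetical rectangular region, used only for illustration.
region: List[Point] = [(0.0, 0.0), (0.0, 4.0), (2.0, 4.0), (2.0, 0.0)]

south_west, north_east = bounding_rectangle(region)
centre = vertex_mean(region)
```

Going from polygon to rectangle to point discards shape information at each step, which is why a polygon footprint cannot be recovered from a centroid-only gazetteer.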

The authors gratefully acknowledge the funding provided by the Free State of Bavaria under its “Hitech Agenda”. All views are the authors’ and do not necessarily reflect the views of any funding agencies or affiliated institutions. We would like to thank Tim Menzner for helping with the cloud deployment of our system, and our three anonymous reviewers for feedback that improved the presentation of our paper.


Notes

1.
A gazetteer is a geographic database consisting of triples (t; f; g), where t is a toponym (place name), f is a feature type (such as dwelling, mountain, lake, human artifact etc.), and g is a geographic footprint, such as a centroid, bounding box or polygon representing the place in geographic space [9].
2.
not an acronym.
3.
CoNLL is the Conference for Natural Language Learning; in its 2003 edition, the CoNLL dataset was released as part of a shared task for named entity tagging. MUC4 is the dataset released for the 4th Message Understanding Conference. The versions prefixed with “TR-” were additionally annotated with centroid footprint information as part of the first author’s Ph.D. thesis: in addition to markup showing where a location name begins and ends, the annotation records which places with that name exist on Earth, which latitude/longitude they correspond to, and which one is most likely intended by the author of the story.
4.
see https://timeml.github.io/site/publications/specs.html for the specification (accessed 2024-01-08).
5.
see https://flask.palletsprojects.com/en/3.0.x/ (accessed 2024-01-10) and https://github.com/pallets/flask (accessed 2024-01-10).
6.
see https://github.com/pytries/marisa-trie (accessed 2024-01-10).
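
The gazetteer triples (t; f; g) defined in note 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the data model of TAME II: `GazetteerEntry`, `build_index` and `candidates` are hypothetical names, the footprint is restricted to a centroid for brevity, and the coordinates are rounded approximations included only to show that one toponym can map to several places.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Tuple


class GazetteerEntry(NamedTuple):
    toponym: str                    # t: the place name
    feature_type: str               # f: e.g. "dwelling", "mountain", "lake"
    footprint: Tuple[float, float]  # g: here a centroid (lat, lon) only


def build_index(entries: List[GazetteerEntry]) -> Dict[str, List[GazetteerEntry]]:
    """Group entries by toponym so that ambiguous names keep all referents."""
    index: Dict[str, List[GazetteerEntry]] = defaultdict(list)
    for entry in entries:
        index[entry.toponym].append(entry)
    return index


def candidates(index: Dict[str, List[GazetteerEntry]], name: str) -> List[GazetteerEntry]:
    """Return every gazetteer entry sharing the given name (may be empty)."""
    return index.get(name, [])


# Approximate, illustrative coordinates (assumption, not taken from the paper).
entries = [
    GazetteerEntry("Paris", "dwelling", (48.86, 2.35)),    # Paris, France
    GazetteerEntry("Paris", "dwelling", (33.66, -95.56)),  # Paris, Texas, USA
    GazetteerEntry("Coburg", "dwelling", (50.26, 10.96)),  # Coburg, Germany
]

index = build_index(entries)
paris_candidates = candidates(index, "Paris")
```

A toponym resolver, or a human annotator using a tool of this kind, then has to choose among the returned candidates the one intended by the writer.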

References

Cadorel, L., Overal, D., Tettamanzi, A.G.B.: Fuzzy representation of vague spatial descriptions in real estate advertisements. In: Proceedings of Workshop on Location-Based Recommendations, Geosocial Networks and Geoadvertising Held at the 6th ACM SIGSPATIAL International. LocalRec ’22, Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3557992.3565994
Cardoso, A.B., Martins, B., Estima, J.: A novel deep learning approach using contextual embeddings for toponym resolution. ISPRS Int. J. Geo-Inf. 11(1), 28 (2022). https://doi.org/10.3390/ijgi11010028
Cunningham, H., Humphreys, K., Gaizauskas, R., Wilks, Y.: GATE - a general architecture for text engineering. In: Fifth Conference on Applied Natural Language Processing: Descriptions of System Demonstrations and Videos, pp. 29–30. Association for Computational Linguistics, Washington, DC, USA (1997). https://doi.org/10.3115/974281.974299
DeLozier, G., Baldridge, J., London, L.: Gazetteer-independent toponym resolution using geographic word profiles. In: Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, pp. 2382–2388. AAAI ’15, AAAI Press (2015)
Fredkin, E.: Trie memory. Commun. ACM 3(9), 490–499 (1960). https://doi.org/10.1145/367390.367400
Goldberg, D.W.: Geocoding. In: Castree, N., Goodchild, M.F., Kobayashi, A., Liu, W., Marston, R.A. (eds.) International Encyclopedia of Geography. People, the Earth, Environment and Technology, vol. 15, pp. 1–12. Wiley, New York, NY, USA, 1st edn. (2017). https://doi.org/10.1002/9781118786352.wbieg1051
Grover, C., et al.: Use of the Edinburgh geoparser for georeferencing digitized historical collections. Philos. Trans. R. Soc. A: Math. Phys. Eng. Sci. 368(1925), 3875–3889 (2010)
Haltermann, A.: Mordecai: Full text geoparsing and event geocoding. J. Open Source Softw. 2(9), 91 (2017). https://doi.org/10.21105/joss.00091
Hill, L.L.: Georeferencing: The Geographic Associations of Information. MIT Press, Cambridge, MA, USA (2006)
Kamalloo, E., Rafiei, D.: A coherent unsupervised model for toponym resolution. In: Proceedings of the 2018 World Wide Web Conference, pp. 1287–1296. WWW ’18, International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, CHE (2018). https://doi.org/10.1145/3178876.3186027
Leidner, J.L.: An evaluation dataset for the toponym resolution task. Comput. Environ. Urban Syst. 30(4), 400–417 (2006). https://doi.org/10.1016/j.compenvurbsys.2005.07.003. Geographic Information Retrieval (GIR)
Leidner, J.L.: Toponym Resolution in Text: Annotation, Evaluation and Applications of Spatial Grounding. Ph.D. thesis, School of Informatics, University of Edinburgh, Edinburgh, Scotland, UK (2007)
Leidner, J.L.: Toponym Resolution in Text: Annotation, Evaluation and Applications of Spatial Grounding. Universal Press, Boca Raton, FL, USA (2008)
Leidner, J.L.: Georeferencing: From texts to maps. In: Castree, N., Goodchild, M.F., Kobayashi, A., Liu, W., Marston, R.A. (eds.) International Encyclopedia of Geography. People, the Earth, Environment and Technology, vol. 15, pp. 1–10. Wiley, 1st edn. (2017). https://doi.org/10.1002/9781118786352.wbieg0160
Leidner, J.L.: A survey of textual data & geospatial technology. In: Werner, M., Chiang, Y.-Y. (eds.) Handbook of Big Geospatial Data, pp. 429–457. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-55462-0_16
Leidner, J.L., Sinclair, G., Webber, B.: Grounding spatial named entities for information extraction and question answering. In: Proceedings of the Workshop on Analysis of Geographic References Held at HLT-NAACL 2003, pp. 31–38. ACL, Edmonton, Alberta, Canada (2003). https://aclanthology.org/W03-0105
Overell, S., Rüger, S.: Using co-occurrence models for placename disambiguation. Int. J. Geogr. Inf. Sci. 22(3), 265–287 (2008). https://doi.org/10.1080/13658810701626236
Pustejovsky, J., Lee, K., Bunt, H., Romary, L.: ISO-TimeML: an international standard for semantic annotation. In: Calzolari, N., et al. (eds.) Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC’10). European Language Resources Association (ELRA), Valletta, Malta (2010)
Speriosu, M., Baldridge, J.: Text-driven toponym resolution using indirect supervision. In: Schütze, H., Fung, P., Poesio, M. (eds.) Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, pp. 1466–1476. Association for Computational Linguistics, Sofia, Bulgaria (2013)
Stenetorp, P., Pyysalo, S., Topić, G., Ohta, T., Ananiadou, S., Tsujii, J.: BRAT: a web-based tool for NLP-assisted text annotation. In: Segond, F. (ed.) Proceedings of the Demonstrations at the 13th Conference of the European Chapter of the Association for Computational Linguistics, pp. 102–107. Association for Computational Linguistics, Avignon, France (2012)
Verhagen, M., et al.: Automating temporal annotation with TARSQI. In: Nagata, M., Pedersen, T. (eds.) Proceedings of the ACL, pp. 81–84. Association for Computational Linguistics, Ann Arbor, MI, USA (2005). https://doi.org/10.3115/1225753.1225774

Author information

Authors and Affiliations

Information Access Research Group, Center for Research in Responsible Artificial Intelligence (CRAI), Coburg University of Applied Sciences and Arts, Friedrich-Streib-Straße 2, 96459, Coburg, Bavaria, Germany
Jochen L. Leidner & Luca Jung
Department of Computer Science, University of Sheffield, Regents Court, 211, Portobello, Sheffield S1 4DP, UK
Jochen L. Leidner

Authors

Jochen L. Leidner
Luca Jung

Corresponding author

Correspondence to Jochen L. Leidner.

Editor information

Editors and Affiliations

University of Applied Sciences and Arts Western Switzerland, Yverdon-les-Bains, Switzerland
Maryam Lotfian
University of Naples Federico II, Naples, Italy
Luigi Libero Lucio Starace

About this paper

Cite this paper

Leidner, J.L., Jung, L. (2024). TAME II: A Modern Geographic Text Annotation Tool. In: Lotfian, M., Starace, L.L.L. (eds) Web and Wireless Geographical Information Systems. W2GIS 2024. Lecture Notes in Computer Science, vol 14673. Springer, Cham. https://doi.org/10.1007/978-3-031-60796-7_7

DOI: https://doi.org/10.1007/978-3-031-60796-7_7
Published: 09 May 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-60795-0
Online ISBN: 978-3-031-60796-7
eBook Packages: Computer Science, Computer Science (R0)
