ABSTRACT
Geoparsing is an important task in geographic information retrieval. A geoparsing system, known as a geoparser, takes text as input and outputs the recognized place mentions and their location coordinates. In June 2019, a geoparsing competition, Toponym Resolution in Scientific Papers, was held as one of the SemEval 2019 tasks. The winning teams developed neural network based geoparsers that achieved outstanding performance (over 90% precision, recall, and F1 score for toponym recognition). This exciting result raises the question "are we there yet?": has performance become high enough to consider the problem of geoparsing solved? One limitation of the competition is that the geoparsers were tested on only one dataset, which consists of 45 research articles from the particular domain of biomedicine. It is known that the same geoparser can perform very differently on different datasets. Thus, this work performs a systematic evaluation of these state-of-the-art geoparsers using our recently developed benchmarking platform EUPEG, which has eight annotated datasets, nine baseline geoparsers, and eight performance metrics. The evaluation results suggest that these new geoparsers indeed improve geoparsing performance on multiple datasets, although some challenges remain.
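As a rough illustration of the precision, recall, and F1 scores quoted for toponym recognition, the sketch below scores predicted place mentions against gold annotations. It assumes exact matching of character-offset spans. The abstract does not say how the competition or EUPEG matches spans, so the matching rule, the function name `precision_recall_f1`, and the sample offsets are all hypothetical.

```python
from typing import Set, Tuple

# Assumption: a place mention is identified by its (start, end) character offsets.
Span = Tuple[int, int]


def precision_recall_f1(gold: Set[Span], predicted: Set[Span]) -> Tuple[float, float, float]:
    """Score predicted toponym spans against gold spans using exact matching."""
    true_positives = len(gold & predicted)
    # Guard against empty sets so the function never divides by zero.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


# Hypothetical annotations: four gold mentions, four predictions, three in common.
gold = {(0, 6), (20, 27), (40, 46), (60, 70)}
predicted = {(0, 6), (20, 27), (40, 46), (80, 85)}

p, r, f = precision_recall_f1(gold, predicted)
print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

Exact matching is the strictest choice: a prediction that overlaps a gold mention but differs by one character counts as both a false positive and a false negative. A more lenient overlap-based rule would yield higher scores on the same output.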
Index Terms
- Are we there yet?: evaluating state-of-the-art neural network based geoparsers using EUPEG as a benchmarking platform