ABSTRACT
We study the problem of deriving geolocations for Wikipedia pages. To this end, we introduce a general four-step process for location derivation and consider different instantiations of this process that leverage both textual and categorical data. Extensive experimentation shows that our methods provide good precision-recall trade-offs and improve on text-only methods. Hence, our system can be used to augment the geographic information in Wikipedia and to enable more effective geographic information retrieval.
Index Terms
- Deriving Geolocations in Wikipedia