DOI: 10.1145/1869790.1869801
research-article

An efficient location extraction algorithm by leveraging web contextual information

Published: 02 November 2010

Abstract

A typical location extraction approach consists of two steps: location name detection and location entity disambiguation. Promising results have been obtained over the last decade using natural language processing technologies. However, two challenges still require further investigation: 1) how to leverage prior and contextual evidence to improve location extraction performance, and 2) how to utilize the interdependence between the named entity recognition step and the disambiguation step. In this paper, we propose an iterative detection-ranking framework to address these problems, along with a set of novel features that mine contextual information from web resources. Experimental results show that our solution outperforms state-of-the-art approaches, including MetaCarta GeoTagger and Yahoo Placemaker.
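
The abstract does not spell out the framework, so the following is only a minimal sketch of what an iterative detect-then-rank loop could look like, not the authors' algorithm. The gazetteer, the `LOCATION_PRIOR` table, the region-agreement context score, and all weights and thresholds are hypothetical and chosen purely for illustration. Detection uses the previous round's ranking as contextual evidence, and ranking combines a prior with agreement among the other resolved mentions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    region: str
    prior: float  # hypothetical prior, e.g. derived from population


# Hypothetical toy gazetteer: surface form -> candidate entities.
GAZETTEER = {
    "paris": [Candidate("Paris", "France", 0.9), Candidate("Paris", "Texas", 0.1)],
    "austin": [Candidate("Austin", "Texas", 0.8), Candidate("Austin", "Minnesota", 0.2)],
    "dallas": [Candidate("Dallas", "Texas", 0.9), Candidate("Dallas", "Oregon", 0.1)],
    "london": [Candidate("London", "England", 0.9), Candidate("London", "Ontario", 0.1)],
    "reading": [Candidate("Reading", "England", 0.7), Candidate("Reading", "Pennsylvania", 0.3)],
}

# Hypothetical probability that a token is a location name at all.
LOCATION_PRIOR = {"paris": 0.9, "austin": 0.7, "dallas": 0.9, "london": 0.9, "reading": 0.2}


def context_support(candidate, mention, resolved):
    """Count other resolved mentions that share the candidate's region."""
    return sum(1 for m, r in resolved.items()
               if m != mention and r.region == candidate.region)


def detect(tokens, resolved, threshold=0.5, bonus=0.4):
    """Detection step: keep a token if its name prior plus contextual support is high enough."""
    kept = []
    for token in dict.fromkeys(tokens):  # unique tokens, original order
        if token not in GAZETTEER:
            continue
        support = max(context_support(c, token, resolved) for c in GAZETTEER[token])
        if LOCATION_PRIOR[token] + bonus * support >= threshold:
            kept.append(token)
    return kept


def rank(mention, resolved, weight=0.5):
    """Ranking step: pick the candidate with the best prior-plus-context score."""
    return max(GAZETTEER[mention],
               key=lambda c: c.prior + weight * context_support(c, mention, resolved))


def extract(text, max_iters=5):
    """Alternate detection and ranking until the assignment stops changing."""
    tokens = [t.strip(".,;:!?").lower() for t in text.split()]
    resolved = {}
    for _ in range(max_iters):
        mentions = detect(tokens, resolved)
        context = {m: c for m, c in resolved.items() if m in mentions}
        updated = {m: rank(m, context) for m in mentions}
        if updated == resolved:
            break
        resolved = updated
    return resolved


if __name__ == "__main__":
    for sentence in ("Flights from Paris to Austin and Dallas",
                     "I was reading about Paris",
                     "From London to Reading"):
        print(sentence, "->", {m: c.region for m, c in extract(sentence).items()})
```

In this toy setup, "reading" is rejected as a location when it appears alone but accepted once "London" supplies supporting context, which illustrates the kind of interdependence between detection and disambiguation that the abstract refers to.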

Published In

GIS '10: Proceedings of the 18th SIGSPATIAL International Conference on Advances in Geographic Information Systems
November 2010
566 pages
ISBN:9781450304283
DOI:10.1145/1869790
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article

Conference

GIS '10

Acceptance Rates

Overall Acceptance Rate 257 of 1,238 submissions, 21%

Article Metrics

  • Downloads (Last 12 months): 4
  • Downloads (Last 6 weeks): 0
Reflects downloads up to 28 Feb 2025

Cited By

  • (2023) Toponym Identification in Epidemiology Articles – A Deep Learning Approach. Computational Linguistics and Intelligent Text Processing, pp. 26-37. DOI: 10.1007/978-3-031-24340-0_3. Online publication date: 26-Feb-2023
  • (2019) Address Entities Extraction using Named Entity Recognition. 2019 7th International Conference on Future Internet of Things and Cloud Workshops (FiCloudW), pp. 13-17. DOI: 10.1109/FiCloudW.2019.00016. Online publication date: Aug-2019
  • (2017) Large-Scale Location Prediction for Web Pages. IEEE Transactions on Knowledge and Data Engineering, 29:9, pp. 1902-1915. DOI: 10.1109/TKDE.2017.2702631. Online publication date: 1-Sep-2017
  • (2017) Location detection and disambiguation from twitter messages. Journal of Intelligent Information Systems, 49:2, pp. 237-253. DOI: 10.1007/s10844-017-0458-3. Online publication date: 1-Oct-2017
  • (2016) Automated Geocoding of Textual Documents: A Survey of Current Approaches. Transactions in GIS, 21:1, pp. 3-38. DOI: 10.1111/tgis.12212. Online publication date: 17-Jun-2016
  • (2016) Get into the spirit of a location by mining user-generated travelogues. Neurocomputing, 204:C, pp. 61-69. DOI: 10.1016/j.neucom.2015.04.129. Online publication date: 5-Sep-2016
  • (2015) You Are Where You Go. Proceedings of the Eighth ACM International Conference on Web Search and Data Mining, pp. 295-304. DOI: 10.1145/2684822.2685287. Online publication date: 2-Feb-2015
  • (2015) Detecting and Disambiguating Locations Mentioned in Twitter Messages. Computational Linguistics and Intelligent Text Processing, pp. 321-332. DOI: 10.1007/978-3-319-18117-2_24. Online publication date: 2015
  • (2014) Using minimaps to enable toponym resolution with an effective 100% rate of recall. Proceedings of the 8th Workshop on Geographic Information Retrieval, pp. 1-8. DOI: 10.1145/2675354.2675698. Online publication date: 4-Nov-2014
  • (2014) Geocoding for texts with fine-grain toponyms. Proceedings of the 22nd ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, pp. 183-192. DOI: 10.1145/2666310.2666386. Online publication date: 4-Nov-2014