DOI:10.1145/2093973.2094033
poster

Unveiling locations in geo-spatial documents

Published: 01 November 2011

Abstract

Resolving the geo-identities of addresses in emerging economies, where users rely primarily on short messaging to submit queries, poses several daunting challenges: the lack of proper addressing schemes, the unavailability of cartographic information, and non-standardized nomenclature for geo-spatial entities such as streets and avenues, to name a few.
In this work, we propose a simple and elegant approach to this problem for emerging economies. By treating address texts as short documents and exploiting the latent proximity information they contain (for example, landmark-like references and similarity between address texts), we transform the problem of resolving geo-identity into a search problem on short annotated geo-spatial documents, collected through an extensive survey of six cities in India. Our solution spans all phases of building a geo-identity resolution system, although our emphasis is on collecting and organizing the corpus to support a search engine backend for the task. Through experiments on a representative test set collected from the real world, we demonstrate that this approach achieves over 94% accuracy in resolution and an order-of-magnitude reduction in system state (memory) with nearly zero false negatives, a significant improvement over the state of the art in emerging markets.
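
The idea of treating address texts as short annotated documents and resolving a query by search can be illustrated with a small sketch. This is not the authors' system: the corpus entries, coordinates, the IDF-weighted token-overlap score, and the function names `tokenize`, `build_idf`, and `resolve` are all assumptions made up for illustration, and a real deployment would more likely sit on a full search engine backend.

```python
import math
import re
from collections import Counter

# Hypothetical corpus: each surveyed address is a short document
# annotated with (latitude, longitude). All entries are invented.
CORPUS = [
    ("12 MG Road, near Trinity Church, Bangalore", (12.9740, 77.6200)),
    ("Flat 4, opp. City Mall, Linking Road, Mumbai", (19.0650, 72.8340)),
    ("Behind Trinity Church, Ulsoor, Bangalore", (12.9770, 77.6230)),
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_idf(corpus):
    """Compute a smoothed inverse document frequency for every token."""
    n = len(corpus)
    df = Counter()
    for text, _ in corpus:
        df.update(set(tokenize(text)))
    return {t: math.log((n + 1) / (c + 1)) + 1.0 for t, c in df.items()}


def resolve(query, corpus, idf):
    """Return the best-matching (address, coords) pair and its score.

    The score is the summed IDF of tokens shared by the query and the
    address document, so rare tokens such as landmark names count more
    than common ones such as "road". Returns (None, 0.0) on no overlap.
    """
    query_tokens = set(tokenize(query))
    best, best_score = None, 0.0
    for text, coords in corpus:
        shared = query_tokens & set(tokenize(text))
        score = sum(idf[t] for t in shared)
        if score > best_score:
            best, best_score = (text, coords), score
    return best, best_score


if __name__ == "__main__":
    idf = build_idf(CORPUS)
    print(resolve("trinity church ulsoor", CORPUS, idf))
```

Under these assumptions, a short free-text query that mentions a landmark resolves to the coordinates of the surveyed address sharing the most distinctive tokens, without any street-level map data.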

Published In

GIS '11: Proceedings of the 19th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems
November 2011
559 pages
ISBN:9781450310314
DOI:10.1145/2093973

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Poster

Conference

GIS '11

Acceptance Rates

Overall Acceptance Rate 257 of 1,238 submissions, 21%
