Abstract
There is significant commercial and research interest in location-based web search engines. Given a number of search keywords and one or more locations that a user is interested in, a location-based web search retrieves and ranks the most textually and spatially relevant web pages. In this type of search, both the spatial and the textual information must be indexed. Currently, no efficient index structure exists that can handle both the spatial and textual aspects of data simultaneously and accurately. Existing approaches either index space and text separately or use hybrid index structures that perform poorly. Moreover, most of these approaches cannot accurately rank web pages based on a combination of space and text, and they are not easy to integrate into existing search engines. In this paper, we propose a new index structure called Spatial-Keyword Inverted File to handle location-based web searches in an integrated and efficient manner. To seamlessly find and rank relevant documents, we develop a new distance measure called spatial tf-idf. We propose four variants of spatial-keyword relevance scores and two algorithms to perform top-k searches. As verified by experiments, our proposed techniques outperform existing index structures in terms of search performance and accuracy.
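The general idea of ranking by a combination of text and space can be illustrated with a toy sketch. The abstract does not give the definition of spatial tf-idf, the layout of the Spatial-Keyword Inverted File, or the four score variants, so everything below is an assumption made for illustration: space is discretised into grid cells, each cell is treated like a term in an ordinary inverted file, and the two tf-idf-style scores are mixed with a hypothetical weight `alpha`. The names `build_inverted_file`, `tf_idf_scores`, and `top_k` are likewise invented here.

```python
import math
from collections import defaultdict


def build_inverted_file(docs):
    """Map each key (a keyword or a grid cell) to {doc_id: frequency}.

    `docs` maps a document id to the list of keys it contains.
    """
    index = defaultdict(dict)
    for doc_id, keys in docs.items():
        for key in keys:
            index[key][doc_id] = index[key].get(doc_id, 0) + 1
    return index


def tf_idf_scores(index, n_docs, query_keys):
    """Sum a simple tf * idf weight over the query keys for each document."""
    scores = defaultdict(float)
    for key in query_keys:
        postings = index.get(key, {})
        if not postings:
            continue
        # Smoothed idf; an assumed formula, not taken from the paper.
        idf = math.log(1 + n_docs / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    return scores


def top_k(text_index, cell_index, n_docs, keywords, cells, k, alpha=0.5):
    """Rank documents by a weighted sum of textual and spatial scores.

    `alpha` is a hypothetical mixing weight: 1.0 is text only, 0.0 is space only.
    """
    text = tf_idf_scores(text_index, n_docs, keywords)
    space = tf_idf_scores(cell_index, n_docs, cells)
    combined = {
        doc_id: alpha * text.get(doc_id, 0.0) + (1 - alpha) * space.get(doc_id, 0.0)
        for doc_id in set(text) | set(space)
    }
    # Highest score first; ties are broken by document id for determinism.
    return sorted(combined, key=lambda d: (-combined[d], d))[:k]


# Toy data: keywords per document and the grid cells each document covers.
doc_keywords = {"a": ["pizza", "pizza"], "b": ["pizza"], "c": ["sushi"]}
doc_cells = {"a": [(0, 0)], "b": [(5, 5)], "c": [(0, 0)]}

text_index = build_inverted_file(doc_keywords)
cell_index = build_inverted_file(doc_cells)
result = top_k(text_index, cell_index, 3, ["pizza"], [(0, 0)], k=2)
```

In this toy data, document `a` matches both the keyword and the query cell, so it ranks first. Treating cells as terms lets a single inverted-file structure serve both dimensions, which is one plausible reading of "integrated"; the actual structure and scoring are specified in the paper itself.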
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Khodaei, A., Shahabi, C., Li, C. (2010). Hybrid Indexing and Seamless Ranking of Spatial and Textual Features of Web Documents. In: Bringas, P.G., Hameurlain, A., Quirchmayr, G. (eds) Database and Expert Systems Applications. DEXA 2010. Lecture Notes in Computer Science, vol 6261. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15364-8_37
DOI: https://doi.org/10.1007/978-3-642-15364-8_37
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15363-1
Online ISBN: 978-3-642-15364-8
eBook Packages: Computer Science, Computer Science (R0)