Hybrid Indexing and Seamless Ranking of Spatial and Textual Features of Web Documents

  • Conference paper
Database and Expert Systems Applications (DEXA 2010)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6261)

Abstract

There is significant commercial and research interest in location-based web search engines. Given a number of search keywords and one or more locations of interest to a user, a location-based web search retrieves and ranks the most textually and spatially relevant web pages. This type of search requires indexing both spatial and textual information. Currently, no efficient index structure handles the spatial and textual aspects of data simultaneously and accurately. Existing approaches either index space and text separately or use hybrid index structures that perform poorly. Moreover, most of these approaches cannot accurately rank web pages by a combination of space and text, and they are not easy to integrate into existing search engines. In this paper, we propose a new index structure called Spatial-Keyword Inverted File to handle location-based web searches in an integrated and efficient manner. To seamlessly find and rank relevant documents, we develop a new distance measure called spatial tf-idf. We propose four variants of spatial-keyword relevance scores and two algorithms to perform top-k searches. As verified by experiments, our proposed techniques outperform existing index structures in terms of search performance and accuracy.
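
The abstract does not define the index layout or the scoring formulas, so the following is only a minimal sketch of the general idea of handling space and text in one inverted-file style structure. It assumes, purely for illustration, that each document's location is discretized into grid cell identifiers that are indexed like keywords, and that a combined score is a weighted sum of a textual tf-idf and an analogous cell-based score. The names `build_index`, `tf_idf_scores`, `top_k`, and the weight `alpha` are hypothetical and are not taken from the paper.

```python
import heapq
import math
from collections import Counter, defaultdict


def build_index(docs):
    """docs: {doc_id: (list_of_words, list_of_grid_cells)}.

    Returns two inverted files mapping a keyword or a grid cell to
    {doc_id: frequency}. Treating grid cells like "spatial terms" is an
    assumption of this sketch, not a detail stated in the abstract.
    """
    text_index = defaultdict(dict)
    cell_index = defaultdict(dict)
    for doc_id, (words, cells) in docs.items():
        for word, tf in Counter(words).items():
            text_index[word][doc_id] = tf
        for cell, cf in Counter(cells).items():
            cell_index[cell][doc_id] = cf
    return text_index, cell_index


def tf_idf_scores(index, query_items, n_docs):
    """Sum of frequency * idf over the query items, per document."""
    scores = defaultdict(float)
    for item in query_items:
        postings = index.get(item, {})
        if not postings:
            continue
        # Smoothed idf; the exact weighting is an illustrative choice.
        idf = math.log(1 + n_docs / len(postings))
        for doc_id, freq in postings.items():
            scores[doc_id] += freq * idf
    return scores


def top_k(docs, text_index, cell_index, keywords, cells, k, alpha=0.5):
    """Rank documents by a weighted sum of textual and spatial scores."""
    n_docs = len(docs)
    text = tf_idf_scores(text_index, keywords, n_docs)
    space = tf_idf_scores(cell_index, cells, n_docs)
    combined = {
        doc_id: alpha * text.get(doc_id, 0.0) + (1 - alpha) * space.get(doc_id, 0.0)
        for doc_id in set(text) | set(space)
    }
    return heapq.nlargest(k, combined.items(), key=lambda pair: pair[1])
```

In this sketch, a document that matches both the query keywords and the query cells accumulates score from both inverted files, so it ranks above documents that match on only one dimension. The paper's four relevance-score variants and two top-k algorithms are not reproduced here.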

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Khodaei, A., Shahabi, C., Li, C. (2010). Hybrid Indexing and Seamless Ranking of Spatial and Textual Features of Web Documents. In: Bringas, P.G., Hameurlain, A., Quirchmayr, G. (eds) Database and Expert Systems Applications. DEXA 2010. Lecture Notes in Computer Science, vol 6261. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15364-8_37

  • DOI: https://doi.org/10.1007/978-3-642-15364-8_37

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-15363-1

  • Online ISBN: 978-3-642-15364-8

  • eBook Packages: Computer Science, Computer Science (R0)
