Performance Evaluation of Improved Web Search Algorithms

  • Conference paper
High Performance Computing for Computational Science – VECPAR 2010 (VECPAR 2010)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6449)

Abstract

In this paper we propose an evaluation method for parallel algorithms that can be used independently of the parallel programming library and architecture employed. We propose to predict execution costs with a simple but efficient framework that consists of modeling the strategies on a BSP architecture and estimating the real costs using real query traces over real or stochastically generated data as input. In particular, we apply this method to a 2D inverted file index used to resolve web search queries. We present results for OR queries, for which we compare different ranking and caching strategies, and show how our framework works. In addition, we present and evaluate intelligent ranking and caching algorithms for AND queries.
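As a rough illustration of what predicting costs on a BSP architecture can look like, the sketch below applies the standard textbook BSP cost formula, in which a superstep costs w + h·g + l (maximum local work, plus the largest number of words any processor sends or receives times the per-word communication cost g, plus the barrier synchronization cost l). The class and function names, and the idea of describing a strategy as a list of supersteps, are assumptions made for this example; the abstract does not specify the paper's actual cost model or parameters.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Superstep:
    """Per-processor measurements for one BSP superstep (hypothetical layout)."""
    work: List[float]    # local computation units on each processor
    sent: List[int]      # words sent by each processor
    received: List[int]  # words received by each processor


def superstep_cost(step: Superstep, g: float, l: float) -> float:
    """Standard BSP cost of one superstep: w + h*g + l."""
    w = max(step.work)
    # h-relation: the largest number of words any processor sends or receives.
    h = max(max(s, r) for s, r in zip(step.sent, step.received))
    return w + h * g + l


def bsp_cost(steps: Iterable[Superstep], g: float, l: float) -> float:
    """Total predicted cost: the sum over all supersteps."""
    return sum(superstep_cost(step, g, l) for step in steps)


if __name__ == "__main__":
    # Two processors, two supersteps; all numbers are made up for illustration.
    trace = [
        Superstep(work=[10, 12], sent=[3, 1], received=[1, 3]),
        Superstep(work=[5, 4], sent=[0, 2], received=[2, 0]),
    ]
    print(bsp_cost(trace, g=2, l=5))
```

In a trace-driven setting, the per-processor work and message counts would come from replaying a query log against the index partitioning under study, so two strategies can be compared by their predicted totals without running them on a particular machine.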

Partially supported by Universidad de Buenos Aires’ UBACYT project X436.

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Feuerstein, E., Gil-Costa, V., Mizrahi, M., Marin, M. (2011). Performance Evaluation of Improved Web Search Algorithms. In: Palma, J.M.L.M., Daydé, M., Marques, O., Lopes, J.C. (eds) High Performance Computing for Computational Science – VECPAR 2010. VECPAR 2010. Lecture Notes in Computer Science, vol 6449. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19328-6_23

  • DOI: https://doi.org/10.1007/978-3-642-19328-6_23

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-19327-9

  • Online ISBN: 978-3-642-19328-6

  • eBook Packages: Computer Science, Computer Science (R0)
