Abstract
In this paper we propose an evaluation method for parallel algorithms that is independent of the parallel programming library and architecture used. We predict execution costs with a simple but efficient framework that models the strategies on a BSP (bulk-synchronous parallel) architecture and estimates the real costs from real query traces over real or stochastically generated data. In particular, we apply this method to a 2D inverted file index used to resolve web search queries. We present results for OR queries, for which we compare different ranking and caching strategies, and show how our framework works. In addition, we present and evaluate intelligent ranking and caching algorithms for AND queries.
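As a rough illustration of what cost prediction on a BSP architecture can look like, the sketch below applies the textbook BSP cost formula, in which a superstep costs w + h·g + l (maximum local work, plus the largest number of words any processor sends or receives times the per-word communication cost g, plus the synchronization latency l). This is not the paper's own cost model, which the abstract does not spell out; the names `Superstep`, `superstep_cost`, and `bsp_cost`, as well as the numeric values, are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Superstep:
    """Per-processor counters for one superstep (hypothetical structure)."""
    work: List[float]     # local computation units on each processor
    sent: List[int]       # words sent by each processor
    received: List[int]   # words received by each processor


def superstep_cost(step: Superstep, g: float, l: float) -> float:
    """Textbook BSP cost of one superstep: w + h*g + l."""
    w = max(step.work)
    # h is the largest number of words any single processor sends or receives.
    h = max(max(s, r) for s, r in zip(step.sent, step.received))
    return w + h * g + l


def bsp_cost(steps: List[Superstep], g: float, l: float) -> float:
    """Total predicted cost: the sum of the costs of all supersteps."""
    return sum(superstep_cost(step, g, l) for step in steps)


# Assumed example: two processors, two supersteps, g = 2 and l = 5.
step1 = Superstep(work=[10, 14], sent=[3, 1], received=[1, 3])
step2 = Superstep(work=[4, 2], sent=[0, 0], received=[0, 0])
total = bsp_cost([step1, step2], g=2, l=5)
```

In a trace-driven setting, the per-processor counters would presumably be filled in by replaying each query of a log against the modeled index and recording the work and messages it generates, so that different ranking or caching strategies can be compared by their predicted totals.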
Partially supported by Universidad de Buenos Aires’ UBACYT project X436.
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Feuerstein, E., Gil-Costa, V., Mizrahi, M., Marin, M. (2011). Performance Evaluation of Improved Web Search Algorithms. In: Palma, J.M.L.M., Daydé, M., Marques, O., Lopes, J.C. (eds) High Performance Computing for Computational Science – VECPAR 2010. VECPAR 2010. Lecture Notes in Computer Science, vol 6449. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19328-6_23
DOI: https://doi.org/10.1007/978-3-642-19328-6_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-19327-9
Online ISBN: 978-3-642-19328-6
eBook Packages: Computer Science (R0)