Abstract
A major challenge for modern search engines is keeping up with the tremendous growth in both the size of the web and the number of queries submitted by users. The volume of data generated today can only be processed and managed with specialized technologies.
BlockMax WAND and the more recent Variable BlockMax WAND represent the most advanced query processing algorithms that make use of dynamic pruning techniques, which allow them to retrieve the top k most relevant documents for a given query without any degradation in ranking effectiveness. In this paper, we describe a new technique for the BlockMax WAND family of query processing algorithms that improves block skipping in order to increase their efficiency. We show that our optimization improves query processing speed on short queries by up to 37% with negligible additional space overhead.
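To make the idea of safe block skipping concrete, the following is a minimal sketch of block-max pruning in general, not the optimization proposed in this paper. It assumes a simplification: blocks are aligned on fixed docid ranges rather than on posting counts, and scores are positive and precomputed. All names (`TOY_INDEX`, `block_maxima`, `top_k_exhaustive`, `top_k_block_max`) are hypothetical and chosen for illustration only.

```python
import heapq
from collections import defaultdict

# Hypothetical toy index: term -> {docid: precomputed positive score}.
TOY_INDEX = {
    "a": {0: 3.0, 1: 1.0, 5: 0.5, 9: 0.2, 12: 2.5},
    "b": {0: 2.0, 2: 1.5, 6: 0.4, 10: 0.1, 12: 2.0},
}


def block_maxima(postings, block_width):
    """Return {block_id: max score} for one posting list {docid: score}."""
    maxima = defaultdict(float)
    for docid, score in postings.items():
        block = docid // block_width
        maxima[block] = max(maxima[block], score)
    return maxima


def top_k_exhaustive(index, query, k):
    """Baseline: score every posting, then keep the k best documents."""
    totals = defaultdict(float)
    for term in query:
        for docid, score in index.get(term, {}).items():
            totals[docid] += score
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))[:k]


def top_k_block_max(index, query, k, block_width=4):
    """Top-k with block skipping; returns (ranking, number of skipped blocks)."""
    if k <= 0:
        return [], 0
    lists = [index[term] for term in query if term in index]
    maxima = [block_maxima(postings, block_width) for postings in lists]
    heap = []  # min-heap of (score, -docid); the root holds the current threshold
    skipped = 0
    for block in sorted({b for m in maxima for b in m}):
        threshold = heap[0][0] if len(heap) == k else float("-inf")
        # Upper bound on the score of any document inside this block.
        upper_bound = sum(m.get(block, 0.0) for m in maxima)
        if upper_bound <= threshold:
            skipped += 1  # no document here can enter the top k
            continue
        start = block * block_width
        for docid in range(start, start + block_width):
            matching = [p[docid] for p in lists if docid in p]
            if not matching:
                continue
            score = sum(matching)
            if len(heap) < k:
                heapq.heappush(heap, (score, -docid))
            elif score > heap[0][0]:
                heapq.heapreplace(heap, (score, -docid))
    ranked = sorted(((-d, s) for s, d in heap), key=lambda item: (-item[1], item[0]))
    return ranked, skipped
```

On the toy index with the query `["a", "b"]`, `k=2`, and `block_width=4`, both functions return documents 0 and 12, and the pruned variant skips the two middle blocks. Because a block is skipped only when its upper bound cannot exceed the current threshold, the ranking is identical to the exhaustive one, which is the sense in which dynamic pruning causes no effectiveness degradation.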
Acknowledgments
Antonio Mallia’s research was partially supported by NSF Grant IIS-1718680 “Index Sharding and Query Routing in Distributed Search Engines”.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Mallia, A., Porciani, E. (2019). Faster BlockMax WAND with Longer Skipping. In: Azzopardi, L., Stein, B., Fuhr, N., Mayr, P., Hauff, C., Hiemstra, D. (eds) Advances in Information Retrieval. ECIR 2019. Lecture Notes in Computer Science(), vol 11437. Springer, Cham. https://doi.org/10.1007/978-3-030-15712-8_52
DOI: https://doi.org/10.1007/978-3-030-15712-8_52
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-15711-1
Online ISBN: 978-3-030-15712-8
eBook Packages: Computer Science, Computer Science (R0)