Abstract
In large web search engines, the performance of Information Retrieval systems is a key issue. Block-based compression methods are often used to improve search performance, but current self-indexing techniques are not adapted to such data structures and provide sub-optimal performance. In this paper, we present SkipBlock, a self-indexing model for block-based inverted lists. Based on a cost model, we show that it is possible to achieve significant improvements in both search performance and the storage space of the structure.
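To make the idea of self-indexing over blocks concrete, the following is a minimal sketch and not the SkipBlock model itself, whose details are not given in this abstract. It assumes a sorted posting list split into fixed-size, delta-encoded blocks, with a single-level skip array holding the largest document id of each block. All names (`build_blocks`, `next_geq`, `BLOCK_SIZE`) and the block size are hypothetical and chosen only for illustration.

```python
from bisect import bisect_left

BLOCK_SIZE = 4  # hypothetical block size, for illustration only


def build_blocks(doc_ids, block_size=BLOCK_SIZE):
    """Split a sorted posting list into fixed-size, delta-encoded blocks.

    Returns (blocks, skips), where skips[i] is the largest doc id in blocks[i].
    """
    blocks, skips = [], []
    for start in range(0, len(doc_ids), block_size):
        chunk = doc_ids[start:start + block_size]
        base = chunk[0]
        # Gaps between consecutive ids; the first gap is 0 relative to base.
        deltas = [0] + [b - a for a, b in zip(chunk, chunk[1:])]
        blocks.append((base, deltas))
        skips.append(chunk[-1])  # one skip entry per block
    return blocks, skips


def decode_block(block):
    """Rebuild the doc ids of one block from its base and gaps."""
    base, deltas = block
    out, current = [], base
    for d in deltas:
        current += d
        out.append(current)
    return out


def next_geq(blocks, skips, target):
    """Return the smallest doc id >= target, decoding at most one block."""
    i = bisect_left(skips, target)  # first block whose max id is >= target
    if i == len(blocks):
        return None  # target is larger than every id in the list
    for doc_id in decode_block(blocks[i]):
        if doc_id >= target:
            return doc_id
    return None
```

In this sketch a lookup searches the skip array first and then decodes only the one block that can contain the target, which is the general benefit that skipping at block granularity is meant to provide. A real self-indexing structure would typically use several skip levels and a compressed block encoding.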
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Campinas, S., Delbru, R., Tummarello, G. (2011). SkipBlock: Self-indexing for Block-Based Inverted List. In: Clough, P., et al. Advances in Information Retrieval. ECIR 2011. Lecture Notes in Computer Science, vol 6611. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20161-5_55
DOI: https://doi.org/10.1007/978-3-642-20161-5_55
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-20160-8
Online ISBN: 978-3-642-20161-5
eBook Packages: Computer Science (R0)