ABSTRACT
Processing queries in Web search engines demands efficient use of hardware resources to cope with the scale and dynamics of user traffic. This paper focuses on the multithreaded processing of queries, which requires (1) accessing a large inverted index data structure to obtain a set of documents, (2) ranking them by executing the WAND operator to obtain the top-K most pertinent documents for the query, and (3) inserting new documents into the inverted index concurrently with the execution of queries. We propose an efficient strategy for assigning threads to queries and index update operations that supports index updates concurrently with query processing. The core of our proposal is a simple classification technique devised to quickly assign threads to query operations.
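To make the three steps concrete, here is a minimal sketch and not the paper's method. It assumes a toy in-memory index, scores documents by summed term frequency with an exhaustive scan instead of the WAND operator, and uses one coarse lock in place of the proposed thread-assignment strategy. The class name `TinyInvertedIndex` and its methods are hypothetical.

```python
import heapq
import threading
from collections import defaultdict


class TinyInvertedIndex:
    """Toy in-memory inverted index: term -> {doc_id: term frequency}."""

    def __init__(self):
        self._postings = defaultdict(dict)
        # Assumption: a single coarse lock serializes updates and queries.
        # The paper's thread-assignment strategy is not reproduced here.
        self._lock = threading.Lock()

    def add_document(self, doc_id, text):
        """Insert a new document; may be called while queries are running."""
        with self._lock:
            for term in text.lower().split():
                plist = self._postings[term]
                plist[doc_id] = plist.get(doc_id, 0) + 1

    def top_k(self, query, k):
        """Return up to k (score, doc_id) pairs, best first.

        Assumption: exhaustive scoring by summed term frequency,
        standing in for WAND's pruned evaluation.
        """
        scores = defaultdict(int)
        with self._lock:
            for term in query.lower().split():
                for doc_id, tf in self._postings.get(term, {}).items():
                    scores[doc_id] += tf
        return heapq.nlargest(k, ((s, d) for d, s in scores.items()))
```

A real search node would replace the exhaustive scan with WAND's upper-bound pruning and the single lock with a finer-grained scheme, since a coarse lock lets only one thread touch the index at a time.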
Index Terms
- Multithreaded Processing in Dynamic Inverted Indexes for Web Search Engines