ABSTRACT
With the growing popularity of the world-wide-web and the increasing accessibility of smart devices, data is being generated at a faster rate than ever before. This presents scalability challenges to web-scale search systems -- how can we efficiently index, store and retrieve such a vast amount of data? A large amount of prior research has attempted to address many facets of this question, with the invention of a range of efficient index storage and retrieval frameworks that are able to efficiently answer most queries. However, the current literature generally focuses on improving the mean or median query processing time in a given system. In the proposed PhD project, we focus on improving the efficiency of high percentile tail latencies in large scale IR systems while minimising end-to-end effectiveness loss. Although there is a wealth of prior research involving improving the efficiency of large scale IR systems, the most relevant prior work involves predicting long-running queries and processing them in various ways to avoid large query processing times. Prediction is often done through pre-trained models based on both static and dynamic features from queries and documents. Many different approaches to reducing the processing time of long running queries have been proposed, including parallelising queries that are predicted to run slowly, scheduling queries based on their predicted run time, and selecting or modifying the query processing algorithm depending on the load of the system. Considering the specific focus on tail latencies in large-scale IR systems, the proposed research aims to: (i) study what causes large tail latencies to occur in large-scale web search systems, (ii) propose a framework to mitigate tail latencies in multi-stage retrieval through the prediction of a vast range of query-specific efficiency parameters, (iii) experiment with mixed-mode query semantics to provide efficient and effective querying to reduce tail latencies, and (iv) propose a time-bounded solution for Document-at-a-Time (DaaT) query processing which is suitable for current web search systems. As a preliminary study, Crane et al. compared some state-of-the-art query processing strategies across many modern collections. They found that although modern DaaT dynamic pruning strategies are very efficient for ranked disjunctive processing, they have a much larger variance in processing times than Score-at-a-Time (SaaT) strategies which have a similar efficiency profile regardless of query length or the size of the required result set. Furthermore, Mackenzie et al. explored the efficiency trade-offs for paragraph retrieval in a multi-stage question answering system. They found that DaaT dynamic pruning strategies could efficiently retrieve the top-1,000 candidate paragraphs for very long queries. Extending on prior work, Mackenzie et al. showed how a range of per-query efficiency settings can be accurately predicted such that 99.99 percent of queries are serviced in less than 200 ms without noticeable effectiveness loss. In addition, a reference list framework was used for training models such that no relevance judgements or annotations were required. Future work will focus on improving the candidate generation stage in large-scale multi-stage retrieval systems. This will include further exploration of index layouts, traversal strategies, and query rewriting, with the aim of improving early stage efficiency to reduce the system tail latency, while potentially improving end-to-end effectiveness.
- D. Broccolo, C. Macdonald, S. Orlando, I. Ounis, R. Perego, F. Silvestri, and N. Tonellotto 2013. Load-sensitive selective pruning for distributed search Proc. CIKM. 379--388.Google Scholar
- C. L. A. Clarke, J. S. Culpepper, and A. Moffat. 2016. Assessing efficiency--effectiveness tradeoffs in multi-stage retrieval systems without using relevance judgments. Information Retrieval Vol. 19, 4 (2016), 351--377. Google ScholarDigital Library
- M. Crane, J. S. Culpepper, J. Lin, J. Mackenzie, and A. Trotman 2017. A comparison of Document-at-a-Time and Score-at-a-Time query evaluation Proc. WSDM. 201--210.Google Scholar
- J. S. Culpepper, C. L. A. Clarke, and J. Lin 2016. Dynamic cutoff prediction in multi-stage retrieval systems Proc. ADCS. 17--24.Google Scholar
- S-W. Hwang, S. Kim, Y. He, S. Elnikety, and S. Choi 2016. Prediction and predictability for search query acceleration. ACM Trans. Web, Vol. 10, 3 (Aug. 2016), 19:1--19:28.Google ScholarDigital Library
- M. Jeon, S. Kim, S-W. Hwang, Y. He, S. Elnikety, A. L. Cox, and S. Rixner. 2014. Predictive parallelization: taming tail latencies in web search Proc. SIGIR. 253--262.Google Scholar
- C. Macdonald, N. Tonellotto, and I. Ounis. 2012. Learning to predict response times for online query scheduling Proc. SIGIR. 621--630.Google Scholar
- J. Mackenzie, R-C. Chen, and J. S. Culpepper 2016. RMIT at the TREC 2016 LiveQA Track. In Proc. TREC-25.Google Scholar
- J. Mackenzie, J. S. Culpepper, R. Blanco, M. Crane, C. L. A. Clarke, and J. Lin. 2017. Efficient and Effective Tail Latency Minimization in Multi-Stage Retrieval Systems. (2017). showeprint[arXiv]1704.03970 [cs.IR]Google Scholar
- L. Tan and C. L. A. Clarke 2015. A Family of Rank Similarity Measures Based on Maximized Effectiveness Difference. TKDE, Vol. 27, 11 (2015), 2865--2877. Google ScholarDigital Library
- N. Tonellotto, C. Macdonald, and I. Ounis. 2013. Efficient and effective retrieval using selective pruning Proc. WSDM. 63--72. endthebibliographyGoogle Scholar
Index Terms
- Managing Tail Latencies in Large Scale IR Systems
Recommendations
Scalable and efficient processing of top-k multiple-type integrated queries
AbstractIn this paper, we define a new class of queries, the top-k multiple-type integrated query (simply, top-k MULTI query). It deals with multiple data types and finds the information in the order of relevance between the query and the object. Various ...
Hybrid Dynamic Pruning for Efficient and Effective Query Processing
CIKM '20: Proceedings of the 29th ACM International Conference on Information & Knowledge ManagementThe performance of query processing has always been a concern in the field of information retrieval. Dynamic pruning algorithms have been proposed to improve query processing performance in terms of efficiency and effectiveness. However, a single ...
Managing tail latency in large scale information retrieval systems
As both the availability of internet access and the prominence of smart devices continue to increase, data is being generated at a rate faster than ever before. This massive increase in data production comes with many challenges, including efficiency ...
Comments