Managing Tail Latencies in Large Scale IR Systems

Author:
Joel Mackenzie

RMIT University, Melbourne, Australia


SIGIR '17: Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, August 2017, Page 1369. https://doi.org/10.1145/3077136.3084152

Published: 07 August 2017

ABSTRACT

With the growing popularity of the World Wide Web and the increasing accessibility of smart devices, data is being generated at a faster rate than ever before. This presents scalability challenges to web-scale search systems: how can we efficiently index, store, and retrieve such a vast amount of data? A large body of prior research has addressed many facets of this question, producing a range of index storage and retrieval frameworks that answer most queries efficiently. However, the current literature generally focuses on improving the mean or median query processing time of a given system. In the proposed PhD project, we focus on reducing high-percentile tail latencies in large-scale IR systems while minimising end-to-end effectiveness loss. Although there is a wealth of prior research on improving the efficiency of large-scale IR systems, the most relevant prior work involves predicting long-running queries and processing them in ways that avoid long query processing times. Prediction is often done through pre-trained models based on both static and dynamic features of queries and documents. Many approaches to reducing the processing time of long-running queries have been proposed, including parallelising queries that are predicted to run slowly, scheduling queries based on their predicted run time, and selecting or modifying the query processing algorithm depending on the load of the system.
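The prediction-based strategies mentioned above can be illustrated with a minimal sketch. It assumes each query already carries a predicted run time from some pre-trained model; the `Query` class, the `choose_plan` and `schedule` functions, the plan labels, and the 100 ms threshold are all hypothetical names and values chosen for illustration, not taken from any of the cited systems.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Query:
    text: str
    predicted_ms: float  # assumed output of a pre-trained latency predictor


def choose_plan(query, slow_threshold_ms=100.0):
    """Pick a processing plan label from the predicted run time.

    The threshold and the two labels are illustrative assumptions.
    """
    if query.predicted_ms >= slow_threshold_ms:
        return "parallel"    # spend extra threads on a predicted-slow query
    return "sequential"      # predicted-fast queries run on a single thread


def schedule(queries):
    """Order queries by predicted run time, shortest first."""
    return sorted(queries, key=lambda q: q.predicted_ms)


if __name__ == "__main__":
    pending = [
        Query("long verbose question about tail latency", 340.0),
        Query("cheap flights", 12.0),
    ]
    for q in schedule(pending):
        print(q.text, "->", choose_plan(q))
```

A real system would also need to account for prediction error and current load, which this sketch ignores.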
Given this specific focus on tail latencies in large-scale IR systems, the proposed research aims to: (i) study what causes large tail latencies in large-scale web search systems, (ii) propose a framework that mitigates tail latencies in multi-stage retrieval by predicting a wide range of query-specific efficiency parameters, (iii) experiment with mixed-mode query semantics to provide efficient and effective querying that reduces tail latencies, and (iv) propose a time-bounded solution for Document-at-a-Time (DaaT) query processing that is suitable for current web search systems. As a preliminary study, Crane et al. compared several state-of-the-art query processing strategies across many modern collections. They found that although modern DaaT dynamic pruning strategies are very efficient for ranked disjunctive processing, their processing times vary much more than those of Score-at-a-Time (SaaT) strategies, which have a similar efficiency profile regardless of query length or the size of the required result set. Furthermore, Mackenzie et al. explored the efficiency trade-offs for paragraph retrieval in a multi-stage question answering system. They found that DaaT dynamic pruning strategies could efficiently retrieve the top-1,000 candidate paragraphs for very long queries. Extending this prior work, Mackenzie et al. showed how a range of per-query efficiency settings can be accurately predicted such that 99.99 percent of queries are serviced in less than 200 ms without noticeable effectiveness loss. In addition, a reference list framework was used for training models, so no relevance judgements or annotations were required. Future work will focus on improving the candidate generation stage in large-scale multi-stage retrieval systems. This will include further exploration of index layouts, traversal strategies, and query rewriting, with the aim of improving early-stage efficiency to reduce the system tail latency while potentially improving end-to-end effectiveness.
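To make the contrast between average-case and tail behaviour concrete, the following sketch checks a latency log against the kind of target quoted above (99.99 percent of queries in less than 200 ms). The function names and the timing data are made up for illustration; only the 200 ms budget and the 99.99 percent figure come from the abstract.

```python
import statistics


def fraction_under(latencies_ms, budget_ms):
    """Fraction of queries answered in strictly less than budget_ms."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    within = sum(1 for t in latencies_ms if t < budget_ms)
    return within / len(latencies_ms)


def meets_target(latencies_ms, budget_ms=200.0, target_fraction=0.9999):
    """True if at least target_fraction of queries beat the budget."""
    return fraction_under(latencies_ms, budget_ms) >= target_fraction


if __name__ == "__main__":
    # Hypothetical logs: 10,000 queries each, almost all taking 20 ms.
    one_slow = [20.0] * 9999 + [5000.0]
    two_slow = [20.0] * 9998 + [5000.0, 5000.0]

    for name, log in (("one_slow", one_slow), ("two_slow", two_slow)):
        print(
            name,
            "mean:", statistics.mean(log),
            "median:", statistics.median(log),
            "worst:", max(log),
            "meets target:", meets_target(log),
        )
```

In both logs the mean and median stay close to 20 ms, yet a single additional slow query is enough to miss the target, which is why mean or median timings alone say little about the tail.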

References

D. Broccolo, C. Macdonald, S. Orlando, I. Ounis, R. Perego, F. Silvestri, and N. Tonellotto. 2013. Load-sensitive selective pruning for distributed search. In Proc. CIKM. 379--388.
C. L. A. Clarke, J. S. Culpepper, and A. Moffat. 2016. Assessing efficiency--effectiveness tradeoffs in multi-stage retrieval systems without using relevance judgments. Information Retrieval, Vol. 19, 4 (2016), 351--377.
M. Crane, J. S. Culpepper, J. Lin, J. Mackenzie, and A. Trotman. 2017. A comparison of Document-at-a-Time and Score-at-a-Time query evaluation. In Proc. WSDM. 201--210.
J. S. Culpepper, C. L. A. Clarke, and J. Lin. 2016. Dynamic cutoff prediction in multi-stage retrieval systems. In Proc. ADCS. 17--24.
S-W. Hwang, S. Kim, Y. He, S. Elnikety, and S. Choi. 2016. Prediction and predictability for search query acceleration. ACM Trans. Web, Vol. 10, 3 (Aug. 2016), 19:1--19:28.
M. Jeon, S. Kim, S-W. Hwang, Y. He, S. Elnikety, A. L. Cox, and S. Rixner. 2014. Predictive parallelization: taming tail latencies in web search. In Proc. SIGIR. 253--262.
C. Macdonald, N. Tonellotto, and I. Ounis. 2012. Learning to predict response times for online query scheduling. In Proc. SIGIR. 621--630.
J. Mackenzie, R-C. Chen, and J. S. Culpepper. 2016. RMIT at the TREC 2016 LiveQA Track. In Proc. TREC-25.
J. Mackenzie, J. S. Culpepper, R. Blanco, M. Crane, C. L. A. Clarke, and J. Lin. 2017. Efficient and Effective Tail Latency Minimization in Multi-Stage Retrieval Systems. (2017). arXiv:1704.03970 [cs.IR]
L. Tan and C. L. A. Clarke. 2015. A Family of Rank Similarity Measures Based on Maximized Effectiveness Difference. TKDE, Vol. 27, 11 (2015), 2865--2877.
N. Tonellotto, C. Macdonald, and I. Ounis. 2013. Efficient and effective retrieval using selective pruning. In Proc. WSDM. 63--72.

Index Terms

1. Information systems
  1. Information retrieval

Published in
SIGIR '17: Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval
August 2017
1476 pages
ISBN:9781450350228
DOI:10.1145/3077136
General Chairs: Noriko Kando (National Institute of Informatics), Tetsuya Sakai (Waseda University), Hideo Joho (University of Tsukuba)
Program Chairs: Hang Li (Huawei Noah's Ark Lab), Arjen P. de Vries (Radboud University), Ryen W. White (Microsoft Cortana)
Copyright © 2017 Owner/Author
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
efficiency
scalability
tail latency
Qualifiers
- abstract
Acceptance Rates
SIGIR '17 Paper Acceptance Rate: 78 of 362 submissions, 22%
Overall Acceptance Rate: 792 of 3,983 submissions, 20%