Bouncer: Admission Control with Response Time Objectives for Low-latency Online Data Systems

Published: 09 June 2024


Internet companies rely on low-latency online data systems to provide quick responses to users. These systems employ complementary overload management techniques to keep offering acceptable service throughout traffic surges, where "acceptable" partly means that serviced queries meet or closely track their response time objectives. In this paper we present Bouncer, an admission control policy that aims to keep admitted queries under or near their service level objectives (SLOs) on percentile response times. Bouncer decides whether to accept or reject incoming queries based on inexpensive estimates of those percentiles. It can assign separate SLOs to different classes of queries in the workload, and it rejects queries early so that clients can react promptly and data systems avoid doing useless work. We propose two starvation avoidance strategies that supplement Bouncer's basic formulation and prevent query types from receiving no service. Our evaluation, in simulation and on a production-grade distributed graph database, shows that Bouncer and its starvation-avoiding variants 1) let admitted queries meet or stay close to their SLOs when other in-house policies do not, and 2) report fewer overall rejections and a small overhead, while letting the system reach high utilization. We observe that the proposed strategies can prevent query starvation, though at the cost of a modest increase in rejections and with SLO violation counts for serviced queries that may be acceptable in practice.
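To make the general idea of percentile-based admission control concrete, the following is a minimal sketch in Python. It is not Bouncer's algorithm: the abstract does not specify how the percentile estimates are computed, so the class name `ToyAdmissionController`, the sliding window of recent response times, and the nearest-rank percentile are all assumptions chosen for illustration. The sketch only shows the shape of the decision: each query class has its own SLO, and a query is rejected up front when the estimated percentile for its class exceeds the objective.

```python
import math
from collections import deque


class ToyAdmissionController:
    """Illustrative sketch only; not the policy described in the paper."""

    def __init__(self, slos, window=1000):
        # slos maps a query class to (percentile in (0, 100], objective in seconds).
        self.slos = dict(slos)
        # Hypothetical estimator state: the most recent response times per class.
        self.samples = {cls: deque(maxlen=window) for cls in self.slos}

    def record(self, query_class, response_time):
        """Store the observed response time of a serviced query."""
        self.samples[query_class].append(response_time)

    def estimate(self, query_class):
        """Nearest-rank percentile of the recent samples, or None if there are none."""
        percentile, _ = self.slos[query_class]
        data = sorted(self.samples[query_class])
        if not data:
            return None
        rank = max(1, math.ceil(percentile / 100 * len(data)))
        return data[rank - 1]

    def admit(self, query_class):
        """Accept when there is no estimate yet or the estimate is within the SLO."""
        _, objective = self.slos[query_class]
        est = self.estimate(query_class)
        return est is None or est <= objective


# Two hypothetical classes: median under 10 ms, 90th percentile under 100 ms.
ctrl = ToyAdmissionController({"fast": (50, 0.010), "slow": (90, 0.100)})
for t in [0.002, 0.004, 0.006, 0.008]:
    ctrl.record("fast", t)
for t in [0.05] * 8 + [0.2] * 2:
    ctrl.record("slow", t)
fast_ok = ctrl.admit("fast")
slow_ok = ctrl.admit("slow")
```

In this toy version a class that is being rejected receives no new samples, so its estimate never recovers. A practical policy needs some way to refresh its estimates and to keep every class serviced.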


