DOI: 10.1145/2745844.2745873
research-article

Reducing Latency via Redundant Requests: Exact Analysis

Published: 15 June 2015

ABSTRACT

Recent computer systems research has proposed using redundant requests to reduce latency. The idea is to run a request on multiple servers and wait for the first completion (discarding all remaining copies of the request). However, there is no exact analysis of systems with redundancy.
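
The "wait for the first completion" idea can be illustrated with a minimal sketch. It rests on assumptions that are not taken from the paper's model: each copy's service time is an independent Exponential random variable, and queueing delay is ignored. Under those assumptions the first completion is the minimum of the copies' service times, which is Exponential with rate equal to the sum of the server rates. The function names and the example rates are hypothetical.

```python
import random

def first_completion_time(rates, rng):
    """Time until the first of several independent copies finishes.

    Assumption: copy i has an Exponential service time with rate rates[i],
    and the request never waits in a queue.
    """
    return min(rng.expovariate(mu) for mu in rates)

def mean_first_completion(rates, trials=100_000, seed=0):
    """Monte Carlo estimate of the mean first-completion time."""
    rng = random.Random(seed)
    total = sum(first_completion_time(rates, rng) for _ in range(trials))
    return total / trials

rates = [1.0, 2.0]              # hypothetical service rates of two servers
estimate = mean_first_completion(rates)
exact = 1.0 / sum(rates)        # mean of an Exponential with rate sum(rates)
print(f"simulated mean: {estimate:.4f}, exact 1/sum(rates): {exact:.4f}")
```

The sketch leaves out queueing entirely, so it says nothing about the load that redundant copies place on other requests, which is the effect the paper analyzes.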

This paper presents the first exact analysis of systems with redundancy. We allow for any number of classes of redundant requests, any number of classes of non-redundant requests, any degree of redundancy, and any number of heterogeneous servers. In all cases we derive the limiting distribution on the state of the system.

In small (two or three server) systems, we derive simple forms for the distribution of response time of both the redundant classes and non-redundant classes, and we quantify the "gain" to redundant classes and "pain" to non-redundant classes caused by redundancy. We find some surprising results. First, the response time of a fully redundant class follows a simple Exponential distribution and that of the non-redundant class follows a Generalized Hyperexponential. Second, fully redundant classes are "immune" to any pain caused by other classes becoming redundant.

We also compare redundancy with other approaches for reducing latency, such as optimal probabilistic splitting of a class among servers (Opt-Split) and Join-the-Shortest-Queue (JSQ) routing of a class. We find that, in many cases, redundancy outperforms JSQ and Opt-Split with respect to overall response time, making it an attractive solution.
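
The three dispatching approaches compared above differ only in how an arriving request picks its server(s). The following is a rough sketch under stated assumptions, not the paper's formulation: queue lengths are observable at dispatch time, the splitting probabilities are supplied by the caller (computing the optimal ones for Opt-Split is not shown), and all function names are hypothetical.

```python
import random

def jsq_choice(queue_lengths):
    """Join-the-Shortest-Queue: index of the server with the fewest jobs.

    Ties go to the lowest index (an arbitrary choice for this sketch).
    """
    return min(range(len(queue_lengths)), key=lambda i: queue_lengths[i])

def split_choice(probabilities, rng):
    """Probabilistic splitting: pick one server at random with given weights.

    Opt-Split would choose these probabilities optimally; here they are
    simply an input.
    """
    servers = range(len(probabilities))
    return rng.choices(servers, weights=probabilities, k=1)[0]

def redundant_choice(compatible_servers):
    """Redundancy: send a copy to every compatible server."""
    return list(compatible_servers)

rng = random.Random(0)
print(jsq_choice([3, 1, 2]))            # server with the shortest queue
print(split_choice([0.25, 0.75], rng))  # one randomly selected server
print(redundant_choice([0, 1]))         # all compatible servers
```

JSQ and splitting each commit a request to a single server, whereas redundancy occupies several servers until the first copy completes.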


Published in

SIGMETRICS '15: Proceedings of the 2015 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems
June 2015, 488 pages
ISBN: 9781450334860
DOI: 10.1145/2745844

            Copyright © 2015 ACM


Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance Rates

SIGMETRICS '15 Paper Acceptance Rate: 32 of 239 submissions, 13%
Overall Acceptance Rate: 459 of 2,691 submissions, 17%
