On the many-server fluid limit for a service system with routing based on delayed information

https://doi.org/10.1016/j.orl.2021.03.001

Abstract

Pender, Rand and Wesson recently established a delay differential equation limit for a parallel service system with routing based on delayed information. We provide an interpretation of their scaling under which their limit can be regarded as an instance of a law of large numbers in the familiar many-server heavy-traffic scaling. It requires scaling in the probabilistic routing function. We also obtain related many-server heavy-traffic delay-differential-equation limits for more general models.

Introduction

Pender, Rand and Wesson [14] established a delay differential equation (deterministic fluid) limit for a parallel service system with routing based on delayed information about the system state. That limit is useful because it helps quantify and understand the impact of the delay. We introduce an interpretation of the scaling used in [14], which shows that their limit can be regarded as being consistent with a law of large numbers in the familiar many-server heavy-traffic scaling in [12] and many other papers. With many-server scaling, an additional scaling of the probabilistic routing function is required. (See (2)–(5) for the definition of the routing and (7)–(8) for the proposed scaling.) Lemma 3.1 shows that the many-server scaling here with the scaling of the routing in (8) produces the same scaling used in [14]. Hence, the limit obtained by the scaling here is equivalent to the limit in [14], so we are primarily translating into the many-server heavy-traffic framework. The paper [14] nicely shows the implications of the limit by analyzing the limiting delay differential equation.
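To make the kind of limit concrete, the following is a minimal sketch of a delay differential equation of this general type, integrated by forward Euler. The routing function of [14] is defined in (2)–(5), which are not reproduced here, so the exponential (multinomial-logit) routing weights, the infinite-server service term, the parameter values, and the names `softmax_routing` and `simulate_dde` are all illustrative assumptions rather than the model of the paper.

```python
import math


def softmax_routing(delayed_queues):
    """Assumed routing rule: probability proportional to exp(-q_j(t - delay))."""
    shift = min(delayed_queues)  # subtract the minimum for numerical stability
    weights = [math.exp(-(q - shift)) for q in delayed_queues]
    total = sum(weights)
    return [w / total for w in weights]


def simulate_dde(lam=10.0, mu=1.0, delay=1.0, n_queues=2,
                 horizon=20.0, dt=0.01, history=None):
    """Forward-Euler integration of the assumed fluid model

        q_i'(t) = lam * p_i(q(t - delay)) - mu * q_i(t),

    with a constant history on [-delay, 0] (empty by default).
    Returns the list of states at times 0, dt, 2*dt, ..., horizon.
    """
    lag = int(round(delay / dt))
    steps = int(round(horizon / dt))
    start = list(history) if history is not None else [0.0] * n_queues
    # path[k] holds q at time (k - lag) * dt; the first lag + 1 entries are history.
    path = [start[:] for _ in range(lag + 1)]
    for _ in range(steps):
        current = path[-1]         # q(t)
        delayed = path[-1 - lag]   # q(t - delay)
        probs = softmax_routing(delayed)
        path.append([q + dt * (lam * p - mu * q)
                     for q, p in zip(current, probs)])
    return path[lag:]


if __name__ == "__main__":
    trajectory = simulate_dde()
    print("final fluid content per queue:", trajectory[-1])
```

Because the routing probabilities sum to one, the total fluid content in this sketch satisfies a linear equation with no delay and settles near `lam / mu`. The delay affects only how that total is split among the queues, which can be explored by passing an asymmetric `history`.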

Related early work involving delayed information can be found in [4], [5] on the study of rate control in communication networks and in citations to these papers. More recently, information delay has played an important role in high-speed financial trading; e.g., [1], [7].

In Section 2 we introduce our model, which is a modification of the model in [14], covering the model with finitely many servers in each group and customer abandonment (analog of the Erlang-A model). In Section 3 we introduce the scaling. In Section 4 we present the limit. In Section 5 we extend the results to the more general time-varying non-Markov $G_t/GI/s_t+GI$ model in [8], [9]. In Section 6 we conclude with additional discussion.

The model

The queueing model has Poisson arrivals at rate λ, routing to one of N multi-server service groups based on the queue lengths (numbers in the service group, either waiting or being served) in these N service groups in the past. For simplicity, suppose that the system starts empty at time 0 and was empty in the past before time 0. A somewhat more general initial condition is considered in [14]; it is not difficult to treat that extension. Let the total number of arrivals over the interval [0,t]

Many-server scaling

We now introduce what we regard as a natural many-server scaling. We construct a sequence of parallel-server models indexed by positive integers η, as in [14]. For each η, there is a parallel-server model consisting of N queues with $s_i^{(\eta)}$ servers in service group i, $1 \le i \le N$. Consistent with [12], we let the arrival rate and number of servers in each service group grow, but we hold the service-time distribution and the information delay fixed. In particular, we let the parameters in model η be (λ(η

Convergence as η → ∞

As noted in [14], the functional strong law of large numbers for a Poisson process can be applied to obtain the desired limit. The first step has $\eta^{-1}\Pi(\eta t) \to t$ as $\eta \to \infty$, uniformly in t over bounded intervals with probability 1, where Π is a unit-rate Poisson process, as in (10).
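The functional strong law of large numbers for a unit-rate Poisson process can be checked numerically. The sketch below simulates one path of Π and computes the exact supremum over [0, T] of the distance between the scaled process and the identity function; the function name `sup_deviation`, the horizon, and the seed are illustrative choices, not part of the source.

```python
import random


def sup_deviation(eta, horizon=1.0, seed=0):
    """Return sup over 0 <= t <= horizon of |Pi(eta * t) / eta - t|
    for one simulated path of a unit-rate Poisson process Pi.

    Between jumps, Pi(s) - s is linear in s, so the supremum is attained
    at a jump time (just before or just after the jump) or at the endpoint.
    """
    rng = random.Random(seed)
    end = eta * horizon            # unscaled time horizon s = eta * t
    count = 0                      # Pi(s) just before the next jump
    clock = rng.expovariate(1.0)   # time of the next jump
    worst = 0.0
    while clock <= end:
        # deviation just before the jump and just after it
        worst = max(worst, abs(count - clock), abs(count + 1 - clock))
        count += 1
        clock += rng.expovariate(1.0)
    worst = max(worst, abs(count - end))  # deviation at the right endpoint
    return worst / eta


if __name__ == "__main__":
    for eta in (10, 100, 1_000, 10_000):
        print(eta, sup_deviation(eta, seed=1))
```

The printed deviations should shrink as η grows, at a rate on the order of the reciprocal square root of η, which is the uniform convergence on bounded intervals that the limit argument relies on.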

Lemma 3.1 with (11) yields the limit in [14], extended to the multi-server model. We use function space notation as in [16]; i.e., the limit in (12) means that $\bar{Q}^{(\eta)}(t) \to q(t)$ as $\eta \to \infty$ uniformly in t in any bounded subinterval of [0,

Extension to time-varying non-Markov models

In this section we briefly indicate how to obtain corresponding results for the time-varying non-Markov parallel network of N $GI/s_t+GI$ models with a single $G_t$ arrival process by modifying the results in [8], [9]. We first consider how to treat the extension of the fluid model in [8].

Initially assume that each arrival is routed to each of the N queues with probability $1/N$. Then each of the N queues is a $G_t/GI/s_t+GI$ model studied in [8]. For each of these models, the performance of the fluid

Discussion

In this final section we discuss the applied relevance of the results and extensions.

Acknowledgment

I thank Jamol Pender for helpful comments and suggestions.

References (16)

  • Fendick K.W. et al. Analysis of a rate-based control strategy with delayed feedback. Perform. Eval. (1992)
  • Liu Y. et al. A many-server fluid limit for the $G_t/GI/s_t+GI$ queueing model experiencing periods of overloading. Oper. Res. Lett. (2012)
  • Budish E. et al. The high-frequency trading arms race: frequent batch auctions as a market design response. Q. J. Econ. (2015)
  • Doldo P. et al. Breaking the symmetry in queues with delayed information. (2020)
  • Dong J. et al. The impact of delay announcements on hospital network coordination and waiting times. Manage. Sci. (2018)
  • Fendick K.W. et al. Asymptotic analysis of adaptive rate control for diverse sources with delayed feedback. IEEE Trans. Inform. Theory (1994)
  • Gurvich I. et al. Scheduling flexible servers with convex delay costs in many-server service systems. Manuf. Serv. Oper. Manag. (2009)
  • Lewis M.E. Flash Boys, A Wall Street Revolt. (2014)
There are more references available in the full text version of this article.