Keywords

1 Introduction

Our study is motivated by a problem in the operations of the specialist out-patient clinics (SOPCs) in Hong Kong public hospitals. The SOPCs need to reserve capacity for urgent appointment request while maintaining high utilization of the expensive specialists’ time [2]. When new cases are referred to the SOPCs in public hospitals, they are triaged, and the patients’ first consultation appointments are made based on their clinical conditions. Patients are classified as categories of urgent, semi-urgent, and routine, and according to a service level requirement, the urgent and semi-urgent (regarded as high-priority class) patients need to be arranged to have their first medical consultations within 2 weeks and 8 weeks respectively as far as possible. In an SOPC, a quota of service capacity is generally reserved for high priority patients, and the remaining capacity for the routine patients. Since a health care system involving many resources is too complicated to be adjusted on a daily basis, a quota is used to control the workload. In practice, it is observed that routine patients often suffer from prolonged waiting time while the quota for high priority patients are not fully utilized. Therefore, an important operational problem is to determine the minimum capacity (i.e., quota) reserved for the high priority class so that the service level requirement is met and at the same time the waiting time for routine patients can be reduced.

To address the problem, we study a G/D/1 queueing system that models the service operations for high priority class. We assume that the service time is deterministic. (In the SOPC problem, the daily quota is fixed though the consultation times for patients may slightly vary; refer to [11] for similar practices.) The service level is measured by the probability of customers waiting longer than a pre-specified duration. The problem in question is to find the capacity allocation to meet a required service level. It should be noted that this model applies not only to the SOPC operations, but also to problems in other health care services and other service systems in similar situations.

We propose two solutions: a parametric approach based on the effective bandwidth theory (e.g., [3]), and a data-driven approach based on the sample average approximation (e.g., [1]). The former approach builds on the vast amount of classical studies on the G/D/1 (or M/D/1) queue and its waiting time distribution; refer to, e.g., [3, 9]. The other approach relates to more recent researches in developing the so-called date-driven approach for operations management problems, e.g., [5, 8].

The effective bandwidth theory provides an explicit approximation for the tail probability of the waiting time distribution under suitable assumptions on the arrival and service processes. Hence, the performance measures under study can be estimated via estimating the key parameters such as the arrival rate of the system. It can also be used for theoretical analysis. In contrast, using our data-driven approach, specifically the sample average approximation, we are able to construct the performance measures such as the waiting time and its tail probability directly from the primitive data such as the interarrival times. It is simple and easy to implement, and can incorporate bootstrapping to improve the accuracy. Moreover, it can be applied to more general scenarios, for example, to a system with time varying arrivals or general interarrival time distribution. The numerical study also demonstrates that it provides a accurate result than the parametric approach.

The rest of the paper is organized as follows. We describe the model in Sect. 2. The two solution methods, a parametric approach and a data-driven approach, are introduced and analyzed in Sects. 3 and 4, respectively. We then conduct a numerical study in Sect. 5. All the proofs are relegated to the appendix.

2 Model Description

We consider a classical G/G/1 queueing system. Customers arrive at the system following a renewal process with arrival rate \(\lambda \), and receive service following the order of arrival. Denote the interarrival time between consecutive arrivals by \(A_m\), \(m= 1, 2, \cdots \). Denote the service time of the m-th customer by \(X_m\), \(m = 1, 2, \cdots \), and the service rate by c, where \(c > 0\). Assume that the interarrival time sequence and service time sequence are mutually independent i.i.d. random sequences, and they all have finite moments.

Let \(w_m\) be the waiting time of the m-th arrival, \(m=1,2,\cdots \). Following the well-known Lindley equation (e.g., [7]), the waiting time for the m-th arrival can be written as,

$$\begin{aligned} w_m=(w_{m-1}+X_{m-1}-A_{m-1})^+. \end{aligned}$$
(1)

Under the normal traffic condition, i.e., \(\lambda < c\), the system will enter into the steady-state and the waiting time distributions will approach a stationary distribution as \(m\rightarrow \infty \).

The maximum customer waiting time specified in the service level requirement is denoted by \(b > 0\), and the probability of waiting no more than b denoted by \(1- \alpha \). Therefore, we will find the minimum service capacity c that satisfies

$$\begin{aligned} Pr(w>b)\le \alpha , \end{aligned}$$
(2)

where the random variable w represents the stationary waiting time and its distribution depends on c.

We introduce two useful tools for our analysis below, namely, the effective bandwidth approach and the sample average approximation.

Lemma 1

(Effective bandwidth, see Kelly [3]). Let A and X be random variables representing the inter-arrival time and service time distributions, and \(\kappa \) a positive constant satisfying \(E(e^{\kappa X})E(e^{-\kappa A})=1\). Then, there exist constants \(a_1,a_2\le 1\) such that

$$a_1e^{-\kappa b}\le Pr(w>b)\le a_2 e^{-\kappa b}.$$

For the result in Lemma 1, Kingman [4] first specified the constants as

$$a_1=\inf _{x>0} \frac{\int _{x}^{+\infty }dF(y)}{\int _{x}^{+\infty }e^{\kappa (y-x)} dF(y)}, \text{ and } a_2=1 ;$$

and later Ross [6] improved the bounds as

$$\begin{aligned}&a_1 = \frac{e^{-\kappa b}}{\sup _{r\ge 0}E[e^{\kappa (U-r)}|U>r]} \text{ and } a_2 = \frac{e^{-\kappa b}}{\inf _{r\ge 0}E[e^{\kappa (U-r)}|U>r]}. \end{aligned}$$
(3)

In the above, we let \(U=X-A\) and its cumulative distribution function be F(x).

The second tool is the sample average approximation of the tail distribution of the stationary waiting time w. We have the following result.

Lemma 2

(Chang [1]). The tail probability of the stationary waiting time can be estimated as:

$$\begin{aligned} Pr(w>b) = \lim _{m\rightarrow \infty } Pr(w_m>b) =\lim _{m\rightarrow \infty }E[I_{(w_{m}>b)}] = \lim _{n\rightarrow \infty } \frac{\sum _{k=1}^nI_{(w_{k}>b)}}{n} \end{aligned}$$
(4)

where \(I_{(x)}\) is an indicator function, which is equal to 1 if the condition x is true and equal to 0 otherwise.

3 Parametric Approach via Effective Bandwidth

Consider the M/D/1 system, where the arrival is Poisson and the service time for each customer is equal to the constant 1/c. Applying Lemma 1 to the M/D/1 system, we can bound the tail distribution of the stationary waiting time w as,

$$\begin{aligned} \frac{1-e^{-\lambda /c}}{1-e^{-(\kappa +\lambda )/c}} e^{-\kappa b} \le Pr(w>b) \le e^{-\kappa b}. \end{aligned}$$
(5)

The proof of the above bounds is relegated to the appendix.

Observe that as \(b\rightarrow \infty \), both bounds approach 0. In addition, it is interesting to note that the gap between the coefficients in the above upper and lower bounds diminishes when the traffic intensity \(\lambda /c\) approach 1; that is,

$$\begin{aligned} 1 - \frac{1-e^{-\lambda /c}}{1-e^{-(\kappa +\lambda )/c}} \rightarrow 0, \text{ as } \frac{\lambda }{c} \rightarrow 1. \end{aligned}$$
(6)

(Refer to the appendix for detailed justification of the above result as well.) Given these observations, we will choose the upper bound as an estimation of the tail probability of the stationary waiting time w. Hence, enforcing the service requirement in (2) is reduced to requiring the above upper bound being equal to \(\alpha \), that is,

$$\begin{aligned} e^{-\kappa b} = \alpha , \text{ or } \kappa =- \frac{ \log (\alpha )}{b}. \end{aligned}$$
(7)

We solve the equation, \(e^{{\kappa }/{c} }E(e^{-\kappa A})=1\) in Lemma 1 and obtain the required capacity c as,

$$\begin{aligned} c = \frac{\kappa }{\log (1+\kappa /\lambda )}. \end{aligned}$$
(8)

Using (7) and (8), we propose the following procedure to estimate the required capacity, at each time when new data becomes available.

Repeat for each time \(t=1,\ldots ,T\):

  1. 1.

    Let \(n_t\) be the number of arrivals by time t; collect data of inter-arrival times \(A_i\), \(i=1,\cdots ,n_{t}\);

  2. 2.

    Calculate the sample mean as \(\lambda _t=1/\sum _{i=1}^{n_t}A_i\), and then the required service capacity as \(c_t=\kappa / \log (1+\kappa /\lambda _t)\).

An observation from the above procedure is that the estimate \(c_t\) converges to the target given in (8) as time \(t \rightarrow \infty \). However, it should be noted that \(c_t\) does not give an unbiased estimate of c at any finite time t. Specifically, denote \(f(\lambda )\equiv \kappa / \log (1+\kappa /\lambda )\). Clearly, it is concave in \(\lambda \). Therefore, we have

$$ E(c_t) = Ef({\lambda _t}) \le f(E({\lambda _t})) = c.$$

The data-driven approach introduced in the next section avoids this pitfall and gives a more direct approximation.

4 Data-Driven Approach via Sample Average Approximation

The data-driven approach applies to the more general G/D/1 system, compared to the parametric approach in Sect. 3. As the service time is deterministic, the Lindley equation in (1) can be used to estimate a customer’s waiting time \(w_m\) upon her arrival (i.e., whenever \(A_m\) is observed). Therefore, Lemma 2 suggests an estimate of the probability \(Pr(w >b)\) by \(Pr_t := {\sum _{m=1}^{n_t}I_{(w_{m}>b)}} / {n_t}\) at any time t, where \(n_t\) denote the number of arrivals by time t. Recall, given the (primitive) arrival data by time t, the performance measures, i.e., \(w_1, \cdots , w_{n_t}\) and \(Pr_t\), are functions of the suppressed parameter c. Thus, we propose the following procedure to estimate the required service capacity.

Repeat for each time \(t=1,\ldots ,T\):

  1. 1.

    Collect data of inter-arrival times \(A_i\), \(i=1,\cdots ,n_{t}\);

  2. 2.

    Search the minimum c, denoted by \(c_t\), that ensures \(Pr_t \le \alpha \).

The procedure can be improved by applying bootstrapping to generate a sufficiently long sample path, \(\{{\tilde{A}}_i, i=1,\cdots ,N \}\). Here, N is a sufficiently large integer, and each interarrival time, \({\tilde{A}}_i\), is drawn with equal probability from the set of interarrival times, \(\{A_i, i=1,\cdots , n_t \}\), that is available by time t. Then, in the above procedure, calculate the probability (estimate) \(Pr_t\) as:

$$Pr_t = \frac{\sum _{i=1}^{N}I_{(w_{i}>b)}}{N},$$

where the waiting time \(w_m\) is now determined by (1) with \(A_i\) replaced by \({\tilde{A}}_i\).

5 Numerical Studies

We conduct a numerical experiment to compare the parametric approach with the data-driven approach using combinations of parameters in Table 1. The numerical experiment will run \(4\times 4\times 4\) times, covering all combinations of the parameters.

Table 1. Parameters

5.1 Convergence Analysis

First we investigate the rate of convergence of each approach. Since both parametric and data-driven approaches can be applied to M/D/1 system, we compare their numerical results for this case.

Given an arrival rate \(\lambda \) and a capacity c, we can use effective bandwidth theory, i.e., Lemma 1 along with (3), to calculate the upper bound and lower bound of the tail probability. Hence, we can determine the capacity c by binary search so that the tail probability meets the service level requirement with the specified values of b and \(\alpha \). Table 2 shows the (relative) error between the upper bound and lower bound for different arrival rates and capacities (\(\lambda \) and c) for \(\alpha =0.05\). The error, or ratio, is given as \((UB-LB)/LB\). Consider the case that the average arrival rate is \(\lambda = 20\) and it is required that customers be served within 7 days (\(b=7\)). When the service capacity ranges from 20.05 to 20.4 (or, the traffic intensity varies from 99.75% to 98.03%), the error is always kept under 3%, which is consistent with the observation given in (6). A similar observation can be found for the case of \(\lambda =10\) and \(b=7\). Hence, these observations indicate that both bounds as given in Lemma 1 can be used as an effective approximation for the tail probability of the stationary waiting time of customers.

Next we use the bounds given in (3) as the benchmark for evaluating the data-driven approach and parametric estimation described in Sects. 3 and 4.

Table 2. Errors of bounds using effective bandwidth for Poisson arrival process

Define the relative gap, \({|c_{\mathrm {d}}-c|}/{|c_{\mathrm {e}}-c|}\), where \(c_{\mathrm {d}}\) and \(c_{\mathrm {e}}\) are the capacity estimated using the data-driven and parametric approaches, and c is the optimal service capacity computed using the given parameters (\(\lambda \), b and \(\alpha \)). In other words, it is the ratio of the gap between the data-driven estimated capacity and the optimal capacity to the gap between the effective bandwidth estimation and the optimal capacity. Below, as we do not have the exact optimal capacity c, we will use its two bounds and the average of both.

We use the values of the three parameters \(\lambda \), b and \(\alpha \) in Table 1, fixing one of them at a time and varying the other two. For each combination of the parameter values, we calculate the relative gap.

The Impact of \(\varvec{\lambda }\) . Figure 1 plots the average of relative gaps for all combinations of parameter values given a fixed value of \(\lambda \). We can see that for any \(\lambda > 0\), the relative gap (ratio) is less than 100%, which indicates that the data-driven approach outperforms the parametric approach. In addition, as the arrival rate \(\lambda \) increases the performance difference of the two approaches diminishes.

Fig. 1.
figure 1

The impact of \(\lambda \)

The Impact of b and \(\alpha \) . Figures 2 and 3 plot the average of relative gaps for all combinations of parameter values given a fixed value of b and a fixed value of \(\alpha \), respectively. Figure 2 shows that as the waiting time target b decreases, the relative gap is greatly reduced; that is, the data-driven approach performs much better than the parametric approach for a shorter waiting time target b. As the relative gap is below 100% for a large range of values of b, the data-driven approach outperforms the parametric approach. Figure 3 shows a similar pattern as Fig. 2 except that the relative gap decreases at large values of \(\alpha \).

Fig. 2.
figure 2

The impact of b

Fig. 3.
figure 3

The impact of \(\alpha \)

Extreme Cases. Figures 1, 2 and 3 demonstrate that the data-driven approach performs much better than the parametric approach at small values of the parameters \(\lambda \), b and \(\alpha \). Figure 4 shows the estimated capacities by the two approaches for small values of \(\lambda \), b and \(\alpha \).

In the figure, ‘PE’ represents ‘parametric estimation’, ‘BUB’ and ‘BLB’ are the estimated upper bound and lower bound by effective bandwidth for a given (actual) arrival rate. From the figure, the gap between the upper bound and lower bound with actual arrival rate as benchmark is very small, hence they provide a reliable estimate when all parameters are known. We also observe that both parametric and data-driven estimations converge to the theoretical optimal capacity as time becomes longer, and that the data-driven approach converges slightly faster.

Fig. 4.
figure 4

Extreme case analysis

5.2 Advantages of Data-Driven Approach over Parametric Approach for G/D/1 Queue

The numerical result in Sect. 5.1 for the M/D/1 system has shown that the data-driven approach performs better than the parametric approach in all combinations of our parameter values. We examine if the finding still holds for a more general queueing system, for example, G/D/1. We investigate the performance of parametric estimation in this case by using Lemma 1.

Stability and Accuracy of Estimating the Parameter \(\kappa \) . For the parametric estimation, in order to estimate the tail probability of waiting time, we need to first estimate the parameter \(\kappa \) in the upper bound and lower bound as given in (5). For a general arrival process, by Lemma 1 the upper bound and lower bound for the tail probability of waiting time depend on the (empirical) cumulative density function of the number of daily arrivals. We use exponential distribution for the inter-arrival time, and the numerical results are listed in Table 3. The computation is conducted as follows: Generate a large number of exponentially distributed samples and use empirical cumulative distribution function to approximate its distribution, and then use (3) to derive the estimated parameter \(\kappa _{\mathrm {e}}\). \(\kappa \) is calculated by (3) based on the assumption of exponential interarrival time. Theoretically the two numbers should be the same because they are based on the same distribution and parameter values. As shown in Table 3, however, \(\kappa _{\mathrm {e}}\) (by parametric estimation) does not converge to the real optimal \(\kappa \), which indicates that the estimation formulated by effective bandwidth does not work any more when the interarrival time is fitted by the empirical distribution.

Table 3. Gaps of bounds when the inter-arrival time has a general distribution

Parametric Estimation for the Case of Log-Normal Distribution. Consider the case with log-normal interarrival time that has parameters \((\mu _{\mathrm {l}},\sigma _{\mathrm {l}})\). For comparison, also scale its mean to be \(1/\lambda \). We apply the effective bandwidth theory explained in Lemma 1 to show the gap between two bounds.

As we can see from Table 4, the gaps of two bounds are all greater than 10%, and in some cases (e.g., the 1st and the 4th cases) the upper bound can not be found in reasonable time. Therefore, the parametric approach based on effective bandwidth does not give a reliable estimate here.

Table 4. Gaps of bounds when the inter-arrival time has a lognormal distribution

According to the above two numerical experiments, when the arrival process is more general (i.e., not necessarily Poisson), the parametric approach does not give an accurate approximation for the tail probability of waiting time while the data-driven approach works well.