1 Introduction

In this era of instant gratification, consumers demand access to goods and services in real time. Businesses, therefore, often have to schedule their delivery of goods and services without complete knowledge of future requests or their order of arrival. To assist with this, we need robust and competitive online algorithms that immediately and irrevocably allocate resources to requests in real time at minimal cost.

In this paper, we consider the celebrated k-server problem. Initially, we are given a set of locations \(\mathcal {K}\) for the k servers. These are locations in a discrete metric space. Requests are generated in the same metric space by an adversary and are revealed one at a time. When a request \(r \in \sigma \) is revealed, we have to immediately move one of the k servers to serve this request and incur a cost equal to the distance between the server's current location and the request. The objective is to design an algorithm that allocates servers to requests and is competitive with respect to the minimum-cost offline solution.

For any algorithm \(\mathcal {A}\), initial configuration of servers \(\mathcal {K}\), and a sequence of requests \(\sigma \), let \(w_{\mathcal {A}}(\sigma ,\mathcal {K})\) be the cost incurred when the algorithm \(\mathcal {A}\) assigns the requests to servers. Let \(w_{\mathrm {OPT}}(\sigma ,\mathcal {K})\) be the cost of the minimum-cost solution generated by an offline algorithm that has complete knowledge of the request sequence \(\sigma \) and assigns servers to requests based on their arrival order. We say that \(\mathcal {A} \) is \(\alpha \)-competitive if, for a constant \(\varPhi _0\ge 0 \), the cost incurred by \(\mathcal {A}\) satisfies,

$$w_{\mathcal {A}}(\sigma ,\mathcal {K})\le \alpha w_{\mathrm {OPT}}(\sigma , \mathcal {K}) + \varPhi _0$$

for any request set and any arrival order.

In the adversarial model, there is an adversary who knows the server locations and the assignments made by the algorithm and generates a sequence to maximize \(\alpha \). In the random arrival model [1], the adversary chooses the locations of the requests in \(\sigma \) before the algorithm executes, but their arrival order is a permutation chosen uniformly at random from the set of all possible permutations of the requests. In practical situations, it may be useful to assume that the requests arrive i.i.d. from a known or an unknown distribution \(\mathcal {D}\). Under these (known and unknown) models, the adversary is weaker than in the random arrival model; therefore, an algorithm's competitive ratio in the random arrival model is an upper bound on its competitive ratio in the known and unknown distribution models; see [2] for an algorithm in these models. We refer to the k-server problem under the known and unknown distribution models as well as the random arrival model as the stochastic k-server problem.

For the stochastic k-server problem, the competitive ratio is expressed with respect to the expected costs. More specifically, \(\mathcal {A}\) is \(\alpha \)-competitive if,

$$\begin{aligned} \mathbb {E}[w_{\mathcal {A}}(\sigma ,\mathcal {K})]\le \alpha \mathbb {E}[w_{\mathrm {OPT}}(\sigma , \mathcal {K})] + \varPhi _0. \end{aligned}$$

Previous Work. The k-server problem is central to the theory of online algorithms. The problem was first posed by Manasse et al. [3]. In the adversarial model, the best-known deterministic algorithm for this problem is the \((2k-1)\)-competitive work function algorithm [4]. It is known that no deterministic algorithm can achieve a competitive ratio better than k, and it is conjectured that there is, in fact, a k-competitive algorithm for this problem. This conjecture is popularly called the k-server conjecture.

Bansal et al. [5] presented an \(O(\log ^{O(1)} n \log k)\)-competitive randomized algorithm for the k-server problem in the oblivious adversary model. This model is similar to the adversarial model; however, the adversary does not have any knowledge of the random choices made by the algorithm. There is also an online algorithm [6] for the closely related online metric matching problem. This algorithm is known to achieve an optimal competitive ratio of \(2H_n -1\) in the random arrival model. To the best of our knowledge, the stochastic k-server problem has not been studied before.

In the stochastic models, we can view any initial set of requests as a sample chosen uniformly at random from \(\sigma \). In this paper, using this sample, we approximate the k-median of the remaining requests, leading to an improved algorithm for the stochastic k-server problem. Independent and uniform random samples have been used before in the context of sub-linear time algorithms for the k-median problem; see [7, 8]. In [7], it has been shown that a random sample of size \(\tilde{O}(\varDelta /\epsilon ^2)\) can be used to approximate the k-median within a constant factor of the optimal k-median with an additional additive cost of \(\epsilon n\). Meyerson et al. [8] show that if all the optimal k-median clusters are dense (\({\ge }\epsilon n/k\)), then a very small random sample of size \({\approx }k/\epsilon \) can be used to approximate the k-median within a constant factor. In this paper, we present and analyze a k-median based deterministic algorithm for the stochastic k-server problem.

Our Results. First, we present a simple algorithm, which we refer to as the zoned algorithm, for the k-server problem in the known distribution model, i.e., the request locations are i.i.d. from a distribution \(\mathcal {D}\) on the discrete metric space. The zoned algorithm associates one server with each of the centers of the k-median of \(\mathcal {D}\); when a request arrives, the algorithm simply assigns the server associated with the closest k-median center to this request. The cost of serving any request is lower bounded by the average k-median cost of the distribution \(\mathcal {D}\). Using the triangle inequality, we can bound the cost incurred by the zoned algorithm by twice the cost incurred by any optimal online algorithm for this problem.

Next, for the unknown distribution model and the random arrival model, we present an adaptive version of the zoned algorithm. Let \(\sigma =\langle r_1,\ldots , r_n\rangle \) be the request sequence. Our algorithm batches the requests into \(\log n\) groups, where the \((i+1)^{st}\) group (denoted by \(\sigma _{i+1}\)) contains the requests \(\langle r_{2^i +1},\ldots , r_{2^{i+1}}\rangle \). To process the requests of group \((i+1)\), we apply the zoned algorithm using the k-median centers of the first \(2^i\) requests.

In the random arrival model, the first t requests form a subset of size t chosen uniformly at random. Using existing bounds [7], a random subset of size \(\tilde{O}(\varDelta /\epsilon ^2)\) can be used to estimate the average k-median cost within a constant factor with an additional additive cost of \(\epsilon \). Unfortunately, despite having a large random subset (\(\sigma _0\cup \sigma _1\cup \cdots \cup \sigma _{\log n -1}\)) of n/2 requests (i.e., \(\epsilon \approx \sqrt{\varDelta /n}\)) to estimate the k-median of the n/2 requests of \(\sigma _{\log n}\), we can only bound the average k-median cost within a constant factor with an additional additive cost of \(\epsilon n/2 \approx \sqrt{\varDelta n}\). Lower bounds on uniform-random-sample based estimation of the k-median suggest that there is very little scope for improving this analysis for small random subsets; see [7] for details on the upper and lower bounds. In our case, since the sample size is large (\({=}\,n/2\)), we present a different analysis to show that the k-median of this large random subset is a good proxy for the k-median of \(\sigma _{i+1}\). Using this analysis, we show that serving the requests of \(\sigma _{i+1}\) incurs an additive cost of only \(k\varDelta \) (independent of n), and that the total additive cost over all \(\log n\) groups is \(O(k\varDelta \log n)\), leading to the following theorem (in Sect. 4.2):

Theorem 1

Let \(\sigma \) be a sequence of n requests from a discrete metric space \((X,d)\) with diameter \(\varDelta \). For any \(\alpha > 1.5\), the expected cost of the adaptive zoned algorithm (Algorithm 2) for serving \(\sigma \) is upper bounded as follows.

$$\begin{aligned} \mathbb {E}[w_{\mathcal {A}}(\sigma ,\mathcal {K})]\le 2\alpha n\,\mathrm {medavg}(\sigma )+ \bigg (\frac{8}{e}\Big (\frac{2\alpha +1}{2\alpha -3}\Big )^2+ 1\bigg )k\varDelta \log n +\varPhi _0. \end{aligned}$$

Here \(\mathrm {medavg}(\sigma )\) is the average k-median cost of all the requests in \(\sigma \) and is formally defined in Sect. 2.

In the random arrival model, the cost of serving the ith request can be lower bounded by the average k-median cost of the remaining (unprocessed) \(n-i+1\) requests. We show (in Sect. 4.3) that the lower bound on the cost of serving all the requests can still be related to the average k-median cost \(\mathrm {medavg}(\sigma )\) within an additive cost of \(O(k\varDelta \log n)\).

Theorem 2

Let \((X,d)\) be any discrete metric space with diameter \(\varDelta \) and let \(\sigma \) be a multi-set of n points from X. Let \(\mathcal {A}\) be any online algorithm that serves \(\sigma \) under the random arrival model, with initial configuration of servers \(\mathcal {K}\). Then, for any \(0<\delta <1\), the expected cost of \(\mathcal {A}\) satisfies

$$\begin{aligned} \mathbb {E}[w_{\mathcal {A}}(\sigma ,\mathcal {K})] \ge \frac{n-1}{2}\delta \,\mathrm {medavg}(\sigma )-\frac{2\delta }{(1-\delta )^2}(k+2)\varDelta \log n. \end{aligned}$$

Combining the two theorems, it follows that the adaptive zoned algorithm performs within a constant factor of the cost incurred by the best online algorithm, up to an additional additive cost of \(O(k\varDelta \log n)\).

In Sect. 2, we present the basic terminology required for our algorithm and its analysis. In Sect. 3, we present the zoned algorithm for the known distribution model along with its analysis. In Sect. 4, we present the adaptive zoned algorithm in the random arrival model. We present our analysis of the upper bound of the cost in Sect. 4.2 and lower bound of the cost in Sect. 4.3.

2 Preliminaries

Let P be a multi-set of n points in a given discrete metric space \((X,d)\). For any point \(p \in P\) and a set \(K \subset X\), we define \(d(p,K)\) to be the distance of p to its nearest neighbor in K. We define the distance of the set P to the set K, denoted by \(d(P,K)\), as

$$d(P,K)= \sum _{p\in P}d(p, K).$$

The average distance of P from K, denoted as \(d_{\text {avg}}(P,K)\) is \(d_{\text {avg}}(P,K)=\frac{1}{|P|}\sum _{x\in P}d(x,K)\). We define the k-median of P to be a set of k points \(K^* \subseteq X\), given by

$$\begin{aligned} K^* =\mathop {\hbox {arg min}}_{K\subset X, |K|=k} d(P,K). \end{aligned}$$

We refer to \(K^*\) as the k-median centers of P. The cost of the k-median \(K^*\) denoted by \(\mathrm {med}(P)\) is \( \mathrm {med}(P) = d(P,K^*)\), and the average cost of this k-median, denoted by \(\mathrm {medavg}(P)\), is given by

$$\mathrm {medavg}(P)= \frac{\mathrm {med}(P)}{|P|}.$$

In several instances, we denote the k-median of a set A by \(K^A\).

The definition of k-median \(K^*\) extends easily to the case where we are given a probability distribution \(\mathcal {D}(\cdot )\) on the discrete metric space X:

$$\begin{aligned} K^*= \mathop {\hbox {arg min}}_{K \subset X, |K|=k}\sum _{x \in X} \mathcal {D}(x) d(x,K), \end{aligned}$$

and let \(\mathrm {medavg}(\mathcal {D},X)=\sum _{x \in X} \mathcal {D}(x) d(x,K^*)\).
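
To make these definitions concrete, the following Python sketch computes \(\mathrm {med}(P)\) and \(\mathrm {medavg}(P)\) by exhaustive search over all k-subsets of X. It is illustrative only (exponential in k) and is not an algorithm used in this paper; the example metric and point values are hypothetical.

```python
from itertools import combinations

def d(p, K, dist):
    """Distance from a point p to its nearest neighbor in the set K."""
    return min(dist(p, c) for c in K)

def k_median(P, X, k, dist):
    """Brute-force k-median of the multi-set P over candidate centers X.

    Returns (K_star, med(P), medavg(P)) as defined above. Exponential in k;
    intended only to illustrate the notation of Sect. 2."""
    best_K, best_cost = None, float("inf")
    for K in combinations(X, k):
        cost = sum(d(p, K, dist) for p in P)   # d(P, K)
        if cost < best_cost:
            best_K, best_cost = K, cost
    return best_K, best_cost, best_cost / len(P)

# Small example on the line metric X = {0, ..., 9} with dist(x, y) = |x - y|.
if __name__ == "__main__":
    X = list(range(10))
    P = [0, 1, 1, 8, 9, 9]
    K_star, med, medavg = k_median(P, X, k=2, dist=lambda x, y: abs(x - y))
    print(K_star, med, medavg)   # (1, 9) 2 0.333...
```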

Theorem 3

(Chernoff bounds). Suppose \(X_1,\ldots ,X_n\) are independent binary random variables, X denotes their sum, and \(\mu =\mathbb {E}[X]\). Then

$$\begin{aligned} \mathbb {P}[X \ge (1+\delta )\mu ] \le e^{-\delta ^2\mu /3},&\quad 0<\delta <1,\end{aligned}$$
(1a)
$$\begin{aligned} \mathbb {P}[X \ge (1+\delta )\mu ] \le e^{-\delta \mu /3},&\quad 1<\delta , \end{aligned}$$
(1b)
$$\begin{aligned} \mathbb {P}[X \le (1-\delta )\mu ] \le e^{-\delta ^2\mu /2},&\quad 0<\delta <1. \end{aligned}$$
(1c)

3 Zoned Algorithm

We begin by introducing the zoned algorithm for the k-server problem in the known distribution model (Algorithm 1) and the random arrival model (Algorithm 2). The core idea behind the algorithm is that the discrete metric space \((X,d)\) can be partitioned into k zones, each with a single server. Any request in a given zone is served by the corresponding server of that zone. For the known distribution model, the partition is induced by the k-median of the distribution and is presented below.

Algorithm 1. Zoned algorithm for the known distribution model.
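
Since the pseudocode of Algorithm 1 is not reproduced above, the following Python sketch reconstructs it from the prose description. It assumes that the k-median centers \(K^*\) of \(\mathcal {D}\) have already been computed (e.g., by the brute-force routine in Sect. 2) and that the initial servers are matched to the centers in an arbitrary fixed order; this matching cost corresponds to the additive constant \(\varPhi _0\).

```python
def zoned_algorithm(requests, centers, initial_servers, dist):
    """Sketch of the zoned algorithm (Algorithm 1), reconstructed from the prose.

    `centers` is the k-median K* of the known distribution D. Each center c has
    one associated server phi(c); every request is served by the server of its
    nearest center, and that server then waits at the request point."""
    # Move each server to its associated center; this matching cost is Phi_0.
    total_cost = sum(dist(s, c) for s, c in zip(initial_servers, centers))
    position = {c: c for c in centers}                 # phi(c) currently sits at c
    for r in requests:
        c = min(centers, key=lambda x: dist(r, x))     # zone containing r
        total_cost += dist(position[c], r)             # move server phi(c) to r
        position[c] = r                                # it remains at r
    return total_cost
```

The bookkeeping matches the proof of Theorem 4 below: the server of a zone always moves directly from its previous request to the new one, which by the triangle inequality costs at most the detour through the zone's center.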

The following lemma bounds the cost of any online algorithm from below in terms of the k-median cost.

Lemma 1

Given a discrete metric space \((X,d)\) and any request sequence \(\sigma \) of n locations chosen i.i.d. from a known distribution \(\mathcal {D}\) on X, the expected cost of any online algorithm \(\mathbb {A}\) satisfies

$$ \mathbb {E}[w_{\mathbb {A}}(\sigma ,\mathcal {K})]\ge n \cdot \mathrm {medavg}(\mathcal {D}, X). $$

Proof

When a request arrives, the algorithm must assign a server to it, so the cost of serving the request is at least the distance from the closest server to the request. The expected cost of serving a request is therefore bounded from below by the expected distance of the request to its nearest server. This quantity is minimized when the servers are in the configuration of the k-median centers \(K^*\), in which case the expected distance is \(\mathrm {medavg}(\mathcal {D},X)\). The result then follows by linearity of expectation.

Theorem 4

The zoned algorithm \(\mathcal {A}\) has an expected cost that is at most twice the expected cost incurred by any optimal online algorithm in the known distribution model, up to the additive constant \(\varPhi _0\).

Proof

Let the initial configuration of the k servers be \(\mathcal {K}\). For every request \(r \in \sigma \), let \(c \in K^*\) be its closest center. The zoned algorithm moves the server \(\phi (c)\) to the request point r. By the triangle inequality, this distance is at most the distance the server would travel if it first moved back to \(c\) and then to r. Under such a modification, every request therefore incurs at most two costs: the movement of \(\phi (c)\) from its previous request back to \(c\), and the movement from \(c\) to r. The expected distance of any request to its closest center is \(\mathrm {medavg}(\mathcal {D},X)\), so using the modification, by linearity of expectation, and by Lemma 1:

$$\begin{aligned} \mathbb {E}[w_{\mathcal {A}}(\sigma ,\mathcal {K})] \le 2n \cdot \mathrm {medavg}(\mathcal {D},X) \ + \varPhi _0 \le 2\,\mathbb {E}[w_\mathbb {A}(\sigma ,\mathcal {K})] + \varPhi _0 \end{aligned}$$

for n requests and any online algorithm \(\mathbb {A}\), where \(\varPhi _0\) is the cost of matching the initial configuration \(\mathcal {K}\) to the centers \(K^*\).

4 Random Arrival Model

4.1 Adaptive Zoned Algorithm

We present and analyze a slightly modified zoned algorithm for the random arrival model. We partition the request sequence \(\sigma \) into groups \(\sigma =\sigma _0\,^{\frown }\sigma _1\,^{\frown }\cdots \,^{\frown }\sigma _{\log n}\), where \(|\sigma _0| =1\) and \(|\sigma _i|=2^{i-1}\). For the requests in \(\sigma _i\), we apply the zoned algorithm using the k-median of the requests in \(\sigma _0\cup \sigma _1\cup \cdots \cup \sigma _{i-1}\). Note that after serving the requests in \(\sigma _i\), we recompute the k-median of all the requests seen so far and move the k servers to these locations (implicitly, through the mapping \(\phi \)). This results in a total reconfiguration cost bounded by \(O(k\varDelta \log n)\). The algorithm is presented next.

Algorithm 2. Adaptive zoned algorithm for the random arrival model.
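
As with Algorithm 1, the pseudocode figure is not reproduced above; the sketch below reconstructs the adaptive zoned algorithm from the description in Sect. 4.1. The helper `k_median(points, k)` is assumed to return the k-median centers of a multi-set (exact or approximate), and \(\sigma _0\), a single request, is served here by its nearest initial server (any choice costs at most \(\varDelta \)).

```python
def adaptive_zoned_algorithm(requests, k, initial_servers, dist, k_median):
    """Sketch of the adaptive zoned algorithm (Algorithm 2), reconstructed from
    Sect. 4.1: group sigma_i (|sigma_0| = 1, |sigma_i| = 2^(i-1)) is served by
    the zoned algorithm driven by the k-median of all previously seen requests.
    Assumes k_median always returns exactly k centers."""
    n = len(requests)
    servers = list(initial_servers)                    # current server positions
    total_cost, served, i = 0.0, 0, 0
    while served < n:
        size = 1 if i == 0 else 2 ** (i - 1)           # |sigma_i|
        group = requests[served:served + size]
        if i == 0:
            # sigma_0: a single request, served by the nearest initial server.
            j = min(range(len(servers)), key=lambda j: dist(servers[j], group[0]))
            total_cost += dist(servers[j], group[0])
            servers[j] = group[0]
        else:
            centers = k_median(requests[:served], k)   # k-median of sigma_0 ... sigma_{i-1}
            # Reconfiguration: send server j to center j (any fixed matching);
            # this contributes at most k * Delta per group.
            for j, c in enumerate(centers):
                total_cost += dist(servers[j], c)
            position = {c: c for c in centers}         # where phi(c) currently is
            for r in group:
                c = min(centers, key=lambda x: dist(r, x))
                total_cost += dist(position[c], r)     # serve r with phi(c)
                position[c] = r
            servers = [position[c] for c in centers]   # positions after this group
        served += size
        i += 1
    return total_cost
```

The cost bookkeeping mirrors the analysis in Sect. 4.2: each per-group reconfiguration contributes at most \(k\varDelta \), and the within-group cost is bounded using Lemma 3.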

4.2 Upper Bound

We use the following simple lemma in bounding the cost of the adaptive zoned algorithm.

Lemma 2

Let P be any set of n points and let Q be a subset of P that is chosen uniformly at random from all possible subsets of size t. Then,

$$ \mathbb {E}[d(Q,K^P)] = t \cdot \mathrm {medavg}(P). $$

Proof

Let \(\mathcal {S}_{t}\) be the set of all subsets of P with cardinality exactly t. The number of such subsets is \(|\mathcal {S}_{t}| =\) \(\left( {\begin{array}{c}n\\ t\end{array}}\right) \). Therefore, the expected value of \(d(Q,K^P)\) can be written as

$$\mathbb {E}[d(Q,K^P)] = \sum _{Q\in \mathcal {S}_{t}}\frac{1}{\left( {\begin{array}{c}n\\ t\end{array}}\right) }\sum _{q\in Q} d(q, K^P).$$

Every point \(q \in P\) appears in exactly \(\left( {\begin{array}{c}n-1\\ t-1\end{array}}\right) \) subsets of \(\mathcal {S}_{t}\). Therefore, we can rewrite the expected value as

$$ \mathbb {E}[d(Q,K^P)] = \frac{\left( {\begin{array}{c}n-1\\ t-1\end{array}}\right) }{\left( {\begin{array}{c}n\\ t\end{array}}\right) } \sum _{q \in P} d(q, K^P) = \frac{t}{n} \sum _{q\in P} d(q,K^P) = t\cdot \mathrm {medavg}(P).$$
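
The identity in Lemma 2 is easy to check numerically. The following sketch (which assumes the brute-force `k_median` and `d` helpers from the snippet in Sect. 2) samples random size-t subsets of P and compares the empirical mean of \(d(Q,K^P)\) with \(t\cdot \mathrm {medavg}(P)\).

```python
import random

def check_lemma_2(P, X, k, t, dist, trials=20000):
    """Monte Carlo sanity check of Lemma 2: E[d(Q, K^P)] = t * medavg(P)
    for a uniformly random size-t subset Q of P. Assumes the k_median and d
    helpers sketched in Sect. 2."""
    K_P, _, medavg_P = k_median(P, X, k, dist)
    empirical = 0.0
    for _ in range(trials):
        Q = random.sample(range(len(P)), t)            # sample indices, so a multi-set P is handled
        empirical += sum(d(P[i], K_P, dist) for i in Q)
    return empirical / trials, t * medavg_P            # the two values should agree closely
```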

Lemma 3

Let P be a set of n points with diameter \(\varDelta \), where n is a power of 2. For a random permutation of the points of P, let A and B be the first n/2 and the last n/2 points of this permutation, respectively. Let \(K^P\) denote the k-median centers of P and \(K^A\) the k-median centers of A. For any \(\alpha > 1.5\), the expected value of the distance \(d(B, K^{A})\) (over the random permutation) is at most \( \alpha n\,\mathrm {medavg}(P)+ \frac{4}{e}\Big (\frac{2\alpha +1}{2\alpha -3}\Big )^2k\varDelta . \)

Proof

Since A consists of the first n/2 points of a random permutation of P, it can be viewed as a subset chosen uniformly at random from the set of all subsets of size n/2. For the optimal k-median centers of A, namely \(K^A\), we will bound the expected cost \(d(B, K^A)\). To help with the analysis, we partition P into k clusters by assigning each point \(p \in P\) to its closest k-median center in \(K^P\) (the optimal k-median of P). Let \(C^{P}_j\) denote the jth cluster with \(k^{P}_j \in K^P\) as its median center, and let \(s_j = |C^{P}_j|\) be the size of this cluster. Let \(A_j = A\cap C^{P}_j\) and \(B_j = B\cap C^{P}_j\). By using the triangle inequality (as in [8]), we can bound the distance of the median center \(k^{P}_j\) of cluster \(C^{P}_j\) to its closest median center in \(K^A\) by

$$\begin{aligned} d(k^{P}_j,K^A)\le \min _{x\in A_j}\left( d(k^{P}_j,x)+d(x,K^A)\right) \le \frac{1}{|A_j|}\sum _{x\in A_j}\left( d(k^{P}_j,x)+d(x,K^A)\right) . \end{aligned}$$

The last inequality follows from the fact that the minimum of a set of numbers is bounded from above by its average. Using this bound, we can bound the distance from any point \(y\in C^{P}_j\) to its closest median center in \(K^A\) by

$$\begin{aligned} d(y,K^A)\le d(y,k^{P}_j)+d(k^{P}_j,K^A) \le d(y,k^{P}_j)+\frac{1}{|A_j|}\sum _{x\in A_j}(d(k^{P}_j,x)+d(x,K^A)). \end{aligned}$$

Therefore, the total distance of points in \(B_j\) to their closest center in \(K^A\) is bounded by

$$\begin{aligned} d(B_j,K^A) \le \sum _{y\in B_j}d(y,k^{P}_j)+ \frac{|B_j|}{|A_j|}\sum _{x\in A_j}\left( d(k^{P}_j,x)+d(x,K^A)\right) . \end{aligned}$$
(2)

Next, we will bound the expected value of \(d(B_j, K^A)\). Note that \(\mathbb {E}[|A_j|]=\frac{s_j}{2}= \mathbb {E}[|B_j|]\). Consider the event \(\mathcal {E}:\,(|A_j|>(1-\delta )\mathbb {E}[|A_j|])\), where \(0<\delta <1\). Applying the Chernoff bound (Theorem 3, inequality (1c)), the event \(\mathcal {E}\) occurs with probability at least \(1-\exp \big (-\frac{\delta ^2s_j}{4}\big )\). When event \(\mathcal {E}\) occurs,

$$\frac{|B_j|}{|A_j|} = \frac{s_j-| A_j|}{|A_j|}< \frac{s_j(1-\frac{1-\delta }{2})}{(1-\delta )\frac{s_j}{2}}= \frac{1+\delta }{1-\delta }.$$

Consequently, we can bound \(d(B_j,K^A)\) by

$$\sum _{y\in B_j}d(y,k^{P}_j)+ \frac{1+\delta }{1-\delta }\sum _{x\in A_j}\left( d(k^{P}_j,x)+d(x,K^A)\right) .$$

When \(\mathcal {E}\) does not occur (which has a probability of at most \(\exp \big (-\frac{\delta ^2s_j}{4}\big )\)), i.e., \(|A_j| \le (1-\delta )\mathbb {E}[|A_j|]\), we use a trivial upper bound of \(d(B_j,K^A)\le s_j\varDelta \). Applying this, we have the following upper bound for \(\mathbb {E}[d(B_j,K^A)]\), where the expectation is over all possible permutations of P.

$$\begin{aligned} \mathbb {E}[d(B_j,K^A)]&=\Pr (\mathcal {E})\mathbb {E}[d(B_j,K^A)\mid \mathcal {E}] + \Pr (\overline{\mathcal {E}})\mathbb {E}[d(B_j,K^A)\mid \overline{\mathcal {E}}] \nonumber \\&< \mathbb {E}\bigg [\sum _{y\in B_j}d(y,k^{P}_j)+ \frac{1+\delta }{1-\delta }\sum _{x\in A_j}\left( d(k^{P}_j,x)+d(x,K^A)\right) \bigg ] \nonumber \\&+\exp \Big (-\frac{\delta ^2s_j}{4}\Big ) s_j\varDelta . \end{aligned}$$
(3)

We set \(\delta := 2\sqrt{\frac{\log (s_{j}/\tau )}{s_{j}}}\). When \(s_{j}\ge \tau \), the last term in (3) reduces to \(\exp \big (-\frac{\delta ^2s_{j}}{4}\big )s_{j}\varDelta \le \tau \varDelta \). When \(s_{j}<\tau \), we can simply bound \(d(B_j,K^A) \le |B_j|\varDelta < \tau \varDelta .\) Also, we note that \(\frac{1+\delta }{1-\delta }\) is a monotonically increasing function of \(\delta \) for \(0<\delta <1\). For \(s_{j}>0\), \(\delta =2\sqrt{\frac{\log (s_{j}/\tau )}{s_{j}}}\) attains its maximum value of \(\frac{2}{\sqrt{e\tau }}\) at \(s_{j}=e\tau \). Therefore, \(\frac{1+\delta }{1-\delta }\le \frac{\sqrt{e\tau }+2}{\sqrt{e\tau }-2}\). Using these bounds, we rewrite (3) as

$$\begin{aligned} \mathbb {E}[d(B_j,K^A)]&\le \mathbb {E}\bigg [\sum _{y\in B_j}d(y,k^{P}_j)+ \frac{\sqrt{e\tau }+2}{\sqrt{e\tau }-2}\sum _{x\in A_j}(d(k^{P}_j,x)+d(x,K^A))\bigg ]+ \tau \varDelta . \end{aligned}$$

Summing over all clusters,

$$\begin{aligned} \mathbb {E}[d(B,K^A)]&\le \mathbb {E}\bigg [\sum _{j=1}^k\sum _{y\in B_j}d(y,k^{P}_j)+ \frac{\sqrt{e\tau }+2}{\sqrt{e\tau }-2}\sum _{j=1}^k\sum _{x\in A_j}(d(k^{P}_j,x)+d(x,K^A))\bigg ]\\&+ k\tau \varDelta \\&\le \mathbb {E}\bigg [\sum _{y\in B}d(y,K^P)+ \frac{\sqrt{e\tau }+2}{\sqrt{e\tau }-2}\sum _{x\in A} (d(K^P,x)+d(x,K^A))\bigg ]+ k\tau \varDelta . \end{aligned}$$

Since \(\sum _{x\in A}d(x,K^A)\le \sum _{x\in A}d(x,K^P)\), we have

$$\begin{aligned} \mathbb {E}[d(B,K^A)]&\le \mathbb {E}\bigg [\sum _{y\in B}d(y,K^P)+ 2\frac{\sqrt{e\tau }+2}{\sqrt{e\tau }-2}\sum _{x\in A} d(x,K^P)\bigg ]+ k\tau \varDelta . \end{aligned}$$

Since A and B are subsets chosen uniformly at random from all possible subsets of P of size n/2, by Lemma 2, we have \(\mathbb {E}\big [\sum _{y\in B}d(y,K^P)\big ]=|B|\,\mathrm {medavg}(P)\) and \(\mathbb {E}\big [\sum _{y\in A}d(y,K^P)\big ]=|A|\,\mathrm {medavg}(P)\). Therefore,

$$\begin{aligned} \mathbb {E}[d(B,K^A)]&\le |B|\,\mathrm {medavg}(P)+ 2\frac{\sqrt{e\tau }+2}{\sqrt{e\tau }-2}|A|\,\mathrm {medavg}(P) + k\tau \varDelta \, \end{aligned}$$

and,

$$\begin{aligned} \mathbb {E}[d(B,K^A)]&\le \Big (\frac{\sqrt{e\tau }+2}{\sqrt{e\tau }-2}+\frac{1}{2}\Big )n\,\mathrm {medavg}(P) + k\tau \varDelta . \end{aligned}$$

Setting \(\alpha =\frac{\sqrt{e\tau }+2}{\sqrt{e\tau }-2}+\frac{1}{2}\), we arrive at the final expression.
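
For completeness, this choice of \(\alpha \) can be inverted to express \(\tau \) in terms of \(\alpha \), which gives the additive term stated in the lemma:

$$\begin{aligned} \alpha -\frac{1}{2}=\frac{\sqrt{e\tau }+2}{\sqrt{e\tau }-2} \implies \sqrt{e\tau }=\frac{2(2\alpha +1)}{2\alpha -3} \implies \tau =\frac{4}{e}\Big (\frac{2\alpha +1}{2\alpha -3}\Big )^2, \end{aligned}$$

so that \(k\tau \varDelta = \frac{4}{e}\big (\frac{2\alpha +1}{2\alpha -3}\big )^2k\varDelta \); the requirement \(\alpha >1.5\) corresponds to \(\sqrt{e\tau }>2\).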

Proof

(Proof of Theorem 1). Let \(\sigma =\sigma ^{(0)}\,^{\frown }\sigma ^{(1)}\,^{\frown }\sigma ^{(2)} \,^{\frown }\cdots \), where \(\,^{\frown }\) denotes concatenation, so that \(|\sigma ^{(0)}|=1\) and \(|\sigma ^{(i)}|=2^{i-1}\). Let \(T^{(i)} = \sigma ^{(0)}\,^{\frown }\sigma ^{(1)} \,^{\frown }\cdots \,^{\frown }\sigma ^{(i)}\) and let \(K^{(i)}\) denote the k-median centers of \(T^{(i)}\). Here, we use the terms multi-set and sequence interchangeably. In algorithm \(\mathcal {A}\), each group \(\sigma ^{(i)}\) is served using the centers \(K^{(i-1)}\). Noting that \(T^{(i)}=T^{(i-1)}\,^{\frown }\sigma ^{(i)}\) and \(|\sigma ^{(i)}|=|T^{(i-1)}|=2^{i-1}\), we apply Lemma 3 with \(P=T^{(i)}\), \(A=T^{(i-1)}\), and \(B=\sigma ^{(i)}\). The expected distance satisfies \(\mathbb {E}[d(\sigma ^{(i)},K^{(i-1)})]\le \alpha |T^{(i)}|\,\mathrm {medavg}(T^{(i)})+ \frac{4}{e}\Big (\frac{2\alpha +1}{2\alpha -3}\Big )^2k\varDelta \), and the cost to serve \(\sigma ^{(i)}\) is at most twice this quantity. In addition, for each \(\sigma ^{(i)}\), the cost to move the servers from their positions at the end of \(\sigma ^{(i-1)}\) to \(K^{(i-1)}\) is at most \(k\varDelta \). Therefore, \(\mathbb {E}[w_{\mathcal {A}}(\sigma ,\mathcal {K})]\) can be bounded as follows.

$$\begin{aligned} \mathbb {E}[w_{\mathcal {A}}(\sigma ,\mathcal {K})]&\le 2\sum _i\alpha |T^{(i)}|\,\mathrm {medavg}(T^{(i)})+ \sum _i\bigg (\frac{8}{e}\Big (\frac{2\alpha +1}{2\alpha -3}\Big )^2k\varDelta + k\varDelta \bigg ) + \varPhi _0. \nonumber \end{aligned}$$

Since \(\mathrm {medavg}(T^{(i)})=\frac{1}{|T^{(i)}|}\sum _{x\in T^{(i)}}d(x,K^{(i)})\le \frac{1}{|T^{(i)}|}\sum _{x\in T^{(i)}}d(x,K^*)\), where \(K^*\) denotes the k-median centers of \(\sigma \),

$$\begin{aligned} \mathbb {E}[w_{\mathcal {A}}(\sigma ,\mathcal {K})]&\le 2\alpha \sum _i\sum _{x\in T^{(i)}}d(x,K^*)+ \bigg (\frac{8}{e}\Big (\frac{2\alpha +1}{2\alpha -3}\Big )^2+ 1\bigg )k\varDelta \log n +\varPhi _0\\&= 2\alpha \sum _{x\in \sigma }d(x,K^*)+ \bigg (\frac{8}{e}\Big (\frac{2\alpha +1}{2\alpha -3}\Big )^2+ 1\bigg )k\varDelta \log n +\varPhi _0\\&= 2\alpha n\,\mathrm {medavg}(\sigma )+ \bigg (\frac{8}{e}\Big (\frac{2\alpha +1}{2\alpha -3}\Big )^2+ 1\bigg )k\varDelta \log n +\varPhi _0. \end{aligned}$$

This completes the proof.

4.3 Lower Bound

Let \(\mathcal {S}_{j}\) denote the set of all possible subsets of \(\sigma \) of size j. The expected value of \(\mathrm {medavg}(A)\) for a set A of size m chosen uniformly at random from \(\mathcal {S}_{m}\) is denoted by \(\mathrm {medavg}_{\mathbb {E}}(m)\), and defined as

$$ \mathrm {medavg}_{\mathbb {E}}(m)=\frac{1}{|\mathcal {S}_{m}|}\sum _{A\in \mathcal {S}_{m}}\,\mathrm {medavg}(A). $$

We first prove the following lemmas, which bound the cost of any online algorithm from below by \(\sum _{i=1}^{n}\mathrm {medavg}_{\mathbb {E}}(i)\).

Lemma 4

\( \mathbb {E}[w_{\mathcal {A}}(\sigma ,\mathcal {K})]\ge \sum _{i=1}^{n}\mathrm {medavg}_{\mathbb {E}}(i). \)

Proof

Suppose \(i-1\) requests have been served, and let A denote the subset of \(\sigma \) containing the points yet to be served, i.e., \(|A| = n-i+1\). Let K be the current configuration of the servers. Since every element of A is equally likely to be the next request, for any k-set K, the expected distance of the \(i^{th}\) request from K is

$$\begin{aligned} \frac{1}{|A|}\sum _{x\in A}d(x,K)\ge \mathrm {medavg}(A). \end{aligned}$$

Since every set \(A\in \mathcal {S}_{n-i+1}\) is equally likely, the expected cost of serving the ith request is at least \(\mathrm {medavg}_{\mathbb {E}}(n-i+1)\), and the result follows by summing over all i.

Lemma 5

For any \(m>m'\), \(\mathrm {medavg}_{\mathbb {E}}(m) \ge \mathrm {medavg}_{\mathbb {E}}(m')\).

Proof

Let \(A\in \mathcal {S}_{m}\) and let \(B\subset A\) be a subset of size \(m'\). Noting that every element of A occurs in exactly \(\left( {\begin{array}{c}m-1\\ m'-1\end{array}}\right) \) such subsets B, we have

$$\begin{aligned} \mathrm {medavg}(A)&=\frac{1}{|A|}\sum _{x\in A}d(x,K^A)=\frac{1}{m}\frac{1}{\left( {\begin{array}{c}m-1\\ m'-1\end{array}}\right) }\sum _{B\subset A,~|B|=m'}\sum _{x\in B}d(x,K^A) \\&\ge \frac{1}{m}\frac{1}{\left( {\begin{array}{c}m-1\\ m'-1\end{array}}\right) }\sum _{B\subset A,~|B|=m'}\sum _{x\in B}d(x,K^B)\\&=\frac{m'}{m}\frac{1}{\left( {\begin{array}{c}m-1\\ m'-1\end{array}}\right) }\sum _{B\subset A,~|B|=m'}\,\mathrm {medavg}(B). \end{aligned}$$

Since every \(B\in \mathcal {S}_{m'}\) is a subset of exactly \(\left( {\begin{array}{c}n-m'\\ m-m'\end{array}}\right) \) sets of \(\mathcal {S}_{m}\), summing over all \(A\in \mathcal {S}_{m}\) gives

$$\begin{aligned} \sum _{A\in \mathcal {S}_{m}}\,\mathrm {medavg}(A)&\ge \left( {\begin{array}{c}n-m'\\ m-m'\end{array}}\right) \frac{m'}{m}\frac{1}{\left( {\begin{array}{c}m-1\\ m'-1\end{array}}\right) }\sum _{B\in \mathcal {S}_{m'}}\,\mathrm {medavg}(B)\\&=\frac{|\mathcal {S}_{m}|}{|\mathcal {S}_{m'}|} \sum _{B\in \mathcal {S}_{m'}}\,\mathrm {medavg}(B). \end{aligned}$$

Dividing both sides by \(|\mathcal {S}_{m}|\) gives \(\mathrm {medavg}_{\mathbb {E}}(m)\ge \mathrm {medavg}_{\mathbb {E}}(m')\), as claimed.

Lemma 6

\( \mathbb {E}[w_{\mathcal {A}}(\sigma ,\mathcal {K})]\ge \left\lceil \frac{n}{2} \right\rceil \mathrm {medavg}_{\mathbb {E}}(\left\lceil \frac{n}{2} \right\rceil ). \)

Proof

From Lemma 5, for any \(m>\frac{n}{2}\), \(\mathrm {medavg}_{\mathbb {E}}(m)\ge \mathrm {medavg}_{\mathbb {E}}(\left\lceil \frac{n}{2} \right\rceil )\). From Lemma 4,

$$ \mathbb {E}[w_{\mathcal {A}}(\sigma ,\mathcal {K})]\ge \sum _{i=1}^{n}\mathrm {medavg}_{\mathbb {E}}(i)\ge \sum _{i=1}^{\left\lceil \frac{n}{2} \right\rceil }\mathrm {medavg}_{\mathbb {E}}(n-i+1)\ge \left\lceil \frac{n}{2} \right\rceil \mathrm {medavg}_{\mathbb {E}}(\left\lceil \frac{n}{2} \right\rceil ). $$

Following Meyerson et al. [8], for any k-set K of \(\sigma \), let \(\beta (K,b)=(B_1,B_2,\ldots ,B_b)\) denote the partition of \(\sigma \) induced by K as follows: we order the points of \(\sigma \) by their distance to the closest point in K (from low to high) and divide this sequence into b bins, each containing an equal number of points, \(\frac{n}{b}\). Henceforth, let \(b=\frac{n(1-\delta )^2}{4(k+2)\log n}\) for \(0<\delta <1\).
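
A minimal sketch of how \(\beta (K,b)\) can be constructed, assuming n is divisible by b and using the same Python conventions as the earlier snippets:

```python
def beta_partition(sigma, K, b, dist):
    """Partition beta(K, b) of the multi-set sigma induced by a k-set K:
    points are sorted by distance to their nearest center in K (low to high)
    and split into b bins of n/b points each (n is assumed divisible by b)."""
    n = len(sigma)
    ordered = sorted(sigma, key=lambda x: min(dist(x, c) for c in K))
    size = n // b
    return [ordered[j * size:(j + 1) * size] for j in range(b)]
```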

Lemma 7

Let A be a subset of \(\sigma \) of size \(\left\lceil \frac{n}{2} \right\rceil \) chosen uniformly at random. With probability at least \(1-\frac{1}{n}\), every bin of \(\beta (K^A,b)\) has at least \(\frac{\delta n}{2b}\) points from A.

Proof

We will show that, with high probability, the above statement holds for all k-sets K. Since \(K^A\) is one among these sets, the bound follows. Let K be any k-set and let \(\beta (K,b)\) be the induced partition. The expected number of points of A belonging to each bin is \(\frac{n}{2b}\). Using the Chernoff bound (1c), applied with \(1-\delta \) in place of \(\delta \), for \(0<\delta <1\), the probability that a bin has fewer than \(\frac{\delta n}{2b}\) points of A is at most \(e^{-\frac{n(1-\delta )^2}{4b}}\). Since there are at most \(n^k\) sets K and b bins, the proof follows by applying the union bound.
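
To make the union bound explicit (a short check, assuming \(\log \) in the choice of b denotes the natural logarithm): with \(b=\frac{n(1-\delta )^2}{4(k+2)\log n}\), the total failure probability is at most

$$n^k\cdot b\cdot e^{-\frac{n(1-\delta )^2}{4b}}= n^k\cdot b\cdot e^{-(k+2)\log n}= \frac{b}{n^{2}}\le \frac{1}{n},$$

since \(b\le n\).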

Proof

(Proof of Theorem 2). Let \(A\subset \sigma \) be a subset of size \(\left\lceil \frac{n}{2} \right\rceil \) chosen uniformly at random, and let \(\beta (K^A,b)\) be the partition of \(\sigma \) induced by \(K^A\). Let \(d_{j}^{\min }\) and \(d_{j}^{\max }\) denote the minimum and maximum distances, respectively, of the points of bin \(B_j\) to \(K^A\). From Lemma 7, with probability at least \(1-\frac{1}{n}\), every bin of \(\beta (K^A,b)\) has at least \(\frac{\delta n}{2b}\) points of A. Therefore,

$$\begin{aligned} \sum _{x\in A}d(x,K^A)\ge \sum _{j=1}^{b}\frac{\delta n}{2b}d_{j}^{\min }\ge \frac{\delta }{2}\sum _{j=1}^{b-1}\frac{n}{b}d_{j}^{\max }. \end{aligned}$$

The last inequality follows from the fact that \(d_{j}^{\min }\ge d_{j-1}^{\max }\). For each j, since \(d_{j}^{\max }\) is the maximum assigned distance, \(\frac{n}{b}d_{j}^{\max }\ge \sum _{y\in B_j}d(y,K^A)\). Finally, we note that \(d_{b}^{\max }\le \varDelta \), the diameter. Combining,

$$\begin{aligned} \sum _{x\in A}d(x,K^A)&\ge \frac{\delta }{2}\sum _{j=1}^{b-1}\sum _{y\in B_j}d(y,K^A) + \frac{\delta }{2}\sum _{y\in B_b}d(y,K^A)-\frac{\delta n}{2b}\varDelta \\&=\frac{\delta }{2}\sum _{j=1}^{b}\sum _{y\in B_j}d(y,K^A)-\frac{\delta n}{2b}\varDelta \\&=\frac{\delta }{2}\sum _{y\in \sigma }d(y,K^A)-\frac{\delta n}{2b}\varDelta . \end{aligned}$$

Therefore, with probability at least \(1-\frac{1}{n}\),

$$\begin{aligned} \mathrm {medavg}(A)&\ge \frac{\delta }{n}\sum _{y\in \sigma }d(y,K^A)-\frac{\delta \varDelta }{b} \ge \delta \,\mathrm {medavg}(\sigma )-\frac{\delta \varDelta }{b}. \end{aligned}$$

Hence,

$$\begin{aligned} \mathrm {medavg}_{\mathbb {E}}\Big (\left\lceil \frac{n}{2} \right\rceil \Big )\ge \Big (1-\frac{1}{n}\Big )\Big (\delta \,\mathrm {medavg}(\sigma )-\frac{\delta \varDelta }{b}\Big ) + \frac{1}{n}\cdot 0. \end{aligned}$$

From Lemma 6 and the choice of \(b=\frac{n(1-\delta )^2}{4(k+2)\log n}\),

$$\begin{aligned} \mathbb {E}[w_{\mathcal {A}}(\sigma ,\mathcal {K})]&\ge \left\lceil \frac{n}{2} \right\rceil \mathrm {medavg}_{\mathbb {E}}\Big (\left\lceil \frac{n}{2} \right\rceil \Big ) \ge \left\lceil \frac{n}{2} \right\rceil \Big (1-\frac{1}{n}\Big )\Big (\delta \,\mathrm {medavg}(\sigma )-\frac{\delta \varDelta }{b}\Big )\\&\ge \frac{n-1}{2}\Big (\delta \,\mathrm {medavg}(\sigma )-\frac{\delta \varDelta }{b}\Big ) \\&=\frac{n-1}{2}\delta \,\mathrm {medavg}(\sigma )-\frac{2\delta }{(1-\delta )^2}(k+2)\varDelta \log n. \end{aligned}$$

This completes the proof.

5 Conclusion

In this paper, we presented and analyzed a simple k-median based algorithm for the stochastic k-server problem. Our result is based on proving a new and sharper approximation bound on the k-median cost of a large random subset of a point set with respect to the k-median of the entire point set.

In the random arrival model, the cost of serving the next request is lower bounded by the average k-median cost of the requests that have not yet been processed. Clearly, the k servers cannot always be in the optimal k-median configuration even for the best online algorithm. Therefore, it is conceivable that one can prove a stronger lower bound. In particular, can we prove a stronger lower or upper bound and reduce the additive error from \(O(k\varDelta \log n)\) to \(O(k\varDelta )\) (completely independent of n)?