On Scheduling Coflows

Ahmadi, Saba; Khuller, Samir; Purohit, Manish; Yang, Sheng

doi:10.1007/s00453-020-00741-3

Published: 15 July 2020

Volume 82, pages 3604–3629, (2020)
Algorithmica

Saba Ahmadi (ORCID: orcid.org/0000-0002-1046-9940)¹, Samir Khuller³, Manish Purohit² & Sheng Yang¹

Abstract

Applications designed for data-parallel computation frameworks such as MapReduce usually alternate between computation and communication stages. Coflow scheduling is a popular networking abstraction recently introduced to capture such application-level communication patterns in datacenters. In this framework, a datacenter is modeled as a single non-blocking switch with m input ports and m output ports. A coflow j is a collection of flow demands $\{d^j_{io}\}_{i \in \{1,\ldots ,m\}, o \in \{1,\ldots ,m\}}$ that is said to be complete once all of its requisite flows have been scheduled. We consider the offline coflow scheduling problem with and without release times to minimize the total weighted completion time. Coflow scheduling generalizes the well-studied concurrent open shop scheduling problem and is thus NP-hard. Qiu et al. (in: ACM Symposium on parallelism in algorithms and architectures. ACM, New York, pp 294–303, 2015) obtain the first constant-factor approximation algorithms for this problem via LP rounding and give a deterministic $\frac{67}{3}$-approximation and a randomized $(9 + \frac{16\sqrt{2}}{3}) \approx 16.54$-approximation algorithm. In this paper, we give a combinatorial algorithm that yields a deterministic 5-approximation for coflow scheduling with release times, and a deterministic 4-approximation for the case without release times. For the concurrent open shop problem with release times, we give a combinatorial 3-approximation algorithm.

References

1. Apache Software Foundation: Hadoop. https://hadoop.apache.org
2. Bansal, N., Khot, S.: Inapproximability of hypergraph vertex cover and applications to scheduling problems. In: International Colloquium on Automata, Languages and Programming, pp. 250–261. Springer (2010)
3. Chen, Z.L., Hall, N.G.: Supply chain scheduling: conflict and cooperation in assembly systems. Oper. Res. 55(6), 1072–1089 (2007)
4. Chowdhury, M., Stoica, I.: Coflow: A networking abstraction for cluster applications. In: ACM Workshop on Hot Topics in Networks, pp. 31–36. ACM (2012)
5. Chowdhury, M., Stoica, I.: Efficient coflow scheduling without prior knowledge. In: SIGCOMM, pp. 393–406. ACM (2015)
6. Chowdhury, M., Zhong, Y., Stoica, I.: Efficient coflow scheduling with Varys. In: SIGCOMM, SIGCOMM ’14, pp. 443–454. ACM, New York, NY, USA (2014)
7. Davis, J.M., Gandhi, R., Kothari, V.H.: Combinatorial algorithms for minimizing the weighted sum of completion times on a single machine. Oper. Res. Lett. 41(2), 121–125 (2013)
8. Dean, J., Ghemawat, S.: MapReduce: simplified data processing on large clusters. Commun. ACM 51(1), 107–113 (2008)
9. Garg, N., Kumar, A., Pandit, V.: Order scheduling models: Hardness and algorithms. In: FSTTCS, pp. 96–107. Springer (2007)
10. Im, S., Moseley, B., Pruhs, K., Purohit, M.: Matroid coflow scheduling. In: International Colloquium on Automata, Languages and Programming (2019)
11. Khuller, S., Li, J., Sturmfels, P., Sun, K., Venkat, P.: Select and permute: an improved online framework for scheduling to minimize weighted completion time. Theor. Comput. Sci. 795, 420–431 (2019)
12. Khuller, S., Purohit, M.: Brief announcement: Improved approximation algorithms for scheduling co-flows. In: Proceedings of the 28th ACM Symposium on Parallelism in Algorithms and Architectures, pp. 239–240. ACM, New York, NY, USA (2016)
13. Leung, J.Y.T., Li, H., Pinedo, M.: Scheduling orders for multiple product types to minimize total weighted completion time. Discrete Appl. Math. 155(8), 945–970 (2007)
14. Luo, S., Yu, H., Zhao, Y., Wang, S., Yu, S., Li, L.: Towards practical and near-optimal coflow scheduling for data center networks. IEEE Trans. Parallel Distrib. Syst. 27(11), 3366–3380 (2016)
15. Mastrolilli, M., Queyranne, M., Schulz, A.S., Svensson, O., Uhan, N.A.: Minimizing the sum of weighted completion times in a concurrent open shop. Oper. Res. Lett. 38(5), 390–395 (2010)
16. Qiu, Z., Stein, C., Zhong, Y.: Minimizing the total weighted completion time of coflows in datacenter networks. In: ACM Symposium on Parallelism in Algorithms and Architectures, pp. 294–303. ACM, New York, NY, USA (2015)
17. Queyranne, M.: Structure of a simple scheduling polyhedron. Math. Program. 58(1–3), 263–285 (1993)
18. Sachdeva, S., Saket, R.: Optimal inapproximability for scheduling problems via structural hardness for hypergraph vertex cover. In: IEEE Conference on Computational Complexity, pp. 219–229. IEEE (2013)
19. Shafiee, M., Ghaderi, J.: An improved bound for minimizing the total weighted completion time of coflows in datacenters. IEEE/ACM Trans. Netw. 26(4), 1674–1687 (2018)
20. Wang, G., Cheng, T.E.: Customer order scheduling to minimize total weighted completion time. Omega 35(5), 623–626 (2007)
21. Zaharia, M., Chowdhury, M., Franklin, M.J., Shenker, S., Stoica, I.: Spark: cluster computing with working sets. HotCloud 10, 10–10 (2010)
22. Zhao, Y., Chen, K., Bai, W., Yu, M., Tian, C., Geng, Y., Zhang, Y., Li, D., Wang, S.: Rapier: Integrating routing and scheduling for coflow-aware data center networks. In: IEEE International Conference on Computer Communications, pp. 424–432. IEEE (2015)

Acknowledgements

This work is supported by NSF Grants CNS 156019 and CCF 1655073 (Eager), and partially supported by an Amazon Grant.

Author information

Authors and Affiliations

University of Maryland, College Park, USA
Saba Ahmadi & Sheng Yang
Google, Mountain View, USA
Manish Purohit
Northwestern University, Evanston, USA
Samir Khuller

Corresponding author

Correspondence to Saba Ahmadi.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

A preliminary version of this paper appears in Proceedings of IPCO 2017.

Appendices

Appendix 1: A Combinatorial 3-Approximation Algorithm for Concurrent Open Shop with Release Times

Theorem 1

Algorithm 1 gives a 3-approximation for concurrent open shop scheduling with release times.

Proof

We use Algorithm 1 to obtain a permutation $\{1,2,\ldots ,n\}$ of the set of jobs J. If we schedule the jobs sequentially according to this permutation, we get:

$$\begin{aligned} C_{j} \le \max _{i'\le j}r_{i'} + \sum _{k\le j} L_{\mu (j),k} \end{aligned}$$

Lemma 5 with $a=1$ and $b=1$ implies that:

$$\begin{aligned} \sum _j w_j C_j(alg)&\le \left( 1+\frac{1}{\kappa }\right) \sum _{j=1}^n \sum _{i\in M} \alpha _{i,j} r_{j} + 2(\kappa + 1) \sum _{i\in M} \sum _{S\subseteq J} \beta _{i,S} f_i(S) \end{aligned}$$

To minimize the approximation ratio, we substitute $\kappa = \frac{1}{2}$ and obtain

$$\begin{aligned} \sum _j w_j C_j(alg)&\le 3 \left( \sum _{j=1}^n \sum _{i\in M} \alpha _{i,j} r_{j} + \sum _{i\in M} \sum _{S\subseteq J} \beta _{i,S} f_i(S) \right) \le 3 \cdot OPT \end{aligned}$$

where the last inequality follows from weak duality as $\alpha$ and $\beta$ constitute a feasible dual solution. $\square$
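The choice of $\kappa$ can be checked by hand. Assuming $\kappa > 0$, the coefficient $1+\frac{1}{\kappa}$ decreases in $\kappa$ while $2(\kappa + 1)$ increases, so the larger of the two is smallest where they coincide:

$$\begin{aligned} 1 + \frac{1}{\kappa} = 2(\kappa + 1)&\iff 2\kappa ^2 + \kappa - 1 = 0 \iff (2\kappa - 1)(\kappa + 1) = 0, \\ \kappa = \tfrac{1}{2}&\implies 1 + \frac{1}{\kappa } = 3 = 2(\kappa + 1). \end{aligned}$$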

Appendix 2: Correction of Algorithm by Qiu et al. [16]

We now give a brief overview of the approximation algorithm given by Qiu, Stein, and Zhong [16].

1.1 Interval-Indexed LP Formulation

In the first step we write an interval-indexed linear programming relaxation for the coflow scheduling problem similar to that for the concurrent open shop problem by Wang and Cheng [20].

Let $\bar{C_j}$ denote the approximated completion time of coflow j obtained by an optimal feasible solution to this LP relaxation. We first order the coflows in non-decreasing order of these approximated completion times, i.e. we have the following.

$$\begin{aligned} \bar{C_1} \le \bar{C_2} \le \cdots \le \bar{C_n} \end{aligned} \tag{19}$$

Let $V_j$ denote the maximum load on any port by the first j coflows taken together in the above ordering, i.e.

$$\begin{aligned} V_j = \max \left[ \max _i \left\{ \sum _{k = 1}^j \sum _{o} d^k_{io}\right\} , \max _o \left\{ \sum _{k = 1}^j \sum _{i} d^k_{io}\right\} \right] . \end{aligned}$$
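The definition of $V_j$ can be made concrete with a short sketch. It assumes each coflow is given as an $m \times m$ matrix of integer demands and that the list is already sorted as in the ordering above; the name `prefix_max_loads` is illustrative and does not come from [16].

```python
from typing import List, Sequence

# demand[i][o] = amount of data to send from input port i to output port o
Demand = Sequence[Sequence[int]]


def prefix_max_loads(coflows: Sequence[Demand]) -> List[int]:
    """Return [V_1, ..., V_n] for coflows listed in the LP ordering."""
    m = len(coflows[0])
    in_load = [0] * m   # cumulative load on each input port
    out_load = [0] * m  # cumulative load on each output port
    result = []
    for demand in coflows:
        for i in range(m):
            for o in range(m):
                in_load[i] += demand[i][o]
                out_load[o] += demand[i][o]
        # V_j is the heaviest port load over the first j coflows.
        result.append(max(max(in_load), max(out_load)))
    return result
```

For instance, with two coflows on a $2 \times 2$ switch, `prefix_max_loads([[[1, 0], [0, 2]], [[0, 3], [0, 0]]])` evaluates to `[2, 5]`.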

Qiu et al. [16] prove that these $V_j$ values provide a good approximation for the optimal completion times of the coflows. In particular, they show the following where $C_j^*$ denotes the completion time of coflow j in an optimal schedule.

$$\begin{aligned} \sum _{j} w_j V_j \le \frac{16}{3} \sum _j w_j C_j^* \end{aligned} \tag{20}$$

1.2 Grouping Coflows

Divide time into geometrically increasing intervals as follows: $[1], [2], [3,4], [5,8], [9,16], \ldots$. Let $I_l = (2^{l-2}, 2^{l-1}]$ denote the $l^{\text {th}}$ interval.

Now group the coflows based on the interval where their V values lie and let $S_l$ denote the set of coflows assigned to interval $I_l$. In other words, for all coflows $j \in S_l$, we have $2^{l-2} < V_j \le 2^{l-1}$.
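The grouping rule can be sketched as follows. The helper names `interval_index` and `group_by_interval` are illustrative, and the sketch assumes every value exceeds $\frac{1}{2}$ so that it falls in some $I_l$ with $l \ge 1$.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def interval_index(value: float) -> int:
    """Return the l >= 1 with 2**(l-2) < value <= 2**(l-1)."""
    if value <= 0.5:
        raise ValueError("value must exceed 1/2 to lie in some I_l with l >= 1")
    l = 1
    while value > 2 ** (l - 1):
        l += 1
    return l


def group_by_interval(values: Sequence[float]) -> Dict[int, List[int]]:
    """Map each interval index l to the (1-based) coflows assigned to S_l."""
    groups = defaultdict(list)
    for j, v in enumerate(values, start=1):
        groups[interval_index(v)].append(j)
    return dict(groups)
```

With values `[1, 2, 3, 4, 5, 8, 9]`, the coflows with values 3 and 4 share $S_3$ and those with values 5 and 8 share $S_4$, matching the intervals $[3,4]$ and $[5,8]$.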

1.2.1 Algorithm 1

For $l = 1, 2, \ldots$
- Wait until the last coflow in $S_l$ is released.
- Group all coflows in $S_l$ and schedule as per Algorithm 1 in [16]. This would take time at most $V_k \le 2^{l-1}$ where k is the last job in the group.

1.2.2 Analysis

Qiu et al. claim the following (Proposition 1 in [16]).

Proposition 1

For any coflow j, let $C_j(alg)$ denote the completion time of coflow j as per Algorithm 1. Then we have

$$\begin{aligned} C_j(alg) \le \max _{1 \le g \le j}\{r_g\} + 4 V_j. \end{aligned}$$

Since $\displaystyle C_j^* \ge \max _{1 \le g \le j}\{r_g\}$, Proposition 1 and Equation (20) together imply the following theorem (Theorem 1 in [16]).

Theorem 2

There exists a deterministic polynomial time $\frac{67}{3}$-approximation algorithm for coflow scheduling, i.e.

$$\begin{aligned} \sum _{j} w_j C_j(alg) \le \frac{67}{3} \sum _j w_j C_j^*. \end{aligned}$$
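The claimed constant can be traced in two steps, using only Proposition 1, the bound $C_j^* \ge \max _{1 \le g \le j}\{r_g\}$, and Equation (20):

$$\begin{aligned} \sum _{j} w_j C_j(alg)&\le \sum _j w_j \max _{1 \le g \le j}\{r_g\} + 4 \sum _j w_j V_j \\&\le \sum _j w_j C_j^* + 4 \cdot \frac{16}{3} \sum _j w_j C_j^* = \frac{67}{3} \sum _j w_j C_j^*. \end{aligned}$$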

1.3 Error

We now show that Proposition 1 stated above is incorrect. Consequently, Theorem 2 above (Theorem 1 in [16]) no longer holds. Recall that Algorithm 1 groups jobs based on their V values alone and does not consider their release times.

Consider a simple case where $m = 1$, so that we have just one input port and one output port. Suppose we have two jobs $j_1$ and $j_2$ such that $j_1$ needs to send 3 units of data and $j_2$ needs to send 1 unit of data, with $r_{j_1} = 0$ and $r_{j_2} = 100$. By definition, we have $V_{j_1} = 3$ and $V_{j_2} = 4$; note that both jobs belong to the same interval $I_3 = (2,4]$. Since both jobs belong to the same interval, Algorithm 1 waits for both jobs to be released and then schedules them together (after time 100). Proposition 1, however, would require $C_{j_1}(alg) \le r_{j_1} + 4V_{j_1} = 12$, so the claim clearly does not hold for job $j_1$.
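The instance can be replayed with a small simulation. The sketch below assumes a single port, positive integer demands, and that the jobs of a group are sent one after another in the given order; the function name `flawed_grouping_single_port` is illustrative and is not taken from [16].

```python
def flawed_grouping_single_port(sizes, releases):
    """Completion times when jobs are grouped by V alone on one port (m = 1).

    sizes[j] and releases[j] are the demand and release time of the j-th job
    in the ordering. Returns (V values, completion times).
    """
    # With a single port, V_j is simply the prefix sum of the demands.
    v, total = [], 0
    for s in sizes:
        total += s
        v.append(total)

    def interval_index(value):
        # The l with 2**(l-2) < value <= 2**(l-1), for value >= 1.
        l = 1
        while value > 2 ** (l - 1):
            l += 1
        return l

    groups = {}
    for j, vj in enumerate(v):
        groups.setdefault(interval_index(vj), []).append(j)

    completion = [0] * len(sizes)
    clock = 0
    for l in sorted(groups):
        members = groups[l]
        # Wait until every job of the group has been released.
        clock = max(clock, max(releases[j] for j in members))
        for j in members:
            clock += sizes[j]
            completion[j] = clock
    return v, completion


v, completion = flawed_grouping_single_port([3, 1], [0, 100])
bound_j1 = 0 + 4 * v[0]  # right-hand side of Proposition 1 for j_1
print(completion, bound_j1)  # prints: [103, 104] 12
```

Job $j_1$ finishes at time 103 under this reading of the algorithm, well above the bound of 12.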

Proposition 2 in [16] makes a similar claim for a grouping algorithm using randomized intervals. Again, the above instance serves as a counterexample to the claim. Consequently, Theorem 2 in [16] does not hold.

In the following section, we show that the deterministic grouping algorithm can be modified to yield a $\frac{76}{3}$-approximation algorithm. Note that this is worse than the $\frac{67}{3}$ factor claimed earlier. It is not immediately clear whether the randomized algorithm from [16] can be corrected via a similar modification.

1.4 Corrected Grouping Algorithm

We first solve the interval-indexed LP formulation to obtain approximated completion times $\bar{C_j}$. Without loss of generality, we assume that the coflows are ordered as per Equation (19).

As shown by Leung, Li, and Pinedo (Theorem 13 in [13]), the analysis of Wang and Cheng [20] can be extended to the case of general release times to obtain the following.

$$\begin{aligned} \sum _{j} w_j\bar{C_j} \le \frac{19}{3} \sum _{j} w_jC_j^* \end{aligned} \tag{21}$$

This is analogous to Lemma 3 in [16] that shows that $\sum _j w_j V_j \le \frac{16}{3} \sum _j w_j C_j^*$ where $V_j$ is the maximum load on any port by the first j coflows taken together (as per the ordering).

Since $\bar{C_j}$ denotes the approximated completion time of coflow j as computed by the valid LP relaxation, we also have the following, where $r_j$ denotes the release time of coflow j.

$$\begin{aligned} \bar{C_j} \ge r_j \end{aligned} \tag{22}$$

$$\begin{aligned} \bar{C_j} \ge V_j \end{aligned} \tag{23}$$

1.4.1 Algorithm

Divide time into geometrically increasing intervals as follows: $[1], [2], [3,4], [5,8], [9,16], \ldots$. Let $I_l = (2^{l-2}, 2^{l-1}]$ denote the $l^{\text {th}}$ interval.

Now group the coflows based on the interval where their ${\bar{C}}$ values lie and let $S_l$ denote the set of coflows assigned to interval $I_l$. So for all coflows $j \in S_l$, we have $2^{l-2} < \bar{C_j} \le 2^{l-1}$.

1.4.2 Algorithm

For $l = 1, 2, \ldots$
- Wait until the last coflow in $S_l$ is released and all coflows in $S_{l-1}$ have finished (whichever is later).
- Group all coflows in $S_l$ and schedule as per Algorithm 1 in [16]. This would take time at most $V_k \le 2^{l-1}$ where k is the last job in the group.
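The timing of the corrected algorithm can be sketched as below. The sketch does not build the actual schedule inside a group; it only tracks when each group starts and an upper bound on when it finishes, taking the $V_k$ bound on the group's duration as given. The function name and argument names are illustrative, and the inputs are assumed to be sorted by $\bar{C_j}$ with $\bar{C_j} \ge 1$, $\bar{C_j} \ge r_j$ and $\bar{C_j} \ge V_j$.

```python
from typing import Dict, List, Sequence, Tuple


def corrected_group_times(
    cbar: Sequence[float], release: Sequence[float], v: Sequence[float]
) -> Dict[int, Tuple[float, float]]:
    """Map each non-empty group index l to (start time, finish-time bound)."""
    groups: Dict[int, List[int]] = {}
    for j, c in enumerate(cbar):
        # Find l with 2**(l-2) < cbar[j] <= 2**(l-1).
        l = 1
        while c > 2 ** (l - 1):
            l += 1
        groups.setdefault(l, []).append(j)

    times: Dict[int, Tuple[float, float]] = {}
    prev_finish = 0.0
    for l in sorted(groups):
        members = groups[l]
        # Start once the previous group is done and the whole group is released.
        start = max(prev_finish, max(release[j] for j in members))
        # The group takes at most V_k, where k is its last coflow.
        finish = start + v[members[-1]]
        times[l] = (start, finish)
        prev_finish = finish
    return times
```

On the hypothetical input `cbar = [1, 2, 3, 4, 7]`, `release = [0, 1, 2, 3, 5]`, `v = [1, 2, 3, 4, 6]`, the groups finish by times 1, 3, 7 and 13, each within $2^l$ for its group index $l$.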

1.4.3 Analysis

Let $\tilde{C_l}$ denote the time by which all coflows in $S_{l}$ have been scheduled by the above algorithm.

Claim 1

$\tilde{C_l} \le 2 \times 2^{l-1} = 2^l$ for every group $S_l$.

Proof

We prove by induction. For group $S_1$, we start executing the schedule at $\max _{j \in S_1} r_j \le \max _{j \in S_1} \bar{C_j} \le 2^{1-1} = 1$ and the schedule takes time at most $V_k \le 2^{1-1} = 1$ where k is the last coflow in the group. So the base case is true.

Now assume that the claim is true for some group $S_l$. As per the algorithm, the coflows in group $S_{l+1}$ start executing at $\tilde{C_l}$ or $\max _{j \in S_{l+1}} r_j$, whichever is later. By induction, we are guaranteed that $\tilde{C_l} \le 2^l$. Also $\max _{j \in S_{l+1}} r_j \le \max _{j \in S_{l+1}} \bar{C_j} \le 2^l$. Thus the coflows in group $S_{l+1}$ start executing no later than time $2^l$. We know that all these coflows require at most $V_k \le \bar{C_k} \le 2^l$ time units to complete. As a result, all the coflows in this group are scheduled by time $2^l + 2^l = 2^{l+1}$.

And thus the claim follows by induction. $\square$

Claim 2

For any coflow j, let $C_j(alg)$ denote the completion time of coflow j as per the algorithm. Then $C_j(alg) < 4\bar{C_j}$.

Proof

Consider any coflow j, and let l be such that $j \in S_l$. Hence we have $\bar{C_j} > 2^{l-2}$. By the previous claim, we have

$$\begin{aligned} C_j(alg) \le \tilde{C_l} \le 2^l = 4 \times 2^{l-2} < 4\bar{C_j} \end{aligned}$$

$\square$

Corollary 2

There is a deterministic $\frac{76}{3}$-approximation for coflow scheduling with arbitrary release times.

Proof

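The preceding claim gives $C_j(alg) < 4\bar{C_j}$ for every coflow j. Combined with Equation (21), this yields $\sum _j w_j C_j(alg) \le 4 \sum _j w_j \bar{C_j} \le \frac{76}{3} \sum _j w_j C_j^*$, i.e., a $\frac{76}{3}$-approximation algorithm for coflow scheduling with release times. $\square$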

Appendix 3: Counterexample to Claim by Luo et al. [14]

Luo et al. [14] claim a 2-approximation algorithm for the coflow scheduling problem by proving that it is equivalent to concurrent open shop scheduling. One of the key ingredients of their proof is the following claim that is implicit in Lemma 3 in [14].

Claim 3

(Restated from [14]) Given two coflows $G_k$ and $G_l$, we can find a feasible schedule for both the coflows such that $C_k + C_l = \min \{\varDelta (G_k) + \varDelta (G_k \cup G_l), \varDelta (G_l) + \varDelta (G_k \cup G_l)\}$.

1.1 Counterexample

We show that Claim 3 is erroneous via a simple counterexample. Consider two coflows on a $3 \times 3$ datacenter as shown in Fig. 5. Note that while coflows $G_1$ and $G_2$ have $\varDelta (G_1) = 1$ and $\varDelta (G_2) = 2$, the combined coflow $G_1 \cup G_2$ also has $\varDelta (G_1 \cup G_2) = 2$. Consequently, the RHS in Claim 3 equals $\varDelta (G_1) + \varDelta (G_1 \cup G_2) = 3$.

On the other hand, as seen in Fig. 5, if coflow $G_1$ is scheduled so that $C_1 = \varDelta (G_1) = 1$, then the matching constraints force coflow $G_2$ to have completion time $C_2 = 3$. Alternatively, delaying one edge of coflow $G_1$ leads to a schedule with $C_1 = C_2 = 2$. In both cases, we have $C_1 + C_2 = 4$ (instead of 3), contradicting the claim.

About this article

Cite this article

Ahmadi, S., Khuller, S., Purohit, M. et al. On Scheduling Coflows. Algorithmica 82, 3604–3629 (2020). https://doi.org/10.1007/s00453-020-00741-3

Received: 05 October 2017
Accepted: 18 June 2020
Published: 15 July 2020
Issue Date: December 2020
DOI: https://doi.org/10.1007/s00453-020-00741-3
