Efficient scheduling for multi-stage coflows

  • Regular Paper
CCF Transactions on Networking

Abstract

In data center networks (DCNs), the large-scale flows produced by parallel computing frameworks are semantically grouped into coflows. Most inter-coflow schedulers consider only the remaining data of each coflow and attempt to mimic Shortest Job First (SJF). However, a coflow may consist of multiple stages, and the amount of data it transmits differs from stage to stage. In this paper, we consider the Multi-stage Inter-Coflow Scheduling problem and seek an efficient online scheduling scheme. We first explore a short-sighted greedy algorithm, IAO, which gives us insight into how to utilize network resources. Based on that insight, we propose a far-sighted heuristic, MLBF, which schedules sub-coflows to occupy network bandwidth in turn. Furthermore, we remove the bijection assumption and propose a new practical heuristic, MPLBF. Through simulations in various network environments, we show that, compared to Varys, a state-of-the-art scheduler, a multi-stage-aware scheduler can reduce the coflow completion time by up to 4.81 \(\times\) even when it is short-sighted. Moreover, the far-sighted scheduler MLBF achieves a reduction of nearly 7.95 \(\times\), and MPLBF achieves a reduction of up to 8.03 \(\times\).
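The motivation can be illustrated with a toy sketch. It is not the paper's IAO, MLBF, or MPLBF; it only contrasts an SJF-like rule that looks at the current stage with one that looks at all remaining stages. The model is an assumption made for illustration: a single link of unit capacity, all coflows present at time 0, stages sent one after another without preemption, and no computation time between stages. The names `average_cct`, `stage_blind`, and `stage_aware` are hypothetical.

```python
from typing import Callable, Dict, List


def average_cct(coflows: Dict[str, List[float]],
                priority: Callable[[List[float]], float]) -> float:
    """Average coflow completion time on one unit-capacity link.

    Each coflow is a non-empty list of per-stage data sizes. At every
    step the coflow with the smallest priority value sends its whole
    current stage, then the choice is re-evaluated.
    """
    remaining = {name: list(stages) for name, stages in coflows.items()}
    clock = 0.0
    finished: Dict[str, float] = {}
    while remaining:
        # Pick the coflow that the priority rule ranks first.
        name = min(remaining, key=lambda n: priority(remaining[n]))
        # Transmit its current stage; time advances by the stage size.
        clock += remaining[name].pop(0)
        if not remaining[name]:
            finished[name] = clock
            del remaining[name]
    return sum(finished.values()) / len(finished)


def stage_blind(stages: List[float]) -> float:
    """SJF-like rule that sees only the current stage's data."""
    return stages[0]


def stage_aware(stages: List[float]) -> float:
    """SJF-like rule that sees the data of all remaining stages."""
    return sum(stages)


# Hypothetical workload: A looks small at first but has a large second stage.
coflows = {"A": [1, 9], "B": [2, 1], "C": [3, 1]}
blind = average_cct(coflows, stage_blind)
aware = average_cct(coflows, stage_aware)
print(f"stage-blind average CCT: {blind:.2f}")
print(f"stage-aware average CCT: {aware:.2f}")
```

In this workload the stage-blind rule serves coflow A's small first stage before anything else, which delays both B and C. The stage-aware rule defers A entirely, so the average completion time is lower.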

Acknowledgements

This work was supported in part by the National Key R&D Program of China (2017YFB1001801), NSFC (61872175), the NSF of Jiangsu Province (BK20181252), the CCF-Tencent Open Fund, and the Collaborative Innovation Center of Novel Software Technology and Industrialization. On behalf of all authors, the corresponding author states that there is no conflict of interest.

Author information

Corresponding author

Correspondence to Sheng Zhang.

About this article

Cite this article

Zhang, S., Zhang, S., Qian, Z. et al. Efficient scheduling for multi-stage coflows. CCF Trans. Netw. 2, 83–97 (2019). https://doi.org/10.1007/s42045-019-00018-6

  • DOI: https://doi.org/10.1007/s42045-019-00018-6
