Fast and isolation guaranteed coflow scheduling via traffic forecasting in multi-tenant environment

The Journal of Supercomputing

Abstract

Achieving the minimum average CCT (coflow completion time) while providing isolation guarantees in multi-tenant datacenters is challenging when coflow sizes are not known in advance. State-of-the-art solutions focus either on minimizing the average CCT or on providing optimal isolation guarantees, because the two objectives conflict. We therefore propose FIGCS-TF (Fast and Isolation Guarantees Coflow Scheduling via Traffic Forecasting), a coflow scheduling algorithm that requires no prior knowledge of coflow sizes. FIGCS-TF uses a lightweight forecasting module to predict the relative scheduling priority of coflows. For bandwidth allocation it employs the MDRF (monopolistic dominant resource fairness) strategy, which is based on super-coflows and helps achieve long-term isolation. In trace-driven simulations, FIGCS-TF completes communication stages 1.12×, 1.99×, and 5.50× faster than DRF (Dominant Resource Fairness), NCDRF (Non-Clairvoyant Dominant Resource Fairness), and Per-Flow Fairness, respectively. Compared with the theoretically minimum CCT, FIGCS-TF incurs only a 46% increase in average CCT at the top 95th percentile of the dataset. Overall, FIGCS-TF reduces the average CCT more than the other algorithms evaluated.
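To make the DRF baseline named in the abstract concrete, the following is a minimal Python sketch of the commonly used equal-progress formulation of DRF for coflows whose demands on different links are correlated. It is not FIGCS-TF's MDRF strategy, whose details are not given in this preview. The function name `drf_allocate`, the `demands` layout (one demand per link for each coflow), and the uniform link `capacity` are assumptions made purely for illustration.

```python
from typing import Dict, List


def drf_allocate(demands: Dict[str, List[float]],
                 capacity: float = 1.0) -> Dict[str, List[float]]:
    """Equal-progress DRF sketch (hypothetical helper, not from the paper).

    demands[name][i] is the demand of coflow `name` on link i.
    Every link is assumed to have the same capacity.
    """
    if not demands:
        return {}

    # Normalize each demand vector by its largest entry, so the
    # dominant (most heavily used) link of every coflow has value 1.
    normalized: Dict[str, List[float]] = {}
    for name, vector in demands.items():
        peak = max(vector)
        if peak <= 0:
            raise ValueError(f"coflow {name!r} has no positive demand")
        normalized[name] = [x / peak for x in vector]

    num_links = len(next(iter(normalized.values())))

    # Load each link would carry if every coflow had dominant share 1.
    load = [sum(v[i] for v in normalized.values()) for i in range(num_links)]

    # All coflows receive the same progress, limited by the busiest link.
    progress = capacity / max(load)

    # Bandwidth on each link is the shared progress times normalized demand.
    return {name: [progress * x for x in v] for name, v in normalized.items()}


if __name__ == "__main__":
    shares = drf_allocate({"A": [0.5, 1.0], "B": [1.0, 1.0 / 6.0]})
    for name, share in shares.items():
        print(name, [round(s, 3) for s in share])
```

In this two-link example both coflows receive the same dominant share, and the first link is fully used while the second keeps spare capacity. That spare capacity is the kind of inefficiency that schedulers aiming at both isolation and low CCT try to reclaim.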

Data Availability

No datasets were generated or analysed during the current study.

Funding

This study was supported by the Natural Science Foundation of Shandong Province, China (Grant Nos. ZR2022QF143 and ZR2021QF130), the Shaanxi Key Laboratory of Information Communication Network and Security (Xi’an University of Posts and Telecommunications) open project (Grant No. ICNS202202), the Hubei Key Laboratory of Intelligent Robot (Wuhan Institute of Technology) open project (Grant No. HBIR202201), and the Wuhan knowledge innovation special project (Grant No. 30106230186).

Author information

Contributions

C.L. wrote the main manuscript text and designed the model. H.Z., Y.F. and S.H. collected the data and performed the analysis.

Corresponding author

Correspondence to Chenghao Li.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Li, C., Zhang, H., Yang, F. et al. Fast and isolation guaranteed coflow scheduling via traffic forecasting in multi-tenant environment. J Supercomput 80, 26726–26750 (2024). https://doi.org/10.1007/s11227-024-06457-3

  • DOI: https://doi.org/10.1007/s11227-024-06457-3
