Kronos: towards bus contention-aware job scheduling in warehouse scale computers

  • Research Article
  • Frontiers of Computer Science

Abstract

While researchers have proposed many techniques to mitigate contention on the shared cache and memory bandwidth, none of them has considered the memory bus contention caused by split locks. Our study shows that a split lock may cause 9× longer data access latency without saturating the memory bandwidth. To minimize the impact of split locks, we propose Kronos, a runtime system composed of an online bus contention tolerance meter and a bus contention-aware job scheduler. The meter characterizes the tolerance of jobs to the “pressure” of bus contention and builds a tolerance model using polynomial regression. The job scheduler allocates user jobs to physical nodes in a contention-aware manner. We design three scheduling policies that, respectively, (1) minimize the number of required nodes while ensuring the Service Level Agreement (SLA) of all user jobs, (2) minimize the number of jobs that suffer SLA violations when there are not enough nodes, and (3) maximize overall performance without considering SLA violations. With these three policies, Kronos reduces the number of required nodes by 42.1% while ensuring the SLA of all jobs, reduces the number of jobs that suffer SLA violations when nodes are insufficient by 72.8%, and improves overall performance by 35.2% when the SLA is not considered.
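The two components named in the abstract, a polynomial-regression tolerance model and a contention-aware placement step, can be illustrated with a small sketch. This is not the authors' implementation. The `Job` fields, the function names, the assumption that bus pressure from co-located jobs simply adds up, and the first-fit heuristic are all hypothetical choices made for illustration. The fit is written in plain Python so the block is self-contained; in practice a library routine such as `numpy.polyfit` would be the usual choice.

```python
from dataclasses import dataclass
from typing import List, Sequence


def fit_polynomial(xs: Sequence[float], ys: Sequence[float], degree: int) -> List[float]:
    """Least-squares polynomial fit; returns c with y ~ c[0] + c[1]*x + c[2]*x**2 + ...

    Assumes more distinct sample points than `degree`, so the system is solvable.
    """
    n = degree + 1
    # Normal equations (V^T V) c = V^T y, where V is the Vandermonde matrix of xs.
    a = [[float(sum(x ** (i + j) for x in xs)) for j in range(n)] for i in range(n)]
    b = [float(sum(y * x ** i for x, y in zip(xs, ys))) for i in range(n)]
    # Forward elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(a[i][k] * coeffs[k] for k in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs


def predict(coeffs: Sequence[float], x: float) -> float:
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** i for i, c in enumerate(coeffs))


@dataclass
class Job:
    name: str
    pressure: float       # bus pressure the job generates (hypothetical unit)
    coeffs: List[float]   # tolerance model: slowdown as a function of external pressure
    max_slowdown: float   # SLA expressed as the largest acceptable slowdown


def node_is_safe(jobs: Sequence[Job]) -> bool:
    """True if every job on the node meets its SLA under the additive-pressure assumption."""
    total = sum(j.pressure for j in jobs)
    return all(predict(j.coeffs, total - j.pressure) <= j.max_slowdown for j in jobs)


def first_fit(jobs: Sequence[Job]) -> List[List[Job]]:
    """Place each job on the first node that stays SLA-safe; open a new node otherwise."""
    nodes: List[List[Job]] = []
    for job in jobs:
        for node in nodes:
            if node_is_safe(node + [job]):
                node.append(job)
                break
        else:
            nodes.append([job])
    return nodes


if __name__ == "__main__":
    # Hypothetical profiling samples: slowdown measured at increasing bus pressure.
    model = fit_polynomial([0, 1, 2, 3, 4], [1.0, 1.1, 1.4, 1.9, 2.6], degree=2)
    jobs = [Job("a", 2.0, model, 1.5), Job("b", 2.0, model, 1.5), Job("c", 1.0, model, 1.05)]
    for i, node in enumerate(first_fit(jobs)):
        print(i, [j.name for j in node])
```

First-fit is only a stand-in for the first policy (fewest nodes with every SLA met). The other two policies optimize different objectives and would need a different placement routine.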

Acknowledgements

This work was partially sponsored by the National R&D Program of China (2018YFB1004800), the National Natural Science Foundation of China (Grant Nos. 62022057, 61632017, 61832006) and Alibaba Group. Quan Chen and Minyi Guo are the corresponding authors. We thank Chao Qian for his collaborative effort during data collection. We also thank the anonymous reviewers who provided helpful comments on earlier drafts of the manuscript.

Author information

Corresponding author

Correspondence to Quan Chen.

Additional information

Shuai Xue received his BSc degree from Northwestern Polytechnical University, China. He is currently an MSc student in the field of computer science under the supervision of Dr. Quan Chen in the Department of Computer Engineering Faculty of Shanghai Jiao Tong University, China. His research interests include high-performance computing and resource management in datacenters.

Shang Zhao received his BSc degree from Shanghai Jiao Tong University, China. He is currently a graduate student in the field of computer science under the supervision of Dr. Quan Chen in the Department of Computer Science and Engineering of Shanghai Jiao Tong University, China. His research interests include cloud computing and operating systems in datacenters.

Quan Chen is a tenure-track associate professor in the Department of Computer Science and Engineering, Shanghai Jiao Tong University, China. His research interests include high-performance computing, task scheduling in various architectures, resource management in datacenters, runtime systems and operating systems. He received his PhD degree in June 2014 from the Department of Computer Science and Engineering, Shanghai Jiao Tong University, China.

Zhuo Song received his BS and MS degrees from Wuhan University, China. He is a Staff Engineer at Alibaba, China, and works on network optimization and development, including but not limited to drivers, the network stack and related subsystems, containers and kernel bypass technologies. He now mainly focuses on software and hardware co-design technologies for cloud and OS architecture.

Shanpei Chen received his BS degree in network engineering from Dalian University of Technology, China in 2010, and his Master’s degree in computer science from Peking University, China in 2013. He now works for Alibaba, China, and focuses on system software technology, including schedulers, resource isolation, performance evaluation and optimization.

Tao Ma received his MS degree from China University of Geosciences, China. He is a Principal Engineer at Alibaba, China, and has more than 15 years of software development experience in operating systems, computer architecture and database systems. His major focus is on operating systems for data centers, cloud infrastructure, cloud native systems and high-performance computing.

Yong Yang received his BS degree from Inner Mongolia University of Technology, China. He is a Senior Staff Engineer at Alibaba, China, and has extensive development experience in the system software, server and storage industries. Currently, his major focus is on infrastructure software development, which includes server operating system and hardware appliance development in the cloud. He also has a special interest in system performance and QoS in data centers and the cloud.

Wenli Zheng is an assistant professor with the Computer Science and Engineering Department of Shanghai Jiao Tong University, China. He received his PhD degree from The Ohio State University, USA in 2016, and has been working in the area of computer systems since 2012. Specifically, his research interests include parallel and distributed computing, as well as power management of computer systems.

Minyi Guo received his PhD degree in computer science from the University of Tsukuba, Japan. He is currently a Zhiyuan Chair Professor and head of the Department of Computer Science and Engineering, Shanghai Jiao Tong University, China. His present research interests include parallel/distributed computing, compiler optimizations, embedded systems, pervasive computing, big data and cloud computing. He is now on the editorial boards of IEEE Transactions on Parallel and Distributed Systems, IEEE Transactions on Cloud Computing and Journal of Parallel and Distributed Computing. Dr. Guo is a fellow of IEEE and a fellow of CCF.

Cite this article

Xue, S., Zhao, S., Chen, Q. et al. Kronos: towards bus contention-aware job scheduling in warehouse scale computers. Front. Comput. Sci. 17, 171101 (2023). https://doi.org/10.1007/s11704-021-0418-5

  • DOI: https://doi.org/10.1007/s11704-021-0418-5
