Consistent Low-Latency Scheduling for Microsecond-Scale Tasks in Data Centers

Yu, Qiuyu; Zhang, Tong; Yi, Changyan

doi:10.1007/978-3-031-71470-2_9

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14999)

Included in the following conference series:

International Conference on Wireless Artificial Intelligent Computing Systems and Applications

Abstract

In large-scale data centers, many cloud applications with stringent latency requirements follow a partition-aggregate pattern. Individual jobs require responses from thousands of software services, so the tail latency of each participating task must stay within tens to hundreds of microseconds for the application to respond quickly to user operations. To meet this requirement, researchers have proposed a series of microsecond-level task scheduling algorithms. However, these algorithms focus primarily on intra-server scheduling and overlook the influence of inter-server scheduling. They also fail to account for inconsistency in the order in which tasks from different jobs execute on multiple servers, which can delay job completion and, in turn, the response to users. To solve these problems, this paper proposes CLLSched, a two-level consistent low-latency scheduling algorithm for microsecond-level tasks that aims to achieve both low tail completion time and high CPU utilization on servers. Across servers, CLLSched employs a server selection strategy based on the power-of-k-choice principle. Within each server, it implements a fine-grained dynamic core allocation strategy based on task types. According to simulation results, compared with the most advanced counterpart, RackSched, CLLSched achieves a 2.78x reduction in tail latency for short tasks, improves CPU utilization by 1.14x, and increases cluster throughput by 1.23x. More importantly, it reduces job completion times by at least 60%.
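
The server selection principle named in the abstract can be illustrated with a minimal sketch. This is not CLLSched's actual strategy, which the abstract does not detail; it is only the generic power-of-k-choice idea: sample k servers at random and send the task to the least loaded of them. The use of queue length as the load signal, and the names `pick_server` and `simulate`, are assumptions made for illustration.

```python
import random


def pick_server(queue_lengths, k=2, rng=random):
    """Sample k distinct servers uniformly at random and return the
    index of the one with the shortest queue (generic power-of-k-choice).

    Queue length as the load metric is an assumption of this sketch.
    """
    if not queue_lengths:
        raise ValueError("at least one server is required")
    k = min(k, len(queue_lengths))
    candidates = rng.sample(range(len(queue_lengths)), k)
    return min(candidates, key=lambda i: queue_lengths[i])


def simulate(num_servers, num_tasks, k, seed=0):
    """Place num_tasks tasks one at a time (no departures) and return
    the resulting per-server queue lengths."""
    rng = random.Random(seed)
    queues = [0] * num_servers
    for _ in range(num_tasks):
        queues[pick_server(queues, k, rng)] += 1
    return queues


if __name__ == "__main__":
    for k in (1, 2, 4):
        queues = simulate(num_servers=100, num_tasks=10_000, k=k)
        print(f"k={k}: max queue length = {max(queues)}")
```

With k=1 this degenerates to purely random placement; sampling even two servers typically narrows the gap between the longest queue and the average, which is why the principle is attractive when querying every server per task would be too costly.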

Acknowledgments

The authors gratefully acknowledge the anonymous reviewers for their constructive comments. This work is supported in part by the Fundamental Research Funds for the Central Universities under Grant No. NS2023049 and by the National Natural Science Foundation of China (NSFC) under Grant No. 62132007.

Author information

Authors and Affiliations

College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, China
Qiuyu Yu, Tong Zhang & Changyan Yi

Corresponding author

Correspondence to Tong Zhang.

Editor information

Editors and Affiliations

Georgia State University, Atlanta, GA, USA
Zhipeng Cai
Old Dominion University, Norfolk, VA, USA
Daniel Takabi
Beijing University of Posts and Telecommunications, Beijing, China
Shaoyong Guo
Shandong University, Qingdao, China
Yifei Zou

Cite this paper

Yu, Q., Zhang, T., Yi, C. (2025). Consistent Low-Latency Scheduling for Microsecond-Scale Tasks in Data Centers. In: Cai, Z., Takabi, D., Guo, S., Zou, Y. (eds) Wireless Artificial Intelligent Computing Systems and Applications. WASA 2024. Lecture Notes in Computer Science, vol 14999. Springer, Cham. https://doi.org/10.1007/978-3-031-71470-2_9

DOI: https://doi.org/10.1007/978-3-031-71470-2_9
Published: 13 November 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-71469-6
Online ISBN: 978-3-031-71470-2
eBook Packages: Computer Science, Computer Science (R0)
