DOI: 10.1145/3412841.3441993
research-article

Quantifying context switch overhead of artificial intelligence workloads on the cloud and edges

Published: 22 April 2021

Abstract

Context switching is the fundamental technique for flexible and efficient use of CPU resources in multitasking systems. However, context switching also introduces non-trivial overhead because of the complicated work it involves, which both degrades application performance significantly and lowers overall system efficiency. In this paper, we perform a comprehensive empirical study of the performance and overhead of context switches in modern artificial intelligence workloads, and identify important, previously unreported impacts on cloud servers and edge/IoT devices. Our observations and root cause analysis shed light on how to optimize the system stack of modern operating systems to support artificial intelligence workloads more efficiently on cloud and edge systems.

Published In

SAC '21: Proceedings of the 36th Annual ACM Symposium on Applied Computing
March 2021
2075 pages
ISBN:9781450381048
DOI:10.1145/3412841

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. artificial intelligence
  2. cloud
  3. context switch
  4. edge
  5. system

Qualifiers

  • Research-article

Funding Sources

  • Kennesaw State University

Conference

SAC '21
SAC '21: The 36th ACM/SIGAPP Symposium on Applied Computing
March 22 - 26, 2021
Virtual Event, Republic of Korea

Acceptance Rates

Overall Acceptance Rate 1,650 of 6,669 submissions, 25%

Upcoming Conference

SAC '25
The 40th ACM/SIGAPP Symposium on Applied Computing
March 31 - April 4, 2025
Catania, Italy

Article Metrics

  • Downloads (Last 12 months): 68
  • Downloads (Last 6 weeks): 6
Reflects downloads up to 05 Mar 2025

Cited By

  • (2024) AI-Driven Cloud Computing to Revolutionize Industries and Overcome Challenges. Emerging Trends in Cloud Computing Analytics, Scalability, and Service Models. DOI: 10.4018/979-8-3693-0900-1.ch021 (395-410). Online publication date: 25-Jan-2024
  • (2024) MESC: Re-thinking Algorithmic Priority and/or Criticality Inversions for Heterogeneous MCSs. 2024 IEEE Real-Time Systems Symposium (RTSS). DOI: 10.1109/RTSS62706.2024.00011 (1-14). Online publication date: 10-Dec-2024
  • (2024) An Improved Linux Priority Scheduling Method Based on XGBoost. 2024 20th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD). DOI: 10.1109/ICNC-FSKD64080.2024.10702280 (1-8). Online publication date: 27-Jul-2024
  • (2024) Coarse-to-Fine: A hierarchical DNN inference framework for edge computing. Future Generation Computer Systems. DOI: 10.1016/j.future.2024.03.009, vol. 157 (180-192). Online publication date: Aug-2024
  • (2023) A Robust Scheduling Algorithm for Overload-Tolerant Real-Time Systems. 2023 IEEE 26th International Symposium on Real-Time Distributed Computing (ISORC). DOI: 10.1109/ISORC58943.2023.00013 (1-10). Online publication date: May-2023
  • (2022) Preferred Benchmarking Criteria for Systematic Taxonomy of Embedded Platforms (STEP) in Human System Interaction Systems. 2022 15th International Conference on Human System Interaction (HSI). DOI: 10.1109/HSI55341.2022.9869470 (1-7). Online publication date: 28-Jul-2022
