Analysis of Mixed Workloads from Shared Cloud Infrastructure

Klusáček, Dalibor; Parák, Boris

doi:10.1007/978-3-319-77398-8_2


Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10773)

Included in the following conference series:

Workshop on Job Scheduling Strategies for Parallel Processing

Abstract

Modern computing environments such as clouds, grids, or HPC clusters are complex and costly installations, so utilizing them properly has always been a major challenge. Workload scheduling is a critical process in every production system, and it can hamper overall performance if the scheduler is inadequate or poorly configured. Researchers and system administrators therefore frequently use historic workload traces to model and analyze the behavior of real systems in order to improve existing scheduling approaches. In this work we provide such real-life workload traces from the CERIT-SC system. Importantly, our traces describe a “mixed” workload consisting of both cloud VMs and grid jobs executed over a shared computing infrastructure. The provided workloads represent an interesting scheduling problem. First, mixed workloads involving both grid jobs and cloud VMs increase the complexity of the (co)scheduling needed to use the underlying physical infrastructure efficiently. Second, we provide a detailed description of the system setup, its operational constraints, and unresolved issues, putting the observed workloads into a broader context. Last but not least, the workloads are made freely available to the scientific community, allowing further independent research and analysis.

Notes

1. Nice overview can be found at: http://bit.ly/2kLf44d.
2. VM overcommitment factor is computed as \(\text{vCPUs}/\text{CPUs}\).
3. As discussed in Sect. 2.2, “grid worker” VMs are started/stopped by the system administrator, so the 72% utilization of cloud VMs is computed with respect to the remaining (i.e., available) capacity in the system (see Fig. 6).
4. It is the dukan cluster which is not part of the CERIT-SC infrastructure but it executes similar workloads from the same user-base.
5. This log is available at: https://github.com/CERIT-SC/cerit-maintenance.
6. For example, real schedulers must limit the number of concurrently running licensed applications (jobs using licensed SW) with respect to the number of available software licenses, i.e., even if resources are free some jobs must wait until a license is available. Such information is not usually recorded in the workload.
7. SWF format: http://www.cs.huji.ac.il/labs/parallel/workload/swf.html.
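
The two ratios defined in Notes 2 and 3 can be illustrated with a minimal Python sketch. The function names, parameter names, and all numbers below are assumptions made for illustration only; they are not taken from the CERIT-SC traces or from the authors' tooling.

```python
def overcommit_factor(vcpus: int, cpus: int) -> float:
    """VM overcommitment factor as defined in Note 2: vCPUs / CPUs."""
    if cpus <= 0:
        raise ValueError("cpus must be positive")
    return vcpus / cpus


def cloud_vm_utilization(cloud_used_cpus: float,
                         total_cpus: float,
                         grid_worker_cpus: float) -> float:
    """Utilization of cloud VMs relative to the capacity that remains
    after "grid worker" VMs are subtracted (the reading of Note 3
    assumed here)."""
    available = total_cpus - grid_worker_cpus
    if available <= 0:
        raise ValueError("no capacity left for cloud VMs")
    return cloud_used_cpus / available


if __name__ == "__main__":
    # Hypothetical host: 64 physical CPUs backing 96 virtual CPUs.
    print(overcommit_factor(vcpus=96, cpus=64))
    # Hypothetical system: 200 CPUs in total, 100 held by grid workers,
    # 72 used by cloud VMs.
    print(cloud_vm_utilization(72, 200, 100))
```

Computing utilization against the remaining capacity rather than the total avoids understating cloud usage when a large share of the system is reserved for grid workers.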

Acknowledgments

We kindly acknowledge the support and computational resources provided by the MetaCentrum under the program LM2015042 and the CERIT Scientific Cloud under the program LM2015085, provided under the programme “Projects of Large Infrastructure for Research, Development, and Innovations” and the project Reg. No. CZ.02.1.01/0.0/0.0/16_013/0001797 co-funded by the Ministry of Education, Youth and Sports of the Czech Republic. We also highly appreciate the access to CERIT Scientific Cloud workload traces.

Author information

Authors and Affiliations

CESNET a.l.e., Brno, Czech Republic
Dalibor Klusáček & Boris Parák

Corresponding author

Correspondence to Dalibor Klusáček.

Editor information

Editors and Affiliations

CESNET, Prague, Czech Republic
Dalibor Klusáček
Google, Mountain View, California, USA
Walfredo Cirne
Google, Seattle, Washington, USA
Narayan Desai

Cite this paper

Klusáček, D., Parák, B. (2018). Analysis of Mixed Workloads from Shared Cloud Infrastructure. In: Klusáček, D., Cirne, W., Desai, N. (eds) Job Scheduling Strategies for Parallel Processing. JSSPP 2017. Lecture Notes in Computer Science, vol 10773. Springer, Cham. https://doi.org/10.1007/978-3-319-77398-8_2

DOI: https://doi.org/10.1007/978-3-319-77398-8_2
Published: 28 February 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-77397-1
Online ISBN: 978-3-319-77398-8
eBook Packages: Computer Science, Computer Science (R0)
