Analysis of Mixed Workloads from Shared Cloud Infrastructure

  • Conference paper
Job Scheduling Strategies for Parallel Processing (JSSPP 2017)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 10773)

Abstract

Modern computing environments such as clouds, grids, and HPC clusters are complex and costly installations, so utilizing them well has always been a major challenge. Workload scheduling is a critical process in every production system, and a scheduler that is inadequate or poorly configured can hamper overall performance. Researchers and system administrators therefore frequently use historic workload traces to model and analyze the behavior of real systems in order to improve existing scheduling approaches. In this work we provide such real-life workload traces from the CERIT-SC system. Importantly, our traces describe a “mixed” workload consisting of both cloud VMs and grid jobs executed over a shared computing infrastructure. The provided workloads represent an interesting scheduling problem. First, mixing “grid jobs” and cloud VMs increases the complexity of the (co)scheduling needed to use the underlying physical infrastructure efficiently. Second, we provide a detailed description of the system setup, its operational constraints, and unresolved issues, putting the observed workloads into a broader context. Last but not least, the workloads are freely available to the scientific community, allowing for further independent research and analysis.

Notes

  1. A nice overview can be found at: http://bit.ly/2kLf44d.

  2. The VM overcommitment factor is computed as \( \text{vCPUs}/\text{CPUs} \).

  3. As discussed in Sect. 2.2, “grid worker” VMs are started and stopped by the system administrator, so the 72% utilization of cloud VMs is computed with respect to the remaining (i.e., available) capacity in the system (see Fig. 6).

  4. This is the dukan cluster, which is not part of the CERIT-SC infrastructure but executes similar workloads from the same user base.

  5. This log is available at: https://github.com/CERIT-SC/cerit-maintenance.

  6. For example, real schedulers must limit the number of concurrently running licensed applications (jobs using licensed SW) according to the number of available software licenses, i.e., even if resources are free, some jobs must wait until a license becomes available. Such information is usually not recorded in the workload.

  7. SWF format: http://www.cs.huji.ac.il/labs/parallel/workload/swf.html.
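
The two ratios defined in notes 2 and 3 can be illustrated with a minimal Python sketch. The function names, parameter names, and all numbers in the usage note are hypothetical and chosen only for illustration. They are not taken from the traces, and the paper's exact accounting may differ.

```python
def overcommit_factor(vcpus: int, cpus: int) -> float:
    """VM overcommitment factor as in note 2: vCPUs / CPUs."""
    if cpus <= 0:
        raise ValueError("cpus must be positive")
    return vcpus / cpus


def cloud_utilization(cloud_used_cpus: float,
                      total_cpus: float,
                      grid_worker_cpus: float) -> float:
    """Cloud VM utilization relative to available capacity (note 3).

    Assumption: "available" capacity is what remains after subtracting
    the CPUs held by administrator-managed "grid worker" VMs.
    """
    available = total_cpus - grid_worker_cpus
    if available <= 0:
        raise ValueError("no capacity left for cloud VMs")
    return cloud_used_cpus / available


if __name__ == "__main__":
    # Hypothetical host: 16 physical CPUs backing 40 virtual CPUs.
    print(overcommit_factor(vcpus=40, cpus=16))
    # Hypothetical system: 200 CPUs in total, 120 held by grid workers,
    # 60 used by cloud VMs.
    print(cloud_utilization(cloud_used_cpus=60, total_cpus=200,
                            grid_worker_cpus=120))
```

With these made-up inputs, the cloud VMs use 60 of the 80 CPUs not held by grid workers. Measured against the whole 200-CPU system, the same load would look much smaller, which is why note 3 computes utilization against the remaining capacity.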

Acknowledgments

We kindly acknowledge the support and computational resources provided by the MetaCentrum under the program LM2015042 and the CERIT Scientific Cloud under the program LM2015085, provided under the programme “Projects of Large Infrastructure for Research, Development, and Innovations” and the project Reg. No. CZ.02.1.01/0.0/0.0/16_013/0001797 co-funded by the Ministry of Education, Youth and Sports of the Czech Republic. We also highly appreciate the access to CERIT Scientific Cloud workload traces.

Author information

Corresponding author

Correspondence to Dalibor Klusáček.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Klusáček, D., Parák, B. (2018). Analysis of Mixed Workloads from Shared Cloud Infrastructure. In: Klusáček, D., Cirne, W., Desai, N. (eds) Job Scheduling Strategies for Parallel Processing. JSSPP 2017. Lecture Notes in Computer Science, vol 10773. Springer, Cham. https://doi.org/10.1007/978-3-319-77398-8_2

  • DOI: https://doi.org/10.1007/978-3-319-77398-8_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-77397-1

  • Online ISBN: 978-3-319-77398-8

  • eBook Packages: Computer Science, Computer Science (R0)
