Resource Management for Running HPC Applications in Container Clouds

  • Conference paper
High Performance Computing (ISC High Performance 2016)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9697)

Abstract

Innovations in operating-system-level virtualization technologies such as resource control groups, isolated namespaces, and layered file systems have driven a new breed of virtualization solutions called containers. Applications running in containers depend on the host operating system (OS) for resource allocation, throttling, and prioritization. However, the OS is designed to provide only best-effort/fair-share resource allocation. The lack of resource management of the kind that virtual machine managers provide restricts containers and container-based clusters to a subset of workloads that excludes traditional high-performance computing (HPC) workflows. In this paper, we describe problems with the fair-share resource management of CPUs, network bandwidth, and I/O bandwidth on HPC workloads and present mechanisms to allocate, throttle, and prioritize each of these three critical resources in containerized HPC environments. These mechanisms enable container-based HPC clusters to host applications with different resource requirements and enforce effective resource use so that a large collection of HPC applications can benefit from the flexibility, portability, and agility of containers.
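
To make the fair-share behavior concrete, the following is a minimal sketch and not the mechanism presented in the paper. It assumes the proportional-weight model of the Linux cgroup v1 `cpu.shares` setting (default weight 1024), under which containers competing for the same CPUs receive time in proportion to their weights. The function name, container names, and weights are hypothetical and chosen only for illustration.

```python
def cpu_fractions(shares):
    """Return each container's CPU fraction when all containers are busy.

    `shares` maps a (hypothetical) container name to its cpu.shares weight.
    The model only applies under contention; an idle container's time is
    available to the others.
    """
    total = sum(shares.values())
    if total <= 0:
        raise ValueError("at least one container needs a positive weight")
    return {name: weight / total for name, weight in shares.items()}


# Default weights: the OS splits the CPU evenly (plain fair share).
equal = cpu_fractions({"hpc_job": 1024, "batch_job": 1024})

# Raising one weight prioritizes that container under contention.
weighted = cpu_fractions({"hpc_job": 3072, "batch_job": 1024})

print(equal)
print(weighted)
```

With equal weights neither application is favored, which is the best-effort behavior the abstract describes; changing the weights is one simple way to express a priority between containers.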

References

  1. Block I/O Controller. https://www.kernel.org/doc/Documentation/cgroups/blkio-controller.txt

  2. CGroups. https://www.kernel.org/doc/Documentation/cgroups/

  3. Cloud Foundry Warden. https://github.com/cloudfoundry/warden

  4. Kubernetes by Google. http://kubernetes.io

  5. Linux Advanced Traffic Control. http://lartc.org/howto/

  6. Network Classifier CGroup. https://www.kernel.org/doc/Documentation/cgroups/net_cls.txt

  7. Blagodurov, S., Fedorova, A.: Towards the contention-aware scheduling in HPC cluster environment. J. Phys. Conf. Ser. 385(1), 012010 (2012)

  8. Dandapanthula, N., Stanfield, J.: High Performance Computing - Containers, Docker, Virtual Machines and HPC. http://en.community.dell.com/techcenter/high-performance-computing/b/general_hpc/archive/2014/11/04/containers-docker-virtual-machines-and-hpc

  9. Diaz, J.M.M., Landwehr, A., Taufer, M.: Poster: resource management layers for dynamic CPU resource allocation in containerized cloud environments. In: Proceedings of the IEEE Cluster 2015 Conference, pp. 1–2, September 2015

  10. Dongarra, J.J.: Performance of various computers using standard linear equations software. SIGARCH Comput. Archit. News 20(3), 22–44 (1992)

  11. Dusia, A., Yang, Y., Taufer, M.: Poster: network quality of service in Docker containers. In: Proceedings of the IEEE Cluster 2015 Conference, pp. 1–2, September 2015

  12. Dusia, A., Yang, Y.: Network QoS Mechanism - Docker, May 2015. https://github.com/adusia/docker

  13. Feitelson, D.G., Rudolph, L.: Parallel job scheduling: issues and approaches. In: Feitelson, D.G., Rudolph, L. (eds.) IPPS-WS 1995 and JSSPP 1995. LNCS, vol. 949, pp. 1–18. Springer, Heidelberg (1995)

  14. Felter, W., Ferreira, A., Rajamony, R., Rubio, J.: An updated performance comparison of virtual machines and Linux containers. In: 2015 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS) (2015)

  15. Herbein, S.: I/O QoS Mechanism - Docker Swarm, May 2015. https://github.com/SteVwonder/swarm

  16. Hong, J., Balaji, P., Wen, G., Tu, B., Yan, J., Xu, C., Feng, S.: Container-based job management for fair resource sharing. In: Kunkel, J.M., Ludwig, T., Meuer, H.W. (eds.) ISC 2013. LNCS, vol. 7905, pp. 290–301. Springer, Heidelberg (2013)

  17. Jacobsen, D., Canon, R.: Contain this, unleashing Docker for HPC. In: Cray User Group (CUG 2015), Chicago, IL, April 2015

  18. Kamp, P.H., Watson, R.N.M.: Jails: confining the omnipotent root. In: Proceedings of the 2nd International SANE Conference (2000)

  19. McDaniel, S., Herbein, S., Taufer, M.: Poster: a two-tiered approach to I/O quality of service in Linux. In: Proceedings of the IEEE Cluster 2015 Conference, pp. 1–2, September 2015

  20. McDaniel, S.: I/O QoS Mechanism - Docker, May 2015. https://github.com/seanmcdaniel/docker/

  21. Merkel, D.: Docker: lightweight Linux containers for consistent development and deployment. Linux J. 2014(239), 2 (2014)

  22. Monsalve, J., Landwehr, A.: CPU QoS Mechanism - Docker, May 2015. https://github.com/josemonsalve2/docker

  23. Ruiz, C., Jeanvoine, E., Nussbaum, L.: Performance evaluation of containers for HPC. In: Hunold, S., et al. (eds.) Euro-Par 2015 Workshops. LNCS, vol. 9523, pp. 813–824. Springer, Heidelberg (2015). doi:10.1007/978-3-319-27308-2_65

  24. Sironi, F., Bartolini, D., Campanoni, S., Cancare, F., Hoffmann, H., Sciuto, D., Santambrogio, M.: Metronome: operating system level performance management via self-adaptive computing. In: Design Automation Conference (DAC), 2012 49th ACM/EDAC/IEEE, pp. 856–865, June 2012

  25. Soltesz, S., Pötzl, H., Fiuczynski, M.E., Bavier, A., Peterson, L.: Container-based operating system virtualization: a scalable, high-performance alternative to hypervisors. In: Proceedings of the 2nd ACM SIGOPS/EuroSys European Conference on Computer Systems 2007, pp. 275–287 (2007)

  26. Vaughan-Nichols, S.: New approach to virtualization is a lightweight. Computer 39(11), 12–14 (2006)

  27. Xavier, M., Neves, M., Rossi, F., Ferreto, T., Lange, T., De Rose, C.: Performance evaluation of container-based virtualization for high performance computing environments. In: 2013 21st Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP), pp. 233–240, February 2013

Acknowledgment

This work is supported by NSF grants #312259 and #312236. We also thank IBM for providing access to their Softlayer (http://www.softlayer.com) and Supervessel (https://ptopenlab.com) cloud platforms and for providing guidance on container technologies. Our code can be found on GitHub at [12, 15, 20, 22].

Author information

Corresponding author

Correspondence to Michela Taufer.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Herbein, S. et al. (2016). Resource Management for Running HPC Applications in Container Clouds. In: Kunkel, J., Balaji, P., Dongarra, J. (eds) High Performance Computing. ISC High Performance 2016. Lecture Notes in Computer Science, vol 9697. Springer, Cham. https://doi.org/10.1007/978-3-319-41321-1_14

  • DOI: https://doi.org/10.1007/978-3-319-41321-1_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-41320-4

  • Online ISBN: 978-3-319-41321-1

  • eBook Packages: Computer Science, Computer Science (R0)