DOI: 10.1145/3437359.3465603
research-article

Real-World, Self-Hosted Kubernetes Experience

Published: 17 July 2021

Abstract

Containerized applications have exploded in popularity in recent years because they are easy to deploy, reproducible, and quick to start. Accordingly, container orchestration tools such as Kubernetes have emerged as resource providers and users alike try to organize and scale their work across clusters of systems. This paper documents real-world experiences of building, operating, and using self-hosted Kubernetes Linux clusters. It compares Kubernetes with single-node container solutions and with traditional multi-user, batch-queue Linux clusters.
The authors have a background in running traditional HPC Linux clusters and queuing systems such as Slurm, and later virtual machines using technologies such as OpenStack. Much of the experience described in the paper is informed by this background. We also provide a use case from a researcher who deployed on Kubernetes without being as opinionated about other potential choices.


Published In

PEARC '21: Practice and Experience in Advanced Research Computing 2021: Evolution Across All Dimensions
July 2021
310 pages
ISBN: 9781450382922
DOI: 10.1145/3437359
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Abaco
  2. Docker
  3. ETL
  4. Functions-as-a-Service
  5. Kubernetes
  6. Tapis
  7. bare-metal
  8. containers
  9. devops
  10. gitops
  11. repeatability
  12. reproducibility

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

PEARC '21
Acceptance Rates

Overall Acceptance Rate 133 of 202 submissions, 66%

Article Metrics

  • Downloads (last 12 months): 98
  • Downloads (last 6 weeks): 7
Reflects downloads up to 12 Feb 2025.

Cited By

  • (2025) Design and Operation of Shared Machine Learning Clusters on Campus. Proceedings of the 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 1, 295-310. DOI: 10.1145/3669940.3707266. Online publication date: 3-Feb-2025
  • (2024) Automated Testing of Over 1,000 Student Assignments: Benefits of Kubernetes. 2024 IEEE 22nd World Symposium on Applied Machine Intelligence and Informatics (SAMI), 475-480. DOI: 10.1109/SAMI60510.2024.10432890. Online publication date: 25-Jan-2024
  • (2024) Transitioning IIoT Data Processing: An Experience with InfluxDB on Kubernetes. 2024 11th International Conference on Future Internet of Things and Cloud (FiCloud), 247-252. DOI: 10.1109/FiCloud62933.2024.00045. Online publication date: 19-Aug-2024
  • (2022) Method for Continuous Integration and Deployment Using a Pipeline Generator for Agile Software Projects. Sensors 22:12 (4637). DOI: 10.3390/s22124637. Online publication date: 20-Jun-2022
