ABSTRACT
Many organizations maintain and operate large shared computing clusters, since they can substantially reduce computing costs by leveraging statistical multiplexing to amortize them across all users. Importantly, such shared clusters are generally not free to use, but have an internal pricing model that funds their operation. Since employees at many large organizations, especially universities, have some budgetary autonomy over purchase decisions, internal shared clusters are increasingly competing for users with cloud platforms, which may offer lower costs and better performance. As a result, many organizations are shifting their shared clusters to operate on cloud resources. This paper empirically analyzes the user incentives for shared cloud clusters under two different pricing models using an 8-year job trace from a large shared cluster for a large university system.
Our analysis shows that, with either pricing model, a large fraction of users have little financial incentive to participate in a shared cloud cluster compared to directly acquiring resources from a cloud platform. While shared cloud clusters can provide some limited reductions in cost by leveraging reserved instances at a discount, bursty workloads mean that realizing these reductions generally requires imposing long job waiting times, which for many users are likely not worth the cost reduction. In particular, we show that, assuming users defect from the shared cluster if their wait time is greater than 15x their average job runtime, over 80% of the users would defect, which raises the price for the remaining users enough to eliminate any incentive to participate in a shared cluster. Thus, while shared cloud clusters may provide users other benefits, their financial incentives are weak.
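The defection argument above can be illustrated with a small sketch. It is not the paper's simulator or either of its pricing models: it assumes, purely for illustration, that the cluster has a fixed cost that is split across users in proportion to their compute hours, and that this fixed cost does not shrink when users leave. The `User` fields, the function names, and all numbers are hypothetical; only the 15x threshold comes from the abstract.

```python
from dataclasses import dataclass

# Threshold from the abstract: a user defects when their wait time
# exceeds 15x their average job runtime.
DEFECT_FACTOR = 15.0


@dataclass
class User:
    name: str
    avg_wait_hours: float     # hypothetical average job waiting time
    avg_runtime_hours: float  # hypothetical average job runtime
    usage_hours: float        # hypothetical total compute hours consumed


def defects(user: User, factor: float = DEFECT_FACTOR) -> bool:
    """Return True if the user's wait exceeds factor x their average runtime."""
    return user.avg_wait_hours > factor * user.avg_runtime_hours


def price_per_hour(users: list[User], fixed_cost: float) -> float:
    """ASSUMPTION: a fixed cluster cost is amortized over all compute hours."""
    total_hours = sum(u.usage_hours for u in users)
    return fixed_cost / total_hours if total_hours > 0 else float("inf")


def price_after_defection(users: list[User], fixed_cost: float):
    """Compare the hourly price before and after one round of defection."""
    before = price_per_hour(users, fixed_cost)
    stayers = [u for u in users if not defects(u)]
    after = price_per_hour(stayers, fixed_cost)
    return before, after, stayers


# Made-up example: two of the three users exceed the 15x threshold.
users = [
    User("a", avg_wait_hours=2.0, avg_runtime_hours=1.0, usage_hours=100.0),
    User("b", avg_wait_hours=40.0, avg_runtime_hours=2.0, usage_hours=300.0),
    User("c", avg_wait_hours=20.0, avg_runtime_hours=1.0, usage_hours=100.0),
]
before, after, stayers = price_after_defection(users, fixed_cost=1000.0)
print(before, after, [u.name for u in stayers])
```

In this toy setting the same fixed cost is spread over fewer compute hours once users leave, so the hourly price for those who stay goes up. A fuller model would also recompute waiting times after each round of defection, which this sketch deliberately omits.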
Index Terms
- Is Sharing Caring? Analyzing the Incentives for Shared Cloud Clusters