ABSTRACT
Cloud computing is growing rapidly as more tenants adopt the cloud for its elasticity, availability, and flexibility across a wide range of applications. To cater efficiently to different tenant requirements, cloud providers have steadily expanded their offerings to include a myriad of resource and service types, which inherently complicates the cloud adoption process. At the same time, the continued growth in the number of cloud tenants impels providers to expand their datacenters to cope with tenant demand. The objective of this proposal is to maximize performance and minimize cost for both tenants and cloud providers by providing efficient means of managing resource allocations for their applications. To this end, the proposal comprises three intertwined tasks. The first two tasks take a tenant perspective and investigate the primary reasons for performance-cost inefficiency. The third task takes a provider perspective and investigates the primary reasons for performance-energy inefficiency in datacenters. Together, the three tasks can improve the performance and cost efficiency of emerging applications in next-generation cloud platforms.
Index Terms
- Minimizing Cost and Maximizing Performance for Cloud Platforms