DOI: 10.1145/3429351.3431747
short-paper

Minimizing Cost and Maximizing Performance for Cloud Platforms

Published: 22 December 2020

ABSTRACT

Cloud computing is growing rapidly as more tenants adopt the cloud for its elasticity, availability, and flexibility across a wide range of applications. To cater efficiently to different tenant requirements, cloud providers have steadily evolved to offer a myriad of resource and service types, which inherently complicates the cloud adoption process. At the same time, the continued growth in the number of cloud tenants impels providers to expand their datacenters to cope with tenant demand. The objective of this proposal is to maximize performance and minimize cost for both tenants and cloud providers by providing efficient means of managing resource allocations for their applications. To this end, the proposal comprises three intertwined tasks. First, we start from a tenant perspective, with the first two tasks aimed at investigating the primary reasons for performance-cost inefficiency. Second, from a provider perspective, the third task investigates the primary reasons for performance-energy inefficiency in datacenters. Together, the three tasks can improve the performance and cost efficiency of emerging applications in next-generation cloud platforms.


    • Published in

      Middleware'20 Doctoral Symposium: Proceedings of the 21st International Middleware Conference Doctoral Symposium
      December 2020
      55 pages
      ISBN: 9781450382007
      DOI: 10.1145/3429351

      Copyright © 2020 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Qualifiers

      • short-paper
      • Research
      • Refereed limited

      Acceptance Rates

      Overall Acceptance Rate: 203 of 948 submissions, 21%
