DOI: 10.1145/3620678.3624645
research-article
Best Paper

Golgi: Performance-Aware, Resource-Efficient Function Scheduling for Serverless Computing

Published: 31 October 2023

ABSTRACT

This paper introduces Golgi, a novel scheduling system for serverless functions that aims to minimize resource provisioning costs while meeting function latency requirements. To achieve this, Golgi judiciously overcommits functions based on their past resource usage. To ensure that overcommitment does not cause significant performance degradation, Golgi identifies nine low-level metrics that capture the runtime performance of functions, covering factors such as request load, resource allocation, and contention on shared resources. These metrics enable accurate prediction of function performance using the Mondrian Forest, a classification model that is updated continuously in real time to stay accurate without extensive offline training. Golgi employs a conservative exploration-exploitation strategy for request routing. By default, it routes requests to non-overcommitted instances to ensure satisfactory performance. However, it actively explores opportunities to use more resource-efficient overcommitted instances while maintaining the specified latency SLOs. Golgi also performs vertical scaling to dynamically adjust the concurrency of overcommitted instances, maximizing request throughput and making the system more robust to prediction errors. We have prototyped Golgi and evaluated it in both an EC2 cluster and a small production cluster. The results show that Golgi can meet the SLOs while reducing the resource provisioning cost by 42% in the EC2 cluster and by 30% in our production cluster.
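The routing strategy described in the abstract can be illustrated with a minimal Python sketch. This is not Golgi's implementation: the `Instance` type, the `route` function, the `explore_prob` knob, and the `predicts_slo_ok` callback (a stand-in for the online classifier) are all hypothetical names introduced here, and the abstract does not say how exploration is actually triggered. The sketch only shows the general shape of a conservative policy: non-overcommitted instances are the default, and an overcommitted instance is used only when the predictor expects the latency SLO to hold.

```python
import random
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Instance:
    name: str
    overcommitted: bool
    metrics: Sequence[float]  # hypothetical runtime metric vector for this instance


def route(
    instances: Sequence[Instance],
    predicts_slo_ok: Callable[[Sequence[float]], bool],
    explore_prob: float,
    rng: random.Random,
) -> Instance:
    """Pick an instance for one request (illustrative policy, not Golgi's).

    Default: a non-overcommitted instance. With probability explore_prob
    (an assumed knob), consider an overcommitted instance, but use it only
    if the classifier predicts that the latency SLO will still be met.
    """
    safe = [i for i in instances if not i.overcommitted]
    packed = [i for i in instances if i.overcommitted]

    # Exploration step: try a resource-efficient overcommitted instance.
    if packed and rng.random() < explore_prob:
        candidate = rng.choice(packed)
        if predicts_slo_ok(candidate.metrics):
            return candidate

    # Conservative default: stay on non-overcommitted instances.
    if safe:
        return rng.choice(safe)

    # No non-overcommitted instance exists: fall back to any instance.
    return rng.choice(list(instances))
```

In this sketch a rejected exploration always falls back to the default path, so a pessimistic predictor costs efficiency rather than latency. That is one plausible reading of "conservative"; the paper itself should be consulted for the actual mechanism.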

    • Published in

      SoCC '23: Proceedings of the 2023 ACM Symposium on Cloud Computing
      October 2023
      624 pages
      ISBN: 9798400703874
      DOI: 10.1145/3620678

      Copyright © 2023 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Qualifiers

      • research-article
      • Research
      • Refereed limited

      Acceptance Rates

      Overall Acceptance Rate: 169 of 722 submissions, 23%