DOI: 10.1145/3588195.3592996
research-article

Libra: Harvesting Idle Resources Safely and Timely in Serverless Clusters

Published: 07 August 2023

ABSTRACT

Serverless computing has been embraced by users and infrastructure providers across industries, including online services and scientific computing. Users benefit from its auto-scaling and ease of management, while providers gain more control to optimize their services. However, existing serverless platforms still require users to pre-define resource allocations for their functions, which in practice leads to frequent misconfiguration by inexperienced users. Moreover, functions' varying input data further widen the gap between their dynamic resource demands and static allocations, leaving functions either over-provisioned or under-provisioned. This paper presents Libra, a safe and timely resource harvesting framework for multi-node serverless clusters. By proactively profiling dynamic resource demands and availability, Libra makes precise harvesting decisions that accelerate function invocations with harvested resources while improving resource utilization. Experiments on OpenWhisk clusters with real-world workloads show that Libra reduces response latency by 39% and achieves 3× the resource utilization of state-of-the-art solutions.


Published in

HPDC '23: Proceedings of the 32nd International Symposium on High-Performance Parallel and Distributed Computing
August 2023
350 pages
ISBN: 9798400701559
DOI: 10.1145/3588195
General Chair: Ali R. Butt
Program Chairs: Ningfang Mi, Kyle Chard

Copyright © 2023 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher: Association for Computing Machinery, New York, NY, United States


Overall Acceptance Rate: 166 of 966 submissions, 17%
