DOI: 10.1145/3485983.3494866
research-article

GRAF: a graph neural network based proactive resource allocation framework for SLO-oriented microservices

Published: 03 December 2021

Abstract

Microservices are an architectural style widely adopted in latency-sensitive applications. As with monolithic applications, autoscaling has attracted operators' attention as a way to manage the resource utilization of microservices. However, optimizing resources against a latency service-level objective (SLO) without human intervention remains challenging. In this paper, we present GRAF, a graph neural network-based proactive resource allocation framework that minimizes total CPU resources while satisfying a latency SLO. GRAF leverages front-end workload, distributed tracing data, and machine learning approaches to (a) observe and estimate the impact of traffic changes, (b) find optimal resource combinations, and (c) allocate resources proactively. Experiments using various open-source benchmarks demonstrate that GRAF successfully targets the latency SLO while saving up to 19% of total CPU resources compared to a fine-tuned autoscaler. Moreover, GRAF handles traffic surges with 36% fewer resources while achieving up to 2.6x faster tail latency convergence compared to the Kubernetes autoscaler.
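The objective stated in the abstract (minimize total CPU subject to a latency SLO, given the front-end workload) can be illustrated with a small sketch. This is not GRAF's implementation: a toy queueing-style formula stands in for the learned graph neural network latency predictor, a greedy search stands in for whatever solver the paper uses, and the service names, `DEMAND`, `BASE_MS`, `predict_latency_ms`, and `allocate` are all hypothetical.

```python
from typing import Dict

# Hypothetical CPU demand per service: millicores consumed per request/second.
DEMAND: Dict[str, float] = {"frontend": 2.0, "cart": 1.0, "checkout": 4.0}
# Hypothetical unloaded service time per service, in milliseconds.
BASE_MS: Dict[str, float] = {"frontend": 5.0, "cart": 3.0, "checkout": 8.0}


def predict_latency_ms(cpu: Dict[str, float], rps: float) -> float:
    """Toy stand-in for a learned latency model (assumption, not the paper's GNN).

    Each service in a sequential chain contributes base / (1 - utilization);
    a saturated service makes the end-to-end latency unbounded.
    """
    total = 0.0
    for name, millicores in cpu.items():
        utilization = rps * DEMAND[name] / millicores
        if utilization >= 1.0:
            return float("inf")
        total += BASE_MS[name] / (1.0 - utilization)
    return total


def allocate(rps: float, slo_ms: float, step: float = 50.0,
             max_steps: int = 10_000) -> Dict[str, float]:
    """Greedy heuristic: grow the allocation until the predicted latency meets the SLO.

    Starts one step above saturation for every service, then repeatedly gives
    one more step of CPU to the service whose increase lowers predicted
    latency the most. Being driven by the expected workload `rps` rather than
    by observed utilization is what makes the allocation proactive.
    """
    cpu = {name: rps * DEMAND[name] + step for name in DEMAND}
    for _ in range(max_steps):
        if predict_latency_ms(cpu, rps) <= slo_ms:
            return cpu
        best = min(
            DEMAND,
            key=lambda n: predict_latency_ms({**cpu, n: cpu[n] + step}, rps),
        )
        cpu[best] += step
    raise RuntimeError("SLO not reachable within max_steps")


if __name__ == "__main__":
    plan = allocate(rps=100.0, slo_ms=40.0)
    print(plan, predict_latency_ms(plan, 100.0))
```

The greedy loop returns a feasible allocation but does not guarantee the minimum total CPU; it only shows the shape of the constrained problem, in which a latency predictor is queried repeatedly while per-service CPU is adjusted.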

Supplementary Material

MP4 File (3494866-presentation.mp4)
Presentation video. "GRAF: A Graph Neural Network based Proactive Resource Allocation Framework for SLO-Oriented Microservices"


Published In

CoNEXT '21: Proceedings of the 17th International Conference on emerging Networking EXperiments and Technologies
December 2021
507 pages
ISBN:9781450390989
DOI:10.1145/3485983
General Chairs: Georg Carle, Jörg Ott
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. applied machine learning
  2. autoscaler
  3. cloud computing
  4. graph neural networks
  5. microservices
  6. resources optimization

Qualifiers

  • Research-article

Conference

CoNEXT '21
Acceptance Rates

Overall Acceptance Rate 198 of 789 submissions, 25%

Article Metrics

  • Downloads (Last 12 months): 289
  • Downloads (Last 6 weeks): 44
Reflects downloads up to 07 Mar 2025

Cited By

  • (2025) Humas: A Heterogeneity- and Upgrade-Aware Microservice Auto-Scaling Framework in Large-Scale Data Centers. IEEE Transactions on Computers 74:3 (968-982). DOI: 10.1109/TC.2024.3506862. Online publication date: 1-Mar-2025
  • (2025) On the Stability of the Kubernetes Horizontal Autoscaler Control Loop. IEEE Access 13 (7160-7166). DOI: 10.1109/ACCESS.2025.3526751. Online publication date: 2025
  • (2025) Proactive–reactive microservice architecture global scaling. Journal of Systems and Software 220:C. DOI: 10.1016/j.jss.2024.112262. Online publication date: 1-Feb-2025
  • (2024) Autothrottle. Proceedings of the 21st USENIX Symposium on Networked Systems Design and Implementation (149-165). DOI: 10.5555/3691825.3691834. Online publication date: 16-Apr-2024
  • (2024) Kale: Elastic GPU Scheduling for Online DL Model Training. Proceedings of the 2024 ACM Symposium on Cloud Computing (36-51). DOI: 10.1145/3698038.3698532. Online publication date: 20-Nov-2024
  • (2024) TopFull: An Adaptive Top-Down Overload Control for SLO-Oriented Microservices. Proceedings of the ACM SIGCOMM 2024 Conference (876-890). DOI: 10.1145/3651890.3672253. Online publication date: 4-Aug-2024
  • (2024) Optimizing Resource Management for Shared Microservices: A Scalable System Design. ACM Transactions on Computer Systems 42:1-2 (1-28). DOI: 10.1145/3631607. Online publication date: 13-Feb-2024
  • (2024) PASS: Predictive Auto-Scaling System for Large-scale Enterprise Web Applications. Proceedings of the ACM Web Conference 2024 (2747-2758). DOI: 10.1145/3589334.3645330. Online publication date: 13-May-2024
  • (2024) PBScaler: A Bottleneck-Aware Autoscaling Framework for Microservice-Based Applications. IEEE Transactions on Services Computing 17:2 (604-616). DOI: 10.1109/TSC.2024.3376202. Online publication date: Mar-2024
  • (2024) Freyr +: Harvesting Idle Resources in Serverless Computing via Deep Reinforcement Learning. IEEE Transactions on Parallel and Distributed Systems 35:11 (2254-2269). DOI: 10.1109/TPDS.2024.3462294. Online publication date: Nov-2024
