research-article

GRAF: a graph neural network based proactive resource allocation framework for SLO-oriented microservices

Authors:

Byungkwon Choi,

Dongsu Han

CoNEXT '21: Proceedings of the 17th International Conference on emerging Networking EXperiments and Technologies

Pages 154 - 167

https://doi.org/10.1145/3485983.3494866

Published: 03 December 2021

Abstract

Microservices are an architectural style widely adopted in latency-sensitive applications. As with monolithic applications, operators have turned to autoscaling to manage the resource utilization of microservices. However, optimizing resources against a latency service-level objective (SLO) without human intervention remains challenging. In this paper, we present GRAF, a graph neural network-based proactive resource allocation framework that minimizes total CPU resources while satisfying the latency SLO. GRAF leverages front-end workload, distributed tracing data, and machine learning approaches to (a) observe and estimate the impact of traffic changes, (b) find optimal resource combinations, and (c) make proactive resource allocations. Experiments using various open-source benchmarks demonstrate that GRAF successfully targets the latency SLO while saving up to 19% of total CPU resources compared to a fine-tuned autoscaler. Moreover, GRAF handles traffic surges with 36% fewer resources while achieving up to 2.6x faster tail latency convergence compared to the Kubernetes autoscaler.
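The objective stated in the abstract (minimize total CPU subject to a latency SLO) can be illustrated with a small sketch. This is not GRAF's method: the latency predictor below is a hypothetical closed-form stand-in for the learned graph neural network, the greedy search is a simple substitute for whatever optimizer the paper uses, and the service names, parameters, and the functions `predict_latency_ms` and `allocate` are all assumptions made for illustration.

```python
from typing import Dict, Tuple

# Hypothetical per-service parameters (not from the paper):
# (base latency in ms, ms of CPU work per request-per-second at one core).
SERVICES: Dict[str, Tuple[float, float]] = {
    "frontend": (2.0, 0.8),
    "cart": (1.0, 0.4),
    "checkout": (3.0, 1.2),
}


def predict_latency_ms(cpu_millicores: Dict[str, int], rps: float) -> float:
    """Toy stand-in for a learned latency model: latency falls as CPU grows."""
    total = 0.0
    for name, (base_ms, cost) in SERVICES.items():
        cores = cpu_millicores[name] / 1000.0
        total += base_ms + cost * rps / cores
    return total


def allocate(rps: float, slo_ms: float, step: int = 100,
             start: int = 100, limit: int = 8000) -> Dict[str, int]:
    """Greedily add CPU where it lowers predicted latency most, until the SLO holds."""
    cpu = {name: start for name in SERVICES}
    while predict_latency_ms(cpu, rps) > slo_ms:
        candidates = [n for n in cpu if cpu[n] + step <= limit]
        if not candidates:
            raise ValueError("SLO unreachable within the per-service CPU limit")
        # Choose the service whose extra CPU step yields the lowest predicted latency.
        best = min(
            candidates,
            key=lambda n: predict_latency_ms({**cpu, n: cpu[n] + step}, rps),
        )
        cpu[best] += step
    return cpu


if __name__ == "__main__":
    plan = allocate(rps=100, slo_ms=50)
    print(plan, sum(plan.values()), predict_latency_ms(plan, 100))
```

Because the allocation is computed from the expected front-end request rate rather than from observed utilization, it can be applied before a surge arrives, which is the sense in which such a scheme is proactive. The greedy loop does not guarantee a globally minimal CPU total; it only stops at the first allocation that meets the SLO under the toy model.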

Supplementary Material

MP4 File (3494866-presentation.mp4)

Presentation video. "GRAF: A Graph Neural Network based Proactive Resource Allocation Framework for SLO-Oriented Microservices"

Index Terms

GRAF: a graph neural network based proactive resource allocation framework for SLO-oriented microservices
1. Networks
  1. Network algorithms
    1. Control path algorithms
      1. Network resources allocation
2. Software and its engineering
  1. Software organization and properties
    1. Software system structures
      1. Distributed systems organizing principles
        Cloud computing

Published In

CoNEXT '21: Proceedings of the 17th International Conference on emerging Networking EXperiments and Technologies

December 2021

507 pages

ISBN:9781450390989

DOI:10.1145/3485983

General Chairs:
Georg Carle (Technical University of Munich, Germany),
Jörg Ott (Technical University of Munich, Germany)

Copyright © 2021 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGCOMM: ACM Special Interest Group on Data Communication

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

CoNEXT '21

Sponsor:

SIGCOMM

CoNEXT '21: The 17th International Conference on emerging Networking EXperiments and Technologies

December 7 - 10, 2021

Virtual Event, Germany

Acceptance Rates

Overall Acceptance Rate 198 of 789 submissions, 25%
