The power of prediction: microservice auto scaling via workload learning

Authors:
Shutian Luo (University of Macau),
Huanle Xu (University of Macau),
Kejiang Ye (CAS),
Guoyao Xu (Alibaba Group),
Liping Zhang (Alibaba Group),
Guodong Yang (Alibaba Group),
Chengzhong Xu (University of Macau)

SoCC '22: Proceedings of the 13th Symposium on Cloud Computing, November 2022, Pages 355–369
https://doi.org/10.1145/3542929.3563477

Published: 07 November 2022

ABSTRACT

When deploying microservices in production clusters, it is critical to scale containers automatically, both to improve cluster utilization and to meet service level agreements (SLAs). Although reactive scaling approaches work well for monolithic architectures, they are not necessarily suitable for microservice frameworks because of the long delay caused by complex microservice call chains. In contrast, existing proactive approaches leverage end-to-end performance prediction for scaling, but cannot effectively handle microservice multiplexing and dynamic microservice dependencies.

In this paper, we present Madu, a proactive microservice auto-scaler that scales containers based on predictions for individual microservices. Madu learns workload uncertainty to handle the highly dynamic dependencies between microservices. Additionally, Madu adopts OS-level metrics to optimize resource usage while maintaining good control over scaling overhead. Experiments on large-scale deployments of microservices in Alibaba clusters show that the overall prediction accuracy of Madu reaches 92.3% on average, which is 13% higher than state-of-the-art approaches. Furthermore, experiments running real-world microservice benchmarks in a local cluster of 20 servers show that Madu can reduce overall resource usage by 1.7× compared to reactive solutions, while reducing end-to-end service latency by 50%.
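
To make the idea of scaling on a per-microservice workload prediction with uncertainty more concrete, the following is a minimal illustrative sketch. It is not Madu's algorithm, which the abstract does not specify. The function name `containers_needed`, the safety factor `k`, and the assumption that each container serves a fixed request rate are all hypothetical choices made for illustration only.

```python
import math


def containers_needed(pred_mean, pred_std, per_container_capacity,
                      k=2.0, min_containers=1):
    """Hypothetical sizing rule (an assumption, not the paper's method).

    Provision enough containers for the predicted mean workload plus
    k standard deviations of predictive uncertainty.

    pred_mean, pred_std: predicted request rate and its uncertainty
        for ONE microservice in the next scaling window (requests/s).
    per_container_capacity: assumed fixed request rate that a single
        container can serve (requests/s).
    """
    if per_container_capacity <= 0:
        raise ValueError("per_container_capacity must be positive")
    # Inflate the point prediction by an uncertainty margin.
    target_rate = pred_mean + k * pred_std
    return max(min_containers, math.ceil(target_rate / per_container_capacity))


# A more uncertain forecast yields a larger allocation for the same mean.
confident = containers_needed(900.0, 0.0, 100.0)
uncertain = containers_needed(900.0, 50.0, 100.0)
```

In this sketch a larger predicted uncertainty simply widens the provisioning margin. A real auto-scaler would also have to account for scaling overhead and container start-up delay, which this example ignores.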

Index Terms

Computer systems organization → Architectures → Distributed architectures → Cloud computing

Published in
SoCC '22: Proceedings of the 13th Symposium on Cloud Computing
November 2022, 574 pages
ISBN: 9781450394147
DOI: 10.1145/3542929
General Chair: Ada Gavrilovska (Georgia Institute of Technology)
Program Chairs: Deniz Altınbüken (Google Research), Carsten Binnig (TU Darmstadt)
Copyright © 2022 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
microservices
proactive auto-scaler
workload uncertainty learning
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 169 of 722 submissions, 23%