ABSTRACT
When deploying microservices in production clusters, it is critical to scale containers automatically to improve cluster utilization and meet service-level agreements (SLAs). Although reactive scaling approaches work well for monolithic architectures, they are not necessarily suitable for microservice frameworks because of the long delays caused by complex microservice call chains. In contrast, existing proactive approaches leverage end-to-end performance prediction for scaling, but cannot effectively handle microservice multiplexing and dynamic microservice dependencies.
In this paper, we present Madu, a proactive microservice auto-scaler that scales containers based on predictions for individual microservices. Madu learns workload uncertainty to handle the highly dynamic dependencies between microservices. Additionally, Madu adopts OS-level metrics to optimize resource usage while keeping scaling overhead under control. Experiments on large-scale microservice deployments in Alibaba clusters show that Madu's overall prediction accuracy reaches 92.3% on average, which is 13% higher than state-of-the-art approaches. Furthermore, experiments running real-world microservice benchmarks in a local cluster of 20 servers show that Madu can reduce overall resource usage by 1.7× compared to reactive solutions, while reducing end-to-end service latency by 50%.
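To make the idea of scaling from a per-microservice prediction with uncertainty more concrete, the following is a minimal sketch and not Madu's actual algorithm. It assumes a predictor that returns a mean and a standard deviation of the upcoming workload for one microservice, and provisions for an upper bound of that prediction. The function name `replicas_for`, the safety factor `k`, and the per-container capacity are all hypothetical values chosen for illustration.

```python
import math


def replicas_for(pred_mean, pred_std, per_container_capacity, k=2.0, min_replicas=1):
    """Return a container count for one microservice.

    Illustrative assumption (not taken from the paper): provision for
    pred_mean + k * pred_std requests per second, so that a more
    uncertain prediction leads to more headroom.
    """
    if per_container_capacity <= 0:
        raise ValueError("per_container_capacity must be positive")
    # Upper bound on the predicted workload under the stated assumption.
    upper_bound = pred_mean + k * pred_std
    # Round up so the provisioned capacity always covers the bound.
    needed = math.ceil(upper_bound / per_container_capacity)
    return max(min_replicas, needed)


if __name__ == "__main__":
    # Hypothetical numbers: 900 req/s predicted, each container serves 100 req/s.
    print(replicas_for(900.0, 50.0, 100.0))   # low uncertainty
    print(replicas_for(900.0, 150.0, 100.0))  # high uncertainty, more headroom
```

In this sketch the same predicted mean yields a larger allocation when the predicted uncertainty is higher, which is one simple way an auto-scaler might trade resource usage against the risk of violating an SLA.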
Index Terms
- The power of prediction: microservice auto scaling via workload learning