DOI: 10.1145/3542929.3563477
research-article

The power of prediction: microservice auto scaling via workload learning

Published: 7 November 2022

ABSTRACT

When deploying microservices in production clusters, it is critical to scale containers automatically to improve cluster utilization and meet service level agreements (SLAs). Although reactive scaling approaches work well for monolithic architectures, they are not necessarily suitable for microservice frameworks because complex microservice call chains introduce long delays. In contrast, existing proactive approaches leverage end-to-end performance prediction for scaling, but cannot effectively handle microservice multiplexing and dynamic microservice dependencies.

In this paper, we present Madu, a proactive microservice auto-scaler that scales containers based on predictions for individual microservices. Madu learns workload uncertainty to handle the highly dynamic dependencies between microservices. Additionally, Madu adopts OS-level metrics to optimize resource usage while keeping scaling overhead under control. Experiments on large-scale deployments of microservices in Alibaba clusters show that the overall prediction accuracy of Madu reaches 92.3% on average, which is 13% higher than state-of-the-art approaches. Furthermore, experiments running real-world microservice benchmarks in a local cluster of 20 servers show that Madu can reduce overall resource usage by 1.7x compared to reactive solutions, while reducing end-to-end service latency by 50%.
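The abstract does not describe Madu's algorithm, so the following is only a rough sketch of the general idea of proactive scaling from a per-microservice workload prediction that carries an uncertainty estimate. Everything in it is an assumption made for illustration: the function name `replicas_for_prediction`, the parameters `mean_rps`, `std_rps`, and `per_container_rps`, the safety factor `k`, and the rule of provisioning for the predicted mean plus `k` standard deviations. It is not Madu's actual method.

```python
import math


def replicas_for_prediction(mean_rps, std_rps, per_container_rps,
                            k=2.0, min_replicas=1):
    """Hypothetical sizing rule, not taken from the paper.

    mean_rps          -- predicted request rate for one microservice
    std_rps           -- predicted uncertainty (standard deviation) of that rate
    per_container_rps -- assumed request rate a single container can sustain
    k                 -- assumed safety factor applied to the uncertainty
    """
    if per_container_rps <= 0:
        raise ValueError("per_container_rps must be positive")
    # Provision for an upper estimate of the workload: mean + k * std.
    upper_estimate = mean_rps + k * max(std_rps, 0.0)
    # Round up to whole containers and never go below the floor.
    return max(min_replicas, math.ceil(upper_estimate / per_container_rps))


if __name__ == "__main__":
    # A more uncertain prediction leads to more headroom being provisioned.
    print(replicas_for_prediction(900, 0, 100))
    print(replicas_for_prediction(900, 50, 100))
```

Under these assumptions, a larger predicted uncertainty yields more containers ahead of time, whereas a reactive scaler would add them only after load has already increased.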


    • Published in

      SoCC '22: Proceedings of the 13th Symposium on Cloud Computing
      November 2022
      574 pages
      ISBN:9781450394147
      DOI:10.1145/3542929

      Copyright © 2022 ACM


      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Acceptance Rates

      Overall Acceptance Rate: 169 of 722 submissions, 23%
