Abstract:
Cloud service providers often provision excessive resources to meet the desired Service Level Objectives (SLOs) by setting low CPU utilization targets. This can result in wasted resources and a noticeable increase in power consumption in large-scale cloud deployments. To address this issue, this paper presents DeepScaling, an innovative solution that minimizes resource cost while ensuring SLO requirements are met in a dynamic, large-scale production microservice-based system. DeepScaling introduces three components that adaptively refine the target CPU utilization of servers in the data center and maintain it at a stable level, meeting SLO constraints while using the minimum amount of system resources. First, DeepScaling forecasts the workload of each service using a spatio-temporal Graph Neural Network. Second, it estimates CPU utilization with a Deep Neural Network, taking into account factors such as periodic tasks and traffic. Finally, it uses a modified Deep Q-Network (DQN) to generate an autoscaling policy that controls service resources to maximize service stability while meeting SLOs. Evaluation of DeepScaling in Ant Group’s large-scale cloud environment shows that it outperforms state-of-the-art autoscaling approaches in maintaining stable performance and saving resources. Deployed in a real-world environment of 1900+ microservices, DeepScaling saves the provisioning of over 100,000 CPU cores per day on average.
Published in: IEEE/ACM Transactions on Networking (Volume: 32, Issue: 5, October 2024)
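
For illustration only, the sketch below shows how the three components described in the abstract could fit together in one autoscaling control loop. It is not the authors' code: the forecaster, CPU-utilization estimator, and scaling policy here are simple hypothetical stand-ins for the paper's spatio-temporal GNN, DNN, and modified DQN, and all names (e.g. forecast_workload, TARGET_UTIL) and constants are assumptions made for the example.

```python
# Hypothetical sketch of a DeepScaling-style control loop (not the authors' code).
# Stand-ins: moving-average forecaster (paper: spatio-temporal GNN),
# linear utilization estimator (paper: DNN), threshold policy (paper: modified DQN).

from collections import deque

TARGET_UTIL = 0.55   # assumed target CPU utilization the controller tries to hold stable
HISTORY_LEN = 12     # assumed number of past workload samples used for forecasting


def forecast_workload(history: deque) -> float:
    """Placeholder forecaster: moving average of recent requests/sec."""
    return sum(history) / len(history)


def estimate_cpu_util(workload: float, replicas: int) -> float:
    """Placeholder estimator: utilization grows linearly with per-replica load."""
    per_replica_capacity = 200.0  # assumed requests/sec one replica serves at 100% CPU
    return min(1.0, workload / (replicas * per_replica_capacity))


def scaling_action(predicted_util: float) -> int:
    """Placeholder policy: +1/-1/0 replicas, keeping utilization near TARGET_UTIL."""
    if predicted_util > TARGET_UTIL + 0.05:
        return +1  # scale out when utilization would overshoot the target
    if predicted_util < TARGET_UTIL - 0.05:
        return -1  # scale in when resources would otherwise be wasted
    return 0


def control_step(history: deque, replicas: int) -> int:
    """One autoscaling decision: forecast -> estimate utilization -> act."""
    workload = forecast_workload(history)
    util = estimate_cpu_util(workload, replicas)
    return max(1, replicas + scaling_action(util))


if __name__ == "__main__":
    history = deque([800, 850, 900, 950, 1000], maxlen=HISTORY_LEN)
    replicas = 8
    for incoming in [1100, 1200, 1300, 900, 700]:  # synthetic workload trace
        replicas = control_step(history, replicas)
        history.append(incoming)
        print(f"workload={incoming:5d} req/s  replicas={replicas}")
```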