Horizontal Pod Autoscaling for Precise Startup of AI Microservices at the Network Edge: A Hybrid Proactive and Reactive Approach | IEEE Conference Publication | IEEE Xplore

Abstract:

Providing AI microservices at the network edge, such as YOLOv5 for detecting mini-subjects and Unet for segmentation, has created many potential business opportunities. However, numerous AI microservices compete for resources, especially in a resource-limited environment. Resource scaling for AI microservices at edge nodes, which balances their resource utilization, therefore plays a key role in ensuring QoS when multiple AI microservices are provided, but it poses several challenges, including varying traffic requests, unstable node environments, and long startup times when scaling resources horizontally. To overcome these challenges, we propose a hybrid proactive and reactive auto-scaling policy called PRHAS. By predicting future traffic volume and the corresponding microservice startup time, the proposed PRHAS mechanism adapts computing resources to optimize scaling decisions in terms of normalizing microservice startup time. Compared to traditional k8s HPA methods, our policy not only enhances the QoS/SLO of edge AI microservices but also improves resource utilization at edge nodes.
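The abstract does not give the PRHAS algorithm itself, so the following is only a minimal sketch of how a reactive and a proactive replica count could be combined. The reactive part uses the standard Kubernetes HPA rule, ceil(currentReplicas * currentMetric / targetMetric), without its tolerance band or stabilization window. The proactive part, the function names, the per-pod capacity parameter, and the "take the larger of the two" combination are all illustrative assumptions, not the authors' method.

```python
import math
from typing import Sequence


def reactive_replicas(current_replicas: int, current_metric: float,
                      target_metric: float) -> int:
    """Standard HPA rule: ceil(currentReplicas * currentMetric / targetMetric)."""
    return max(1, math.ceil(current_replicas * current_metric / target_metric))


def lead_steps(startup_seconds: float, step_seconds: float) -> int:
    """Forecast steps to look ahead so a pod started now is ready in time (assumed)."""
    return max(1, math.ceil(startup_seconds / step_seconds))


def proactive_replicas(forecast_rps: Sequence[float], per_pod_rps: float,
                       startup_seconds: float, step_seconds: float) -> int:
    """Replicas needed for the traffic predicted one startup time ahead (assumed)."""
    # Clamp the look-ahead to the length of the available forecast.
    horizon = min(lead_steps(startup_seconds, step_seconds), len(forecast_rps))
    predicted = forecast_rps[horizon - 1]
    return max(1, math.ceil(predicted / per_pod_rps))


def hybrid_replicas(current_replicas: int, current_metric: float,
                    target_metric: float, forecast_rps: Sequence[float],
                    per_pod_rps: float, startup_seconds: float,
                    step_seconds: float, max_replicas: int) -> int:
    """Hypothetical combination: the larger of the two counts, capped at max_replicas."""
    reactive = reactive_replicas(current_replicas, current_metric, target_metric)
    proactive = proactive_replicas(forecast_rps, per_pod_rps,
                                   startup_seconds, step_seconds)
    return min(max_replicas, max(reactive, proactive))


if __name__ == "__main__":
    # Hypothetical inputs: 2 pods at 90% CPU against a 60% target, a 10 s
    # forecast step, and a 25 s predicted startup time.
    print(hybrid_replicas(2, 90, 60, [100, 150, 240], 50, 25, 10, 10))
```

The point of the sketch is that the predicted startup time sets how far ahead the forecast is read: a slower-starting microservice must be scaled against traffic further in the future, while the reactive term still covers load the forecast missed.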
Date of Conference: 25-26 October 2024
Date Added to IEEE Xplore: 16 December 2024
Conference Location: Hsinchu, Taiwan