K-AGRUED: A Container Autoscaling Technique for Cloud-based Web Applications in Kubernetes Using Attention-based GRU Encoder-Decoder

Dogani, Javad; Khunjush, Farshad; Seydali, Mehdi

doi:10.1007/s10723-022-09634-x

Published: 01 December 2022

Volume 20, article number 40, (2022)

Journal of Grid Computing

Abstract

Cloud service providers can operate several execution instances on a single physical server using virtualization technology, which improves resource utilization. In recent years, container-based virtualization has been developed as a remarkably lightweight alternative to virtual machines. Containers consume less memory than virtual machines, enabling faster setup and portability. Cloud-based applications require dynamic resource allocation in response to fluctuations in the number of incoming requests. Most articles on proactive autoscaling in cloud computing have three shortcomings. 1) During feature extraction, the temporal patterns of the data are ignored, and the historical sequences are assigned equal weight. 2) Existing research omits cooldown time (CDT) from the planning phase. 3) Scaling operations can be performed at any time depending only on the current input workload, resulting in a large number of contradictory scaling actions. In response to these shortcomings, this paper presents a proactive autoscaling method for web applications in Kubernetes using an attention-based gated recurrent unit (GRU) encoder-decoder (K-AGRUED), which predicts the resource usage of several future steps based on CDT. The results demonstrate that the proposed method reduces prediction error by 2–25% compared to state-of-the-art methods. Our approach significantly reduces scaling operations and under-provisioning compared to the standard horizontal pod autoscaler (HPA) of Kubernetes and two previous studies. K-AGRUED increases the scaling speedup by a factor of up to five in a real environment.
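To make the role of CDT in the planning phase concrete, the following is a minimal sketch, not the authors' algorithm. It assumes a multi-step forecast is already available (for example from an encoder-decoder predictor), that one forecast step corresponds to one planning interval, and that each pod is sized for a fixed target load. The function names, parameters, and units (`desired_replicas`, `plan_scaling`, millicores) are hypothetical and chosen only for illustration.

```python
import math
from typing import Sequence


def desired_replicas(predicted_load: Sequence[float],
                     target_per_pod: float,
                     min_replicas: int = 1,
                     max_replicas: int = 10) -> int:
    """Size the deployment for the peak predicted load in the window.

    predicted_load: forecast resource usage per future step (hypothetical
    unit: CPU millicores summed over all pods).
    target_per_pod: load one pod is expected to absorb (same unit).
    """
    if not predicted_load:
        raise ValueError("predicted_load must contain at least one step")
    if target_per_pod <= 0:
        raise ValueError("target_per_pod must be positive")
    peak = max(predicted_load)
    needed = math.ceil(peak / target_per_pod)
    # Clamp to the configured bounds, as an autoscaler typically would.
    return max(min_replicas, min(max_replicas, needed))


def plan_scaling(current_replicas: int,
                 forecast: Sequence[float],
                 target_per_pod: float,
                 steps_since_last_scale: int,
                 cooldown_steps: int) -> int:
    """Return the replica count to apply at this planning step.

    While the cooldown is still running, no scaling action is taken.
    Once it has expired, the decision covers the next `cooldown_steps`
    forecast steps, because no further action can follow during them.
    """
    if steps_since_last_scale < cooldown_steps:
        return current_replicas
    return desired_replicas(forecast[:cooldown_steps], target_per_pod)


if __name__ == "__main__":
    forecast = [300.0, 900.0, 450.0, 5000.0]
    print(plan_scaling(2, forecast, 250.0,
                       steps_since_last_scale=3, cooldown_steps=3))
```

Sizing for the peak over the cooldown window is one simple way to avoid under-provisioning between allowed scaling actions; other aggregation rules (a mean or a high percentile, for instance) would trade cost against risk differently.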

Data Availability

The datasets analyzed in scenario 1 and scenario 2 of the experiments in the current study are available at ftp://ftp.ita.ee.lbl.gov/html/contrib/NASA-HTTP.html and ftp://ftp.ita.ee.lbl.gov/html/contrib/WorldCup.html, respectively. The datasets analyzed in scenario 3 of the current study are available from the corresponding author on reasonable request.

Author information

Authors and Affiliations

Department of Computer Science and Engineering and IT, School of Electrical and Computer Engineering, Shiraz University, Mollasadara St., Shiraz, 71348-51154, Iran
Javad Dogani, Farshad Khunjush & Mehdi Seydali

Contributions

Javad Dogani: Methodology, Software, Simulation, Writing – original draft.

Farshad Khunjush: Conceptualization, Validation, Methodology, Writing – review & editing.

Mehdi Seydali: Conceptualization, Software, Writing – review & editing.

Corresponding author

Correspondence to Farshad Khunjush.

Ethics declarations

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Dogani, J., Khunjush, F. & Seydali, M. K-AGRUED: A Container Autoscaling Technique for Cloud-based Web Applications in Kubernetes Using Attention-based GRU Encoder-Decoder. J Grid Computing 20, 40 (2022). https://doi.org/10.1007/s10723-022-09634-x

Received: 13 November 2021
Accepted: 18 November 2022
Published: 01 December 2022
DOI: https://doi.org/10.1007/s10723-022-09634-x
