K-AGRUED: A Container Autoscaling Technique for Cloud-based Web Applications in Kubernetes Using Attention-based GRU Encoder-Decoder

Journal of Grid Computing

Abstract

Cloud service providers use virtualization technology to run several execution instances on a single physical server, which improves resource utilization. In recent years, container-based virtualization has emerged as a lightweight alternative to virtual machines. Containers consume less memory than virtual machines, enabling faster setup and portability. Cloud-based applications require dynamic resource allocation in response to fluctuations in the number of incoming requests. Most articles on proactive autoscaling in cloud computing have three shortcomings. 1) During feature extraction, the temporal patterns of the data are ignored, and all historical sequences are assigned equal weight. 2) Existing research omits cool-down time (CDT) from the planning phase. 3) Scaling operations can be performed at any time depending only on the current input workload, resulting in a large number of contradictory scaling actions. In response to these shortcomings, this paper presents K-AGRUED, a proactive autoscaling method for web applications in Kubernetes that uses an attention-based gated recurrent unit (GRU) encoder-decoder to predict resource usage several steps ahead based on the CDT. The results demonstrate that the proposed method reduces prediction error by 2–25% compared to state-of-the-art methods. Our approach significantly reduces scaling operations and under-provisioning compared to the standard Kubernetes horizontal pod autoscaler (HPA) and two previous studies. K-AGRUED increases the scaling speedup by a factor of up to five in a real environment.
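The idea of planning scaling actions around the cool-down time can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `ScalerState`, `desired_replicas`, and `plan_step`, the per-pod capacity model, and the replica limits are all hypothetical, and the multi-step forecast is assumed to come from some predictor (such as an encoder-decoder model) that is not shown here.

```python
import math
from dataclasses import dataclass


@dataclass
class ScalerState:
    """Hypothetical autoscaler state (illustrative, not from the paper)."""
    replicas: int
    last_scale_step: int = -10**9  # step index of the most recent scaling action


def desired_replicas(forecast_load, per_pod_capacity, min_replicas=1, max_replicas=20):
    """Replicas needed to cover the peak forecast load over the given horizon.

    Assumption: each pod serves `per_pod_capacity` requests per step.
    """
    peak = max(forecast_load)
    needed = math.ceil(peak / per_pod_capacity)
    return max(min_replicas, min(max_replicas, needed))


def plan_step(state, step, forecast_load, per_pod_capacity, cdt_steps):
    """Return the replica count to apply at `step`, honoring the cool-down window.

    `forecast_load` holds predicted load for the next steps; only the first
    `cdt_steps` values are used, so one action covers the whole cool-down.
    """
    if step - state.last_scale_step < cdt_steps:
        return state.replicas  # still cooling down: take no action
    target = desired_replicas(forecast_load[:cdt_steps], per_pod_capacity)
    if target != state.replicas:
        state.replicas = target
        state.last_scale_step = step
    return state.replicas


if __name__ == "__main__":
    state = ScalerState(replicas=2)
    print(plan_step(state, 0, [150, 320, 280], 100, 3))  # sizes for the forecast peak
    print(plan_step(state, 1, [900, 900, 900], 100, 3))  # inside cool-down: unchanged
    print(plan_step(state, 3, [900, 900, 900], 100, 3))  # cool-down over: rescales
```

Sizing for the forecast peak over the whole cool-down window, rather than for the current load alone, is one simple way to avoid issuing an action that has to be reversed a step later.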

Data Availability

The datasets analyzed in scenario 1 and scenario 2 of the experiments in the current study are available at ftp://ftp.ita.ee.lbl.gov/html/contrib/NASA-HTTP.html and ftp://ftp.ita.ee.lbl.gov/html/contrib/WorldCup.html, respectively. The datasets analyzed in scenario 3 of the current study are available from the corresponding author on reasonable request.

Author information

Contributions

Javad Dogani: Methodology, Software, Simulation, Writing—original draft.

Farshad Khunjush: Conceptualization, Validation, Methodology, Writing—review & editing.

Mehdi Seydali: Conceptualization, Software, Writing—review & editing.

Corresponding author

Correspondence to Farshad Khunjush.

Ethics declarations

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Dogani, J., Khunjush, F. & Seydali, M. K-AGRUED: A Container Autoscaling Technique for Cloud-based Web Applications in Kubernetes Using Attention-based GRU Encoder-Decoder. J Grid Computing 20, 40 (2022). https://doi.org/10.1007/s10723-022-09634-x

  • DOI: https://doi.org/10.1007/s10723-022-09634-x
