Abstract
In industrial environments, it is critical to determine the capacity of a system and to plan a deployment layout that meets production traffic demands. System capacity is influenced by both the performance of the system’s constituent components and the physical environment setup. In a large system, the configuration parameters of individual components give developers and load test engineers the flexibility to tune system performance without changing the source code. However, because the search space is large, estimating the capacity of the system under different configuration values is a challenging and costly process. In this paper, we propose an approach, called MLASP, that uses machine learning models to predict the system’s key performance indicators (KPIs), such as throughput, from a set of features made up of configuration parameter values, including the server cluster setup, to help engineers with capacity planning for production environments. Under the same load, we evaluate MLASP on two large-scale, mission-critical enterprise systems developed by Ericsson and on one open-source system. We find that: 1) MLASP can predict the system throughput with very high accuracy, with a difference between the predicted and the actual throughput of less than 1%; and 2) using only a small subset of the training data (e.g., 3% of the entire data for the open-source system), MLASP can still predict the throughput accurately. We also document our experience of successfully integrating the approach into an industrial setting. In summary, this paper highlights the benefits and potential of using machine learning models to assist load test engineers in capacity planning.
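To make the "less than 1%" claim concrete, the following minimal sketch shows one common way to measure the relative difference between predicted and actual throughput: the mean absolute percentage error. The abstract does not state which error metric the paper uses, so the metric choice is an assumption; the function name and the throughput values are hypothetical and invented purely for illustration.

```python
from statistics import mean


def mean_abs_pct_error(actual, predicted):
    """Mean absolute percentage error between two equal-length sequences.

    Assumption: this metric is used only to illustrate a "difference below
    1%"; it is not confirmed to be the metric used in the paper.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if any(a == 0 for a in actual):
        raise ValueError("actual throughput values must be non-zero")
    return 100.0 * mean(abs(p - a) / abs(a) for a, p in zip(actual, predicted))


# Hypothetical throughput values (e.g., messages per second), not from the paper.
actual_tps = [1000.0, 2000.0, 4000.0]
predicted_tps = [1005.0, 1990.0, 4020.0]

error_pct = mean_abs_pct_error(actual_tps, predicted_tps)
print(f"{error_pct:.2f}%")
```

With these made-up numbers, every prediction is off by 0.5% of the actual value, so the average error is 0.5%, which would fall under a 1% threshold.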

Change history
16 August 2021
A Correction to this paper has been published: https://doi.org/10.1007/s10664-021-10011-7
Acknowledgements
We thank Ericsson for providing access to the enterprise systems that we used in our case study. The findings and opinions expressed in this paper are those of the authors and do not necessarily represent or reflect those of Ericsson and/or its subsidiaries and affiliates. Our results do not in any way reflect the quality of Ericsson’s products.
Additional information
Communicated by: Sven Apel
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
The original online version of this article was revised: Modifications have been made to the affiliation section and to Figure 6. Full information regarding the corrections made can be found in the erratum/correction for this article.
Cite this article
Vitui, A., Chen, TH. MLASP: Machine learning assisted capacity planning. Empir Software Eng 26, 87 (2021). https://doi.org/10.1007/s10664-021-09994-0
DOI: https://doi.org/10.1007/s10664-021-09994-0