Abstract
Evaluating the performance of a cloud platform and identifying the main factors that influence it are crucial. Moreover, analysis of the relevant performance indicators can be used to make theoretical predictions about the platform's performance status. This work studies the relationship between the performance indicators of a Spark-based cloud platform and the load performance of the cluster, and uses that relationship to predict the load. First, we propose analytic frameworks for Spark performance analysis, for the analysis of specific indicators, and for the cluster load prediction model. Second, for the evaluation indicators, we explain the basis for their selection and their concrete meaning. We then use the MLR (Multiple Linear Regression) method to compute the correlation formula between the performance parameters produced while the Spark cluster runs batch applications and the load performance of the cluster, and thereby determine the main factors affecting load performance. Finally, we predict the load value using the Spark indicator analysis and the load prediction model. The results indicate an accuracy of up to 92.307%. Consequently, the solution presented in this paper predicts the cluster load value efficiently.
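As a rough illustration of the MLR step described in the abstract, the sketch below fits a linear model of cluster load against a few performance indicators by ordinary least squares and evaluates it on held-out samples. Everything specific here is an assumption made for the example and is not taken from the paper: the three indicators (CPU utilization, memory utilization, disk I/O rate), the synthetic data, the coefficient values, and the use of mean relative error as the accuracy measure.

```python
import random


def fit_mlr(X, y):
    """Ordinary least squares via the normal equations (X^T X) b = X^T y.

    X is a list of rows, one row of indicator values per sample.
    An intercept column of ones is prepended, so the result is
    [b0, b1, ..., bk] with b0 the intercept.
    """
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda idx: abs(A[idx][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("indicators are collinear; cannot fit")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def predict(coef, x):
    """Predicted load for one sample x = [x1, ..., xk]."""
    return coef[0] + sum(c * v for c, v in zip(coef[1:], x))


def mean_relative_error(coef, X, y):
    """Average of |predicted - actual| / |actual| over the samples."""
    return sum(abs(predict(coef, x) - t) / abs(t) for x, t in zip(X, y)) / len(y)


# Synthetic data: hypothetical indicators (CPU %, memory %, disk I/O rate)
# and hypothetical "true" coefficients, used only to generate a toy load.
random.seed(0)
true_coef = [0.5, 0.02, 0.01, 0.03]
X = [[random.uniform(0, 100) for _ in range(3)] for _ in range(50)]
y = [predict(true_coef, x) + random.gauss(0, 0.05) for x in X]

# Fit on the first 40 samples, evaluate on the remaining 10.
coef = fit_mlr(X[:40], y[:40])
error = mean_relative_error(coef, X[40:], y[40:])
print("fitted coefficients:", [round(c, 4) for c in coef])
print("mean relative error on held-out samples:", round(error, 4))
```

The magnitude of each fitted coefficient, once the indicators are put on a comparable scale, gives a first indication of which indicators contribute most to the predicted load; a real analysis would also check significance and collinearity before drawing that conclusion.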
Acknowledgments
This work is sponsored by the National Natural Science Foundation of P. R. China (Nos. 61373017, 61572260, 61572261, 61672296, 61602261), the Natural Science Foundation of Jiangsu Province (Nos. BK20140886, BK20140888, BK20160089), the Scientific & Technological Support Project of Jiangsu Province (Nos. BE2015702, BE2016777, BE2016185), the China Postdoctoral Science Foundation (Nos. 2014M551636, 2014M561696), the Jiangsu Planned Projects for Postdoctoral Research Funds (Nos. 1302090B, 1401005B), the Jiangsu High Technology Research Key Laboratory for Wireless Sensor Networks Foundation (No. WSNLBZY201508), and the Research Innovation Program for College Graduates of Jiangsu Province (No. SJZZ16_0148).
Copyright information
© 2017 Springer Nature Singapore Pte Ltd
About this paper
Cite this paper
Dong, L., Li, P., Xu, H., Luo, B., Mi, Y. (2017). Performance Prediction of Spark Based on the Multiple Linear Regression Analysis. In: Chen, G., Shen, H., Chen, M. (eds) Parallel Architecture, Algorithm and Programming. PAAP 2017. Communications in Computer and Information Science, vol 729. Springer, Singapore. https://doi.org/10.1007/978-981-10-6442-5_7
DOI: https://doi.org/10.1007/978-981-10-6442-5_7
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-6441-8
Online ISBN: 978-981-10-6442-5
eBook Packages: Computer Science, Computer Science (R0)