Abstract
This work addresses performance testing for monitoring large volumes of large-dataset measurements in Infrastructure-as-a-Service (IaaS). Physical resources are not virtualized in shared dynamic clouds; thus, the workloads that share them compete for access to system resources. This competition introduces significant new challenges when assessing the performance of IaaS. A bottleneck may occur if one system resource is critical to IaaS; this may shut down the system and its services, which would sharply reduce workflow performance. To protect against bottlenecks, we propose CloudPT, a performance test management framework for IaaS. CloudPT offers three advantages: (I) high-efficiency detection; (II) a unified end-to-end feedback loop that collaborates with cloud-ecosystem management; and (III) a troubleshooting performance test. This paper shows that CloudPT efficiently identifies and detects bottlenecks with a low false-positive rate (<13%) and achieves high accuracy when diagnosing the failure of a host virtual machine (host VM) to start up, under both illustrative cloud batch workloads and transactional workloads, such as the Spark and Kafka frameworks for data partitioning and event collection on each server. In a trace-based case study, CloudPT diagnosed performance bottlenecks in 20 s with a precision of 86%, confirming its real-time efficiency.
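For readers unfamiliar with the two detection metrics quoted above, the following minimal sketch shows how a false-positive rate and a precision are conventionally computed from confusion-matrix counts. The function names and the counts are hypothetical and chosen only for illustration; they are not taken from the paper's evaluation, and the paper may define its metrics differently.

```python
def precision(tp: int, fp: int) -> float:
    """Fraction of flagged bottlenecks that were real: TP / (TP + FP)."""
    flagged = tp + fp
    return tp / flagged if flagged else 0.0


def false_positive_rate(fp: int, tn: int) -> float:
    """Fraction of healthy intervals wrongly flagged: FP / (FP + TN)."""
    negatives = fp + tn
    return fp / negatives if negatives else 0.0


# Hypothetical counts for illustration only (not results from the paper).
tp, fp, tn = 40, 10, 90

print(f"precision           = {precision(tp, fp):.2f}")
print(f"false-positive rate = {false_positive_rate(fp, tn):.2f}")
```

Note that the two metrics share the false-positive count but use different denominators, so a detector can have a low false-positive rate and still a modest precision when true bottlenecks are rare.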
Acknowledgments
We are also thankful to the anonymous reviewers for their valuable feedback and comments, which helped improve the quality of the manuscript.
Appendix A
A.1 Proposed Algorithms
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Alkasem, A., Liu, H., Zuo, D. (2018). CloudPT: Performance Testing for Identifying and Detecting Bottlenecks in IaaS. In: Vaidya, J., Li, J. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2018. Lecture Notes in Computer Science, vol 11336. Springer, Cham. https://doi.org/10.1007/978-3-030-05057-3_33
DOI: https://doi.org/10.1007/978-3-030-05057-3_33
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-05056-6
Online ISBN: 978-3-030-05057-3
eBook Packages: Computer Science, Computer Science (R0)