CloudPT: Performance Testing for Identifying and Detecting Bottlenecks in IaaS

  • Conference paper
Algorithms and Architectures for Parallel Processing (ICA3PP 2018)

Abstract

This work addresses performance testing for monitoring large volumes of measurement data in Infrastructure-as-a-Service (IaaS). Physical resources are not virtualized in shared dynamic clouds; thus, the workloads that share them compete for access to system resources. This competition introduces significant new challenges when assessing the performance of IaaS. A bottleneck may occur when a single system resource is critical to the IaaS; it can shut down the system and its services and substantially reduce workflow performance. To protect against bottlenecks, we propose CloudPT, a performance test management framework for IaaS. CloudPT offers three advantages: (I) high-efficiency detection; (II) a unified end-to-end feedback loop that collaborates with cloud-ecosystem management; and (III) a troubleshooting performance test. This paper shows that CloudPT efficiently identifies and detects bottlenecks with a low false-positive rate (<13%) and high accuracy in a scenario where a host virtual machine (host VM) fails to start up, under both illustrative cloud batch workloads and transactional workloads, using frameworks such as Spark and Kafka for data partitioning and for collecting events on each server. In a trace-based case study, CloudPT diagnosed performance bottlenecks in 20 s with a precision of 86%, confirming its real-time efficiency.
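The two quality figures quoted above, false-positive rate and precision, are standard confusion-matrix ratios. The following minimal sketch shows how such figures are typically computed for a bottleneck detector. It is not CloudPT's implementation: the `ConfusionCounts` type, the function names, and the counts are all hypothetical and invented for illustration, and they are not taken from the paper's experiments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Outcome counts for a detector (hypothetical helper type)."""
    true_positives: int   # real bottlenecks that were flagged
    false_positives: int  # healthy intervals wrongly flagged
    true_negatives: int   # healthy intervals correctly left unflagged
    false_negatives: int  # real bottlenecks that were missed


def precision(c: ConfusionCounts) -> float:
    """Fraction of flagged intervals that were real bottlenecks."""
    flagged = c.true_positives + c.false_positives
    return c.true_positives / flagged if flagged else 0.0


def false_positive_rate(c: ConfusionCounts) -> float:
    """Fraction of healthy intervals that were wrongly flagged."""
    healthy = c.false_positives + c.true_negatives
    return c.false_positives / healthy if healthy else 0.0


# Illustrative counts only; NOT results reported in the paper.
counts = ConfusionCounts(
    true_positives=8, false_positives=2, true_negatives=18, false_negatives=2
)

print(f"precision = {precision(counts):.2f}")
print(f"false-positive rate = {false_positive_rate(counts):.2f}")
```

Note that the two ratios use different denominators: precision is taken over everything the detector flagged, whereas the false-positive rate is taken over everything that was actually healthy, so a detector can score well on one and poorly on the other.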

Acknowledgments

We are thankful to the anonymous reviewers for their valuable feedback and comments, which helped improve the quality of the manuscript.

Author information

Corresponding author

Correspondence to Ameen Alkasem.

Appendix A

A.1 Proposed Algorithms

Fig. 16. Algorithm for training, filtering and streaming dataset based on Hadoop and Spark

Fig. 17. Algorithm for combining the testing and training datasets classification and evaluation results

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Alkasem, A., Liu, H., Zuo, D. (2018). CloudPT: Performance Testing for Identifying and Detecting Bottlenecks in IaaS. In: Vaidya, J., Li, J. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2018. Lecture Notes in Computer Science, vol 11336. Springer, Cham. https://doi.org/10.1007/978-3-030-05057-3_33

  • DOI: https://doi.org/10.1007/978-3-030-05057-3_33

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-05056-6

  • Online ISBN: 978-3-030-05057-3

  • eBook Packages: Computer Science (R0)
