Task Failure Prediction using Combine Bagging Ensemble (CBE) Classification in Cloud Workflow

Wireless Personal Communications

Abstract

Scientific applications adopt cloud environments to execute their workflows as tasks. When a task fails, the dependencies among workflow tasks degrade the overall performance of the execution. An efficient failure prediction mechanism is therefore needed to execute workflows efficiently. This paper proposes a failure prediction method implemented using various machine learning classifiers. Among the classifiers evaluated, Naïve Bayes predicts failures with the highest accuracy, 94.4%. To further improve prediction accuracy, a novel ensemble method called combine bagging ensemble (CBE) is introduced, which achieves an overall accuracy of 95.8%. The proposed method is validated by comparing results from simulation and from a real-time cloud testbed.
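
The abstract does not describe how CBE combines its classifiers, so the following is only a generic illustration of the underlying idea of bagging, not the authors' method. It is a minimal sketch that trains several Gaussian Naïve Bayes models on bootstrap resamples and combines them by majority vote. The two features (CPU load and runtime), the labels, and the data values are all hypothetical and chosen only to make the example runnable.

```python
import math
import random
from collections import Counter


def fit_gaussian_nb(rows, labels):
    """Fit Gaussian Naive Bayes: per-class prior, feature means, variances."""
    model = {}
    for cls in set(labels):
        members = [r for r, y in zip(rows, labels) if y == cls]
        n = len(members)
        columns = list(zip(*members))
        means = [sum(col) / n for col in columns]
        # A small constant keeps variances positive for constant features.
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-6
            for col, m in zip(columns, means)
        ]
        model[cls] = (n / len(rows), means, variances)
    return model


def predict_nb(model, row):
    """Return the class with the highest log-posterior for one feature vector."""
    def log_posterior(cls):
        prior, means, variances = model[cls]
        score = math.log(prior)
        for x, m, v in zip(row, means, variances):
            score += -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        return score
    return max(model, key=log_posterior)


def fit_bagging(rows, labels, n_models=15, seed=0):
    """Train n_models base classifiers, each on a bootstrap resample."""
    rng = random.Random(seed)
    n = len(rows)
    models = []
    for _ in range(n_models):
        # Sample n indices with replacement (the bootstrap step).
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(
            fit_gaussian_nb([rows[i] for i in idx], [labels[i] for i in idx])
        )
    return models


def predict_bagging(models, row):
    """Combine the base predictions by majority vote."""
    votes = Counter(predict_nb(m, row) for m in models)
    return votes.most_common(1)[0][0]


# Hypothetical task records: (cpu_load, runtime_seconds) with an outcome label.
rows = [
    (0.20, 10.0), (0.10, 12.0), (0.30, 8.0), (0.25, 11.0), (0.15, 9.0), (0.22, 13.0),
    (0.90, 50.0), (0.95, 55.0), (0.80, 45.0), (0.85, 52.0), (0.92, 48.0), (0.88, 60.0),
]
labels = ["ok"] * 6 + ["fail"] * 6

models = fit_bagging(rows, labels)
print(predict_bagging(models, (0.90, 52.0)))
print(predict_bagging(models, (0.20, 10.0)))
```

Majority voting is used here because it is the simplest way to aggregate class labels; an ensemble could instead average class probabilities or weight the base models, and the abstract does not say which approach CBE takes.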

Author information

Corresponding author

Correspondence to P. Padmakumari.

Cite this article

Padmakumari, P., Umamakeswari, A. Task Failure Prediction using Combine Bagging Ensemble (CBE) Classification in Cloud Workflow. Wireless Pers Commun 107, 23–40 (2019). https://doi.org/10.1007/s11277-019-06238-9

  • DOI: https://doi.org/10.1007/s11277-019-06238-9
