Performability Evaluation and Optimization of Workflow Applications in Cloud Environments

Journal of Grid Computing

Abstract

Because they offer dynamic provisioning and the illusion of unlimited resources, clouds are becoming a popular alternative for running scientific workflows. In a cloud system that processes workflow applications, performance is heavily influenced by two factors: the scheduling strategy and component failures. Failures in a cloud system can affect several users simultaneously and reduce the number of available computing resources. A poor scheduling strategy can increase the expected makespan and the idle time of physical machines. In this paper, we propose an optimization method for scheduling scientific workflows on cloud systems. The method couples a meta-heuristic algorithm with a performability model that provides the fitness values of the explored solutions. To represent the combined effect of scheduling and component failures, we adopted discrete event simulation for the performability model. Experimental results show the effectiveness of the hybrid simulation-optimization approach for optimizing the number of allocated virtual machines and the scheduling of tasks with respect to performability.
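The coupling described in the abstract, a search procedure that asks a simulation for the fitness of each candidate schedule, can be illustrated with a minimal sketch. This is not the authors' algorithm or model: the toy workflow, the exponential failure and repair times, the `MTTF`, `MTTR`, and `VM_COST` values, the per-VM penalty in the fitness, and the use of plain hill climbing in place of a full meta-heuristic are all assumptions made for illustration, and the simulation is a simplified Monte Carlo stand-in for a discrete event performability model.

```python
import random

# Hypothetical toy workflow: task -> (duration, predecessor tasks),
# listed in topological order. Not taken from the paper.
WORKFLOW = {
    "a": (4.0, []),
    "b": (3.0, ["a"]),
    "c": (5.0, ["a"]),
    "d": (2.0, ["b", "c"]),
}
MTTF = 50.0     # assumed mean time to failure of a busy VM
MTTR = 2.0      # assumed mean time to repair a failed VM
VM_COST = 1.0   # assumed penalty added to the fitness per allocated VM


def simulate_makespan(assignment, n_vms, rng):
    """One replication: a task restarts from scratch when its VM fails."""
    vm_free = [0.0] * n_vms
    finish = {}
    for task, (duration, preds) in WORKFLOW.items():
        vm = assignment[task]
        # A task starts once its VM is free and all predecessors are done.
        clock = max([vm_free[vm]] + [finish[p] for p in preds])
        while True:
            time_to_failure = rng.expovariate(1.0 / MTTF)
            if time_to_failure >= duration:
                clock += duration  # task completed without interruption
                break
            # Failure: lose the partial work, wait for repair, then retry.
            clock += time_to_failure + rng.expovariate(1.0 / MTTR)
        finish[task] = vm_free[vm] = clock
    return max(finish.values())


def fitness(assignment, n_vms, replications=200, seed=0):
    """Mean simulated makespan plus the VM penalty (lower is better)."""
    rng = random.Random(seed)  # same seed for every candidate
    total = sum(simulate_makespan(assignment, n_vms, rng)
                for _ in range(replications))
    return total / replications + VM_COST * n_vms


def optimize(max_vms=3, iterations=300, seed=1):
    """Hill climbing over (task-to-VM assignment, number of VMs)."""
    rng = random.Random(seed)
    tasks = list(WORKFLOW)
    n_vms = 1
    best = {t: 0 for t in tasks}  # start with everything on one VM
    best_fit = fitness(best, n_vms)
    for _ in range(iterations):
        cand_vms = rng.randint(1, max_vms)
        # Clamp the current assignment to the candidate VM count,
        # then move one randomly chosen task to a random VM.
        cand = {t: min(best[t], cand_vms - 1) for t in tasks}
        cand[rng.choice(tasks)] = rng.randrange(cand_vms)
        cand_fit = fitness(cand, cand_vms)
        if cand_fit < best_fit:
            best, n_vms, best_fit = cand, cand_vms, cand_fit
    return best, n_vms, best_fit


if __name__ == "__main__":
    schedule, vms, score = optimize()
    print(schedule, vms, round(score, 2))
```

Seeding every fitness evaluation identically means all candidates face the same random failure stream, which reduces noise when comparing two schedules. A real study would likely replace the hill climber with a population-based meta-heuristic and the fixed replication count with a confidence-interval stopping rule.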



Author information

Corresponding author

Correspondence to Danilo Oliveira.

About this article

Cite this article

Oliveira, D., Brinkmann, A., Rosa, N. et al. Performability Evaluation and Optimization of Workflow Applications in Cloud Environments. J Grid Computing 17, 749–770 (2019). https://doi.org/10.1007/s10723-019-09476-0

  • DOI: https://doi.org/10.1007/s10723-019-09476-0
