Abstract
As the number of constituent jobs and the size of input data grow, executing modern complex workflow-based applications in the cloud requires a large number of virtual machines (VMs), which makes cost a major concern. Under the constraints of VM processing and storage capabilities and of communication bandwidth between VMs, quickly finding a cost-optimal resource provisioning and scheduling solution for a given cloud workflow is becoming a challenge. The problem becomes even harder when transient infrastructure-related failures are taken into account. To address this problem, this paper proposes a soft-error-aware VM selection and task scheduling approach that achieves near-optimal cost. Under the reliability and completion time constraints specified by tenants, our approach selects a set of VMs with specific CPU and memory configurations and generates a cost-optimal schedule by allocating tasks to appropriate VMs. Comprehensive experimental results on well-known scientific workflow benchmarks show that, compared with state-of-the-art methods, our approach can achieve up to 66% cost reduction while satisfying both reliability and completion time constraints.
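To make the problem statement concrete, the following is a minimal sketch of the kind of feasibility check and cost comparison the abstract describes. It is not the approach proposed in the paper: it uses brute-force enumeration on a tiny linear chain of tasks, ignores communication bandwidth, and assumes that transient faults follow a Poisson process, so that a task running for time t on a VM with fault rate λ succeeds with probability exp(-λt). All names, units, prices, and fault rates (`VMType`, `Task`, `evaluate`, `cheapest_feasible`, and the sample values) are hypothetical.

```python
import itertools
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class VMType:
    name: str
    speed: float           # work units processed per hour (hypothetical unit)
    memory_gb: float
    price_per_hour: float
    fault_rate: float      # assumed transient faults per hour (Poisson model)


@dataclass(frozen=True)
class Task:
    name: str
    work: float            # amount of work in the same hypothetical units
    memory_gb: float       # memory the task needs


def evaluate(chain, vm_choice):
    """Return (cost, makespan, reliability), or None if a task does not fit in memory."""
    cost, makespan, reliability = 0.0, 0.0, 1.0
    for task, vm in zip(chain, vm_choice):
        if task.memory_gb > vm.memory_gb:
            return None
        runtime = task.work / vm.speed
        cost += runtime * vm.price_per_hour
        makespan += runtime  # tasks in a chain run one after another
        reliability *= math.exp(-vm.fault_rate * runtime)
    return cost, makespan, reliability


def cheapest_feasible(chain, vm_types, deadline, min_reliability):
    """Enumerate every VM assignment and keep the cheapest one meeting both constraints."""
    best = None
    for vm_choice in itertools.product(vm_types, repeat=len(chain)):
        result = evaluate(chain, vm_choice)
        if result is None:
            continue
        cost, makespan, reliability = result
        if makespan > deadline or reliability < min_reliability:
            continue
        if best is None or cost < best[0]:
            best = (cost, [vm.name for vm in vm_choice], makespan, reliability)
    return best


# Illustrative data only; none of these values come from the paper.
VM_TYPES = [
    VMType("small", speed=1.0, memory_gb=2.0, price_per_hour=1.0, fault_rate=0.01),
    VMType("large", speed=2.0, memory_gb=8.0, price_per_hour=3.0, fault_rate=0.01),
]
CHAIN = [
    Task("a", work=2.0, memory_gb=1.0),
    Task("b", work=2.0, memory_gb=4.0),
]

if __name__ == "__main__":
    print(cheapest_feasible(CHAIN, VM_TYPES, deadline=3.0, min_reliability=0.95))
```

In this toy instance, task `b` only fits on the larger VM type, and tightening the deadline forces task `a` onto the faster, more expensive VM as well. Exhaustive enumeration grows exponentially with the number of tasks, which is why realistic workflows need a heuristic or metaheuristic search instead.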
Supported by grants from the National Key Research and Development Program of China (No. 2018YFB2101300), the Natural Science Foundation of China (No. 61872147), and the National Science Foundation (No. CCF-1900904, No. CCF-1619243, No. CCF-1537085 (CAREER)).
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Cao, E. et al. (2020). Reliability Aware Cost Optimization for Memory Constrained Cloud Workflows. In: Wen, S., Zomaya, A., Yang, L.T. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2019. Lecture Notes in Computer Science, vol 11945. Springer, Cham. https://doi.org/10.1007/978-3-030-38961-1_13
DOI: https://doi.org/10.1007/978-3-030-38961-1_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-38960-4
Online ISBN: 978-3-030-38961-1
eBook Packages: Computer Science (R0)