Abstract
In this paper we study the problem of automated cloud application deployment and configuration. Transient failures are common in current cloud infrastructures and are attributed to the complexity of the software and hardware stacks involved. These failures disrupt cloud application deployment, forcing users to manually check and intervene in the deployment process. To address this challenge, we propose a simple yet powerful deployment methodology with error recovery features, which works by identifying script dependencies and re-executing the appropriate configuration scripts. To guarantee idempotent script execution, we adopt a filesystem snapshot mechanism that lets our approach revert to a healthy filesystem state when a script execution fails. Our experimental analysis indicates that our approach can resolve any transient deployment failure appearing during the deployment phase, even in highly unpredictable cloud environments.
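The recovery idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the in-memory dictionary standing in for the filesystem, the `deepcopy` standing in for a filesystem snapshot, and the names `deploy`, `TransientError`, `make_flaky` and `max_retries` are all assumptions made for illustration. The sketch also assumes that only the failed script needs to be re-executed; the abstract does not specify which scripts are re-run.

```python
import copy
from graphlib import TopologicalSorter  # standard library, Python 3.9+


class TransientError(Exception):
    """Hypothetical stand-in for a transient failure (e.g. a timeout)."""


def deploy(scripts, deps, state, max_retries=3):
    """Run scripts in dependency order; roll back and retry on failure.

    scripts: name -> callable(state) that mutates `state` in place
    deps:    name -> set of script names that must complete first
    state:   dict standing in for the filesystem (an assumption)
    Returns the number of executions attempted per script.
    """
    attempts = {}
    graph = {name: deps.get(name, set()) for name in scripts}
    for name in TopologicalSorter(graph).static_order():
        for attempt in range(1, max_retries + 1):
            attempts[name] = attempt
            snapshot = copy.deepcopy(state)  # take a "snapshot" first
            try:
                scripts[name](state)
                break  # script succeeded, move on to the next one
            except TransientError:
                # Revert to the last healthy state so that the retry
                # starts from the same point as the first attempt.
                state.clear()
                state.update(snapshot)
        else:
            raise RuntimeError(
                f"{name} still failing after {max_retries} attempts")
    return attempts


def make_flaky(failures):
    """Build a script that leaves partial output and fails `failures` times."""
    remaining = [failures]

    def configure_db(state):
        state["db.conf"] = "partial"  # side effect before the failure
        if remaining[0] > 0:
            remaining[0] -= 1
            raise TransientError("simulated timeout")
        state["db.conf"] = "ready"

    return configure_db


def install_packages(state):
    state["packages"] = ["db-server"]


def start_app(state):
    state["app"] = "running"


scripts = {
    "install": install_packages,
    "configure_db": make_flaky(2),
    "start_app": start_app,
}
deps = {"configure_db": {"install"}, "start_app": {"configure_db"}}
fs = {}
attempts = deploy(scripts, deps, fs)
```

In this toy run, `configure_db` fails twice and leaves a partial `db.conf` behind each time; restoring the snapshot before each retry is what makes re-execution safe even though the script itself is not idempotent.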
Notes
1. Note that message transmission might not be instant (as the figure implies), since a message may be consumed much later than it is posted; the arrows are drawn perpendicular to the time axis for simplicity.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Giannakopoulos, I., Konstantinou, I., Tsoumakos, D., Koziris, N. (2017). Recovering from Cloud Application Deployment Failures Through Re-execution. In: Sellis, T., Oikonomou, K. (eds) Algorithmic Aspects of Cloud Computing. ALGOCLOUD 2016. Lecture Notes in Computer Science, vol 10230. Springer, Cham. https://doi.org/10.1007/978-3-319-57045-7_7
DOI: https://doi.org/10.1007/978-3-319-57045-7_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-57044-0
Online ISBN: 978-3-319-57045-7
eBook Packages: Computer Science, Computer Science (R0)