Abstract
Cloud-native applications consist of highly specialized and decoupled services that can be deployed, scaled and managed independently. Keeping such applications available is a complex task for operators, because software defects and other kinds of faults can be challenging to diagnose and repair quickly enough to resume operations. Autonomic service operation is therefore a promising approach. However, guaranteeing safe autonomic actuation involves risks that must be managed. This paper discusses the challenges identified during the development of a platform for autonomic service operation and describes the software architecture of the platform. Results show mean times to detect, diagnose and repair failures on the order of tens of seconds.
Acknowledgements
This work has been funded through the FCT - Foundation for Science and Technology, I.P., within the scope of project CISUC - UID/CEC/00326/2020, by the European Social Fund, through the Regional Operational Program Centro 2020, and by the AESOP project (P2020-31/SI/2017, No. 040004).
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Tomás, J. et al. (2021). Autonomic Service Operation for Cloud Applications: Safe Actuation and Risk Management. In: Adler, R., et al. Dependable Computing - EDCC 2021 Workshops. EDCC 2021. Communications in Computer and Information Science, vol 1462. Springer, Cham. https://doi.org/10.1007/978-3-030-86507-8_4
DOI: https://doi.org/10.1007/978-3-030-86507-8_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86506-1
Online ISBN: 978-3-030-86507-8
eBook Packages: Computer Science, Computer Science (R0)