ABSTRACT
To achieve dependability, system designers often resort to fault-tolerance mechanisms. Evaluating these mechanisms requires observing failures, which are typically rare events. To make failures occur more often, practitioners use fault injection techniques, which allow them to assess the system's dependability properties. While many fault injection tools exist to this end, they are usually limited in scope, in applicability, and in their configurability for microservice applications.
We propose a general-purpose and extensible tool named "Defektor" that controls a fault injection campaign on different types of applications. It is especially suited to microservice-based applications and is compatible with different container orchestration technologies and different fault injection tools. Defektor's configuration follows a high-level approach based on an injection campaign plan, which specifies the instructions for Defektor's operation and the parameters of the fault injection campaign. Defektor automates the entire workflow: defining the campaign plan, generating a workload, specifying and injecting the faults, and collecting data. This automation aids the experiment's repeatability, improves the consistency of results, and saves a considerable amount of time.
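As a rough illustration of the plan-driven workflow described in this abstract, the following Python sketch models a campaign plan and the loop that executes it. It is not Defektor's actual plan format or API: the class names, field names, fault labels, service names, and the three callables passed to `run_campaign` are all hypothetical and chosen only to show how a single high-level plan could drive workload generation, fault injection, and data collection in a repeatable way.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Injection:
    """One fault to inject (all field names are hypothetical)."""
    target: str       # name of the service under test
    fault_type: str   # illustrative label such as "pod-delete"
    duration_s: int   # how long the fault stays active


@dataclass
class CampaignPlan:
    """Hypothetical high-level plan: workload, faults, and repetitions."""
    name: str
    workload: str
    injections: List[Injection]
    runs: int = 1


@dataclass
class RunRecord:
    """What was done for one injection in one run."""
    run: int
    target: str
    fault_type: str
    events: List[str] = field(default_factory=list)


def run_campaign(
    plan: CampaignPlan,
    start_workload: Callable[[str], str],
    inject: Callable[[Injection], str],
    collect: Callable[[str], str],
) -> List[RunRecord]:
    """Execute every injection of the plan, `plan.runs` times, in a fixed order."""
    records: List[RunRecord] = []
    for run in range(1, plan.runs + 1):
        for inj in plan.injections:
            rec = RunRecord(run, inj.target, inj.fault_type)
            rec.events.append(start_workload(plan.workload))  # generate load
            rec.events.append(inject(inj))                    # inject the fault
            rec.events.append(collect(inj.target))            # gather data
            records.append(rec)
    return records


# Stub back ends stand in for a real workload generator, injector, and collector.
plan = CampaignPlan(
    name="demo-campaign",
    workload="constant-load",
    injections=[
        Injection("cart", "pod-delete", 30),
        Injection("payment", "network-delay", 60),
    ],
    runs=2,
)
records = run_campaign(
    plan,
    start_workload=lambda w: f"workload:{w}",
    inject=lambda i: f"inject:{i.fault_type}@{i.target}",
    collect=lambda t: f"collect:{t}",
)
```

Passing the workload generator, injector, and collector as callables mirrors the idea of a tool that stays independent of any one orchestration technology or fault injection tool: swapping a back end changes the callables, not the plan.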
Index Terms
- Defektor: An Extensible Tool for Fault Injection Campaign Management in Microservice Systems