Abstract
Context
Despite being beneficial for managing computing infrastructure at scale, Ansible scripts can include security weaknesses, such as hard-coded passwords. These weaknesses can propagate into tasks, i.e., the code constructs used for managing computing infrastructure with Ansible, making the provisioned infrastructure susceptible to security attacks. A systematic characterization of task infection, i.e., the propagation of security weaknesses into tasks, can help practitioners and researchers understand how such propagation occurs and derive insights for developing Ansible scripts securely.
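As a minimal illustration of the weakness described above, the sketch below flags literal secret values in an Ansible task with a simple lexical check. This is a hypothetical heuristic written for this summary, not the detection tooling used in the study; the key list, function name, and the `mysql_user` example task are all assumptions.

```python
import re

# Keys whose literal values commonly indicate a hard-coded secret.
# Illustrative heuristic only, not the detector used in the paper.
SECRET_KEY_PATTERN = re.compile(
    r"^\s*-?\s*(password|passwd|secret|token)\s*:\s*(?P<value>\S.*)$",
    re.IGNORECASE,
)

def find_hardcoded_secrets(script_text: str) -> list:
    """Return (line_number, line) pairs whose value looks like a literal secret."""
    hits = []
    for lineno, line in enumerate(script_text.splitlines(), start=1):
        match = SECRET_KEY_PATTERN.match(line)
        if not match:
            continue
        value = match.group("value").strip()
        # Skip values resolved at runtime (Jinja2 templates, e.g. vault lookups).
        if "{{" in value:
            continue
        hits.append((lineno, line.strip()))
    return hits

# A hypothetical Ansible task containing a hard-coded password.
task = """\
- name: Create database user
  mysql_user:
    name: app
    password: SuperSecret123
    state: present
"""

print(find_hardcoded_secrets(task))  # -> [(4, 'password: SuperSecret123')]
```

A value such as `password: "{{ vault_db_password }}"` would be skipped, since it is resolved at runtime rather than stored in the script.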
Objective
The goal of this paper is to help practitioners and researchers understand how security weaknesses impact Ansible-managed computing infrastructure through an empirical study of task infections in Ansible scripts.
Method
We conduct an empirical study in which we quantify the frequency of task infections in Ansible scripts. Upon detection of task infections, we apply qualitative analysis to determine task infection categories. We also conduct a survey with 23 practitioners to determine the prevalence and severity of the identified task infection categories. With logistic regression analysis, we identify development factors that correlate with the presence of task infections.
Results
In all, we identify 1,805 task infections in 27,213 scripts. We identify six task infection categories: anti-virus, continuous integration, data storage, message broker, networking, and virtualization. From our survey, we observe that tasks used to manage data storage infrastructure are perceived to have the most severe consequences. We also find three development factors, namely age, minor contributors, and scatteredness, to correlate with the presence of task infections.
Conclusion
Our empirical study shows that computing infrastructure managed with Ansible scripts is impacted by security weaknesses. We conclude the paper by discussing the implications of our findings for practitioners and researchers.
Data Availability Statement
The dataset and source code used in our paper are publicly available online (Rahman 2023).
Notes
https://pyyaml.org/
https://www.vaultproject.io/
References
Agrawal A, Rahman A, Krishna R, Sobran A, Menzies T (2018) We don’t need another hero?: The impact of “heroes” on software development. In: Proceedings of the 40th international conference on software engineering: software engineering in practice, ACM, New York, NY, USA, ICSE-SEIP ’18, pp 245–253. https://doi.org/10.1145/3183519.3183549
Aho AV, Sethi R, Ullman JD (1986) Compilers: principles, techniques, and tools. Addison-Wesley
Rahman A, Williams L (2019) Source code properties of defective infrastructure as code scripts. Inf Softw Technol. https://doi.org/10.1016/j.infsof.2019.04.013
Ansible (2020) Ansible Documentation. https://docs.ansible.com/, [Online; accessed 19-December-2020]
Ansible (2022) Ansible best practices. https://docs.ansible.com/ansible/2.8/, [Online; accessed 10-Sep-2022]
Banavar G, Chandra TD, Strom RE, Sturman DC (1999) A case for message oriented middleware. In: Proceedings of the 13th international symposium on distributed computing, Springer-Verlag, Berlin, Heidelberg, pp 1–18
Bird C, Nagappan N, Murphy B, Gall H, Devanbu P (2011) Don’t touch my code! examining the effects of ownership on software quality. In: Proceedings of the 19th ACM SIGSOFT Symposium and the 13th European conference on foundations of software engineering, association for computing machinery, New York, NY, USA, ESEC/FSE ’11, pp 4–14. https://doi.org/10.1145/2025113.2025119
Borovits N, Kumara I, Di Nucci D, Krishnan P, Palma SD, Palomba F, Tamburri DA, van den Heuvel WJ (2022) Findici: Using machine learning to detect linguistic inconsistencies between code and natural language descriptions in infrastructure-as-code. Empir Softw Eng 27(7):178
Carver JC (2010) Towards reporting guidelines for experimental replications: a proposal. In: 1st international workshop on replication in empirical software engineering, vol 1, pp 1–4
Cohen J (1960) A coefficient of agreement for nominal scales. Educ Psychol Meas 20(1):37–46. https://doi.org/10.1177/001316446002000104
Cohen P, West SG, Aiken LS (2014) Applied multiple regression/correlation analysis for the behavioral sciences. Psychology Press
Cozens B (2022) 10 habits of great ansible users. https://www.redhat.com/sysadmin/10-great-ansible-practices, [Online; accessed 10-Sep-2022]
Cramer D, Howitt DL (2004) The Sage dictionary of statistics: a practical resource for students in the social sciences. Sage
Da Silva FQ, Suassuna M, França ACC, Grubb AM, Gouveia TB, Monteiro CV, dos Santos IE (2014) Replication of empirical studies in software engineering research: a systematic mapping study. Empir Softw Eng 19:501–557
Dalla Palma S, Di Nucci D, Palomba F, Tamburri DA (2020) Toward a catalog of software quality metrics for infrastructure code. J Syst Softw 170:110726
Dalla Palma S, Di Nucci D, Palomba F, Tamburri DA (2022) Within-project defect prediction of infrastructure-as-code using product and process metrics. IEEE Trans Softw Eng 48(6):2086–2104. https://doi.org/10.1109/TSE.2021.3051492
Davis V (2019) Ansible role patterns and anti-patterns by Lee Garrett, its Debian maintainer. https://hub.packtpub.com/ansible-role-patterns-and-anti-patterns-by-lee-garrett-its-debian-maintainer/, [Online; accessed 11-Sep-2022]
Droms R (1999) Automated configuration of tcp/ip with dhcp. IEEE Internet Comput 3(4):45–53. https://doi.org/10.1109/4236.780960
Duvall P, Matyas SM, Glover A (2007) Continuous integration: improving software quality and reducing risk (The Addison-Wesley Signature Series). Addison-Wesley Professional
Easterbrook S, Singer J, Storey MA, Damian D (2008) Selecting Empirical Methods for Software Engineering Research. Springer, London, London, pp 285–311
Gelman A, Hill J (2006) Data analysis using regression and multilevel/hierarchical models. Cambridge University Press
Greenwood PE, Nikulin MS (1996) A guide to chi-squared testing, vol 280. Wiley
Hassan AE (2009) Predicting faults using the complexity of code changes. In: Proceedings of the 31st international conference on software engineering, IEEE Computer Society, Washington, DC, USA, ICSE ’09, pp 78–88. https://doi.org/10.1109/ICSE.2009.5070510
Hortlund A (2021) Security smells in open-source infrastructure as code scripts: A replication study
Hosmer DW Jr, Lemeshow S, Sturdivant RX (2013) Applied logistic regression, vol 398. Wiley
Hu H, Bu Y, Wong K, Sood G, Smiley K, Rahman A (2023) Characterizing static analysis alerts for terraform manifests: An experience report. In: 2023 IEEE secure development conference (SecDev), IEEE computer society, Los Alamitos, CA, USA, pp 7–13. https://doi.org/10.1109/SecDev56634.2023.00014. https://doi.ieeecomputersociety.org/10.1109/SecDev56634.2023.00014
Humble J, Farley D (2010) Continuous delivery: reliable software releases through build, test, and deployment automation, 1st edn. Addison-Wesley Professional
Jenkins (2022) Jenkins. https://www.jenkins.io/, [Online; accessed 23-Jan-2022]
Kitchenham BA, Pfleeger SL (2008) Personal Opinion Surveys, Springer London, London, pp 63–92. https://doi.org/10.1007/978-1-84800-044-5_3
Kokuryo S, Kondo M, Mizuno O (2020) An empirical study of utilization of imperative modules in ansible. In: 2020 IEEE 20th international conference on software quality, reliability and security (QRS), pp 442–449. https://doi.org/10.1109/QRS51102.2020.00063
Krein JL, Knutson CD (2010) A case for replication : synthesizing research methodologies in software engineering
Krishna R, Agrawal A, Rahman A, Sobran A, Menzies T (2018) What is the connection between issues, bugs, and enhancements?: Lessons learned from 800+ software projects. In: Proceedings of the 40th international conference on software engineering: software engineering in practice, ACM, New York, NY, USA, ICSE-SEIP ’18, pp 306–315. https://doi.org/10.1145/3183519.3183548
Labs P (2021) Puppet Documentation. https://docs.puppet.com/, [Online; accessed 01-July-2021]
Landis JR, Koch GG (1977) The measurement of observer agreement for categorical data. Biometrics 33(1):159–174. http://www.jstor.org/stable/2529310
Lombardi MM, Oblinger DG (2007) Authentic learning for the 21st century: an overview. Educause learning initiative 1(2007):1–12
Long JS, Freese J (2006) Regression models for categorical dependent variables using Stata, vol 7. Stata Press
Mann HB, Whitney DR (1947) On a test of whether one of two random variables is stochastically larger than the other. Ann Math Stat 18(1):50–60. http://www.jstor.org/stable/2236101
Meli M, McNiece MR, Reaves B (2019) How bad can it git? characterizing secret leakage in public github repositories. In: NDSS
Menard S (2002) Applied logistic regression analysis, vol 106. Sage
Miller M (2019) Hardcoded and Embedded Credentials are an IT Security Hazard–Here’s What You Need to Know. https://www.beyondtrust.com/blog/entry/hardcoded-and-embedded-credentials-are-an-it-security-hazard-heres-what-you-need-to-know, [Online; accessed 17-Jan-2022]
Hasan MM, Rahman A (2022) As code testing: Characterizing test quality in open source ansible development. In: 2022 15th IEEE conference on software testing, verification and validation (ICST), IEEE Computer Society, Los Alamitos, CA, USA. https://akondrahman.github.io/publication/icst2022
Nagappan N, Ball T (2005) Use of relative code churn measures to predict system defect density. In: Proceedings. 27th International Conference on Software Engineering, 2005. ICSE 2005., pp 284–292. https://doi.org/10.1109/ICSE.2005.1553571
Opdebeeck R, Zerouali A, De Roover C (2022) Smelly variables in ansible infrastructure code: Detection, prevalence, and lifetime. In: 2022 IEEE/ACM 18th international conference on mining software repositories (MSR), IEEE
Opdebeeck R, Zerouali A, De Roover C (2023) Control and data flow in security smell detection for infrastructure as code: Is it worth the effort? In: 2023 IEEE/ACM 19th international conference on mining software repositories (MSR)
redhat-performance (2022) redhat-performance/satperf. https://github.com/redhat-performance/satperf, [Online; accessed 02-July-2022]
Radjenović D, Heričko M, Torkar R, Živkovič A (2013) Software fault prediction metrics: a systematic literature review. Inf Softw Technol 55(8):1397–1418
Rahman A (2023) Verifiability package for paper. https://figshare.com/s/c9d7b8aa973f53f02234, [Online; accessed 25-August-2023]
Rahman A, Parnin C (2023) Detecting and characterizing propagation of security weaknesses in puppet-based infrastructure management. IEEE Trans Softw Eng 49(06):3536–3553. https://doi.org/10.1109/TSE.2023.3265962
Rahman A, Williams L (2021) Different kind of smells: Security smells in infrastructure as code scripts. IEEE Security Privacy 19(3):33–41. https://doi.org/10.1109/MSEC.2021.3065190
Rahman A, Agrawal A, Krishna R, Sobran A (2018a) Characterizing the influence of continuous integration: Empirical results from 250+ open source and proprietary projects. In: Proceedings of the 4th ACM SIGSOFT International Workshop on Software Analytics, ACM, New York, NY, USA, SWAN 2018, pp 8–14. https://doi.org/10.1145/3278142.3278149
Rahman A, Mahdavi-Hezaveh R, Williams L (2018b) A systematic mapping study of infrastructure as code research. Inf Softw Technol. https://doi.org/10.1016/j.infsof.2018.12.004. https://www.sciencedirect.com/science/article/pii/S0950584918302507
Rahman A, Parnin C, Williams L (2019) The seven sins: security smells in infrastructure as code scripts. In: 2019 IEEE/ACM 41st international conference on software engineering (ICSE), IEEE, pp 164–175
Rahman A, Farhana E, Parnin C, Williams L (2020a) Gang of eight: a defect taxonomy for infrastructure as code scripts. In: Proceedings of the ACM/IEEE 42nd international conference on software engineering, association for computing machinery, New York, NY, USA, ICSE ’20, p 752–764. https://doi.org/10.1145/3377811.3380409
Rahman A, Farhana E, Williams L (2020b) The 'as code' activities: development anti-patterns for infrastructure as code. Empir Softw Eng 25(5):3430–3467
Rahman A, Barsha FL, Morrison P (2021a) Shhh!: 12 practices for secret management in infrastructure as code. In: 2021 IEEE secure development conference (SecDev), pp 56–62. https://doi.org/10.1109/SecDev51306.2021.00024
Rahman A, Rahman MR, Parnin C, Williams L (2021b) Security smells in ansible and chef scripts: a replication study. ACM Trans Softw Eng Methodol 30(1). https://doi.org/10.1145/3408897
Rahman A, Shamim SI, Shahriar H, Wu F (2022) Can we use authentic learning to educate students about secure infrastructure as code development?, Association for Computing Machinery, New York, NY, USA. https://akondrahman.github.io/publication/iticse2022
Rahman F, Devanbu P (2013) How, and why, process metrics are better. In: 2013 35th international conference on software engineering (ICSE), pp 432–441. https://doi.org/10.1109/ICSE.2013.6606589
RedHat (2022a) Customer Case Study - NEC. https://www.ansible.com/hubfs/pdf/Ansible-Case-Study-NEC.pdf, [Online; accessed 12-Sep-2022]
RedHat (2022b) Customer Case Study - NetApp. https://www.ansible.com/hubfs/2018_Content/RH-netapp-case-study.pdf, [Online; accessed 02-Oct-2022]
Reis S, Abreu R, d’Amorim M, Fortunato D (2023) Leveraging practitioners’ feedback to improve a security linter. In: Proceedings of the 37th IEEE/ACM international conference on automated software engineering, association for computing machinery, New York, NY, USA, ASE ’22. https://doi.org/10.1145/3551349.3560419
Ryan J (2022) Ansible automation platform: private automation hub. https://people.redhat.com/bdumont/Central-Region-Lunch-n-Learns/Ansible_Automation_Platform_Private_Automation_Hub.pdf, [Online; accessed 10-Dec-2022]
Saavedra N, Ferreira JF (2023) Glitch: Automated polyglot security smell detection in infrastructure as code. In: Proceedings of the 37th IEEE/ACM international conference on automated software engineering, association for computing machinery, New York, NY, USA, ASE ’22. https://doi.org/10.1145/3551349.3556945
Saldaña J (2015) The coding manual for qualitative researchers. Sage
Schwarz J (2019) Hardcoded and Embedded Credentials are an IT Security Hazard – Here’s What You Need to Know. https://www.beyondtrust.com/blog/entry/hardcoded-and-embedded-credentials-are-an-it-security-hazard-heres-what-you-need-to-know, [Online; accessed 02-July-2021]
Shull FJ, Carver JC, Vegas S, Juristo N (2008) The role of replications in empirical software engineering. Empir Softw Eng 13:211–218
Smith E, Loftin R, Murphy-Hill E, Bird C, Zimmermann T (2013) Improving developer participation rates in surveys. In: 2013 6th international workshop on cooperative and human aspects of software engineering (CHASE), pp 89–92. https://doi.org/10.1109/CHASE.2013.6614738
Smith J, Johnson B, Murphy-Hill E, Chu B, Lipford HR (2015) Questions developers ask while diagnosing potential security vulnerabilities with static analysis. In: Proceedings of the 2015 10th joint meeting on foundations of software engineering, association for computing machinery, New York, NY, USA, ESEC/FSE 2015, p 248–259. https://doi.org/10.1145/2786805.2786812
Tan PN, Steinbach M, Kumar V (2005) Introduction to Data Mining, 1st edn. Addison-Wesley Longman Publishing Co., Inc, Boston, MA, USA
Brikman Y (2016) Why we use terraform and not chef, puppet, ansible, saltstack, or cloudformation. https://blog.gruntwork.io/why-we-use-terraform-and-not-chef-puppet-ansible-saltstack-or-cloudformation-7989dad2865c, [Online; accessed 24-July-2023]
Acknowledgements
We thank the PASER group at Auburn University for their valuable feedback. This research was partially funded by the U.S. National Science Foundation (NSF) Award # 2247141, Award # 2310179, Award # 2312321, and the U.S. National Security Agency (NSA) Award # H98230-21-1-0175.
Ethics declarations
Conflicts of interest
The authors have no relevant financial or non-financial interests to disclose.
Additional information
Communicated by: Mika Mäntylä.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Rahman, A., Bose, D.B., Zhang, Y. et al. An empirical study of task infections in Ansible scripts. Empir Software Eng 29, 34 (2024). https://doi.org/10.1007/s10664-023-10432-6