Asserting reliable convergence for configuration management scripts

Authors: Oliver Hanappi, Waldemar Hummer, Schahram Dustdar

OOPSLA 2016: Proceedings of the 2016 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications

Pages 328 - 343

https://doi.org/10.1145/2983990.2984000

Published: 19 October 2016

Abstract

The rise of elastically scaling applications that frequently deploy new machines has led to the adoption of DevOps practices across the cloud engineering stack. So-called configuration management tools utilize scripts that are based on declarative resource descriptions and make the system converge to the desired state. It is crucial for convergent configurations to be able to gracefully handle transient faults, e.g., network outages when downloading and installing software packages. In this paper we introduce a conceptual framework for asserting reliable convergence in configuration management. Based on a formal definition of configuration scripts and their resources, we utilize state transition graphs to test whether a script makes the system converge to the desired state under different conditions. In our generalized model, configuration actions are partially ordered, often resulting in prohibitively many possible execution orders. To reduce this problem space, we define and analyze a property called preservation, and we show that if preservation holds for all pairs of resources, then convergence holds for the entire configuration. Our implementation builds on Puppet, but the approach is equally applicable to other frameworks like Chef, Ansible, etc. We perform a comprehensive evaluation based on real-world Puppet scripts and show the effectiveness of the approach. Our tool is able to detect all idempotence- and convergence-related issues in a set of existing Puppet scripts with known issues as well as some hitherto undiscovered bugs in a large random sample of scripts.
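The idea of checking resources pairwise instead of enumerating every execution order can be illustrated with a toy model. The sketch below is not the authors' implementation or formalism: the state-as-dict model, the example resources (`install_pkg`, `write_config`, `wipe_dir`, `append_line`), and the simplified reading of preservation (re-applying a resource after another one has run changes nothing) are all assumptions made purely for illustration.

```python
from itertools import permutations

# Hypothetical toy model: a system state is a dict, and a resource is a
# function that returns a new state. None of these names come from the paper.

def install_pkg(state):
    new = dict(state)
    new["pkg"] = "installed"
    return new

def write_config(state):
    new = dict(state)
    new["config"] = "v2"
    return new

def wipe_dir(state):
    # Deliberately badly behaved: deletes the config another resource wrote.
    new = dict(state)
    new.pop("config", None)
    new["dir"] = "clean"
    return new

def append_line(state):
    # Deliberately non-idempotent: every run changes the state again.
    new = dict(state)
    new["lines"] = state.get("lines", 0) + 1
    return new

def is_idempotent(resource, state):
    """Applying the resource a second time must not change the state."""
    once = resource(state)
    return resource(once) == once

def preserves(a, b, state):
    """Assumed, simplified reading of preservation: after running a and
    then b, re-applying a is a no-op, so b did not undo a's effect."""
    after = b(a(state))
    return a(after) == after

def all_pairs_preserve(resources, state):
    """Check the simplified preservation property for every ordered pair."""
    return all(preserves(a, b, state) for a, b in permutations(resources, 2))

if __name__ == "__main__":
    empty = {}
    print(is_idempotent(install_pkg, empty))
    print(is_idempotent(append_line, empty))
    print(all_pairs_preserve([install_pkg, write_config], empty))
    print(all_pairs_preserve([install_pkg, write_config, wipe_dir], empty))
```

The pairwise check visits n(n-1) ordered pairs for n resources, which is the kind of reduction the abstract motivates when the number of admissible execution orders is too large to enumerate. The check here runs from a single start state only; a real tool would need to consider many states and transient faults.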

Index Terms

Asserting reliable convergence for configuration management scripts
1. Social and professional topics
  1. Professional topics
    1. Management of computing and information systems
      1. Project and people management
2. Software and its engineering
  1. Software creation and management
    1. Software development process management
    2. Software verification and validation
      1. Software defect analysis
        1. Software testing and debugging
  2. Software notations and tools
    1. General programming languages
      1. Language features
    2. Software configuration management and version control systems

Published In

OOPSLA 2016: Proceedings of the 2016 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications

October 2016

915 pages

ISBN:9781450344449

DOI:10.1145/2983990

General Chair: Eelco Visser, Delft University of Technology, Netherlands

Program Chair: Yannis Smaragdakis, University of Athens, Greece

ACM SIGPLAN Notices Volume 51, Issue 10
OOPSLA '16
October 2016
915 pages
ISSN:0362-1340
EISSN:1558-1160
DOI:10.1145/3022671
Editor: Matthew Fluet

Copyright © 2016 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGPLAN: ACM Special Interest Group on Programming Languages

In-Cooperation

SIGAda: ACM Special Interest Group on Ada Programming Language

Publisher

Association for Computing Machinery

New York, NY, United States

Badges

Distinguished Paper

Qualifiers

Research-article

Conference

SPLASH '16

Sponsor: SIGPLAN

SPLASH '16: Conference on Systems, Programming, Languages, and Applications: Software for Humanity

November 2 - 4, 2016

Amsterdam, Netherlands

Acceptance Rates

Overall Acceptance Rate 268 of 1,244 submissions, 22%

Bibliometrics

Total Citations: 33
Total Downloads: 492
Downloads (Last 12 months): 34
Downloads (Last 6 weeks): 0

Reflects downloads up to 07 Mar 2025

Cited By

Drosos G, Sotiropoulos T, Alexopoulos G, Mitropoulos D, Su Z (2024). When Your Infrastructure Is a Buggy Program: Understanding Faults in Infrastructure as Code Ecosystems. Proceedings of the ACM on Programming Languages 8:OOPSLA2 (2490-2520). Online publication date: 8-Oct-2024.
https://dl.acm.org/doi/10.1145/3689799
Yıldıran N, Oh J, Lawall J, Gazzillo P (2024). Maximizing Patch Coverage for Testing of Highly-Configurable Software without Exploding Build Times. Proceedings of the ACM on Software Engineering 1:FSE (427-449). Online publication date: 12-Jul-2024.
https://dl.acm.org/doi/10.1145/3643746
Shimizu R, Nunomura Y, Kanuka H (2024). Test-suite-guided discovery of least privilege for cloud infrastructure as code. Automated Software Engineering 31:1. Online publication date: 5-Mar-2024.
https://dl.acm.org/doi/10.1007/s10515-024-00420-5
Saavedra N, Ferreira J (2022). GLITCH: Automated Polyglot Security Smell Detection in Infrastructure as Code. Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering (1-12). Online publication date: 10-Oct-2022.
https://dl.acm.org/doi/10.1145/3551349.3556945
Badalyan D, Borisenko O (2022). Ansible execution control in Python and Golang for cloud orchestration. SoftwareX 19 (101126). Online publication date: Jul-2022.
https://doi.org/10.1016/j.softx.2022.101126
Oh J, Yıldıran N, Braha J, Gazzillo P, Spinellis D, Gousios G, Chechik M, Di Penta M (2021). Finding broken Linux configuration specifications by statically analyzing the Kconfig language. Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (893-905). Online publication date: 20-Aug-2021.
https://dl.acm.org/doi/10.1145/3468264.3468578
Dai T, Karve A, Koper G, Zeng S, Fonseca R, Delimitrou C, Ooi B (2020). Automatically detecting risky scripts in infrastructure code. Proceedings of the 11th ACM Symposium on Cloud Computing (358-371). Online publication date: 12-Oct-2020.
https://dl.acm.org/doi/10.1145/3419111.3421303
Sotiropoulos T, Mitropoulos D, Spinellis D, Rothermel G, Bae D (2020). Practical fault detection in puppet programs. Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering (26-37). Online publication date: 27-Jun-2020.
https://dl.acm.org/doi/10.1145/3377811.3380384
Kokuryo S, Kondo M, Mizuno O (2020). An Empirical Study of Utilization of Imperative Modules in Ansible. 2020 IEEE 20th International Conference on Software Quality, Reliability and Security (QRS) (442-449). Online publication date: Dec-2020.
https://doi.org/10.1109/QRS51102.2020.00063
Shimizu R, Kanuka H (2020). Test-Based Least Privilege Discovery on Cloud Infrastructure as Code. 2020 IEEE International Conference on Cloud Computing Technology and Science (CloudCom) (1-8). Online publication date: Dec-2020.
https://doi.org/10.1109/CloudCom49646.2020.00007
