ABSTRACT
As software systems become more complex and configurable, failures due to misconfigurations are becoming a critical problem. Such failures often have serious functional, security, and financial consequences. Further, diagnosis and remediation of such failures require reasoning across the software stack and its operating environment, making them difficult and costly. We present a framework and tool called EnCore to automatically detect software misconfigurations. EnCore takes into account two important factors that were previously unexploited: the interaction between the configuration settings and the executing environment, and the rich correlations between configuration entries. We embrace the emerging trend of viewing systems as data, and exploit this to extract information about the execution environment in which a configuration setting is used. EnCore learns configuration rules from a given set of sample configurations. With training data enriched with the execution context of configurations, EnCore can learn rules that cover a broad set of configuration anomalies spanning the entire system. EnCore is effective in detecting both injected errors and known real-world problems: it finds 37 new misconfigurations in Amazon EC2 public images and 24 new configuration problems in a commercial private cloud. By systematically exploiting environment information and by learning correlation rules across multiple configuration settings, EnCore detects 1.6x to 3.5x more misconfiguration anomalies than previous approaches.
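The idea of learning correlation rules from environment-enriched samples can be illustrated with a minimal sketch. This is not EnCore's actual algorithm, which the abstract does not specify; the helper names (`augment`, `learn_equality_rules`, `detect`), the `.owner` attribute, the sample entries, and the single "two attributes are equal" rule template are all assumptions made for illustration.

```python
from itertools import combinations


def augment(config, env):
    """Enrich a configuration with environment facts.

    Assumption: `env["owners"]` maps file paths to their owning user, and any
    entry whose value is a known path gains a derived `<key>.owner` attribute.
    """
    enriched = dict(config)
    for key, value in config.items():
        owner = env.get("owners", {}).get(value)
        if owner is not None:
            enriched[key + ".owner"] = owner
    return enriched


def learn_equality_rules(samples, min_confidence=1.0):
    """Return attribute pairs whose values agree in enough training samples."""
    common = set.intersection(*(set(s) for s in samples))
    rules = []
    for a, b in combinations(sorted(common), 2):
        agree = sum(1 for s in samples if s[a] == s[b])
        if agree / len(samples) >= min_confidence:
            rules.append((a, b))
    return rules


def detect(target, rules):
    """Return the learned rules that the target configuration violates."""
    return [
        (a, b)
        for a, b in rules
        if a in target and b in target and target[a] != target[b]
    ]


# Hypothetical training set: two healthy systems (config plus environment).
training = [
    augment({"user": "mysql", "datadir": "/var/lib/mysql"},
            {"owners": {"/var/lib/mysql": "mysql"}}),
    augment({"user": "db", "datadir": "/data/db"},
            {"owners": {"/data/db": "db"}}),
]
rules = learn_equality_rules(training)

# Hypothetical target: the configuration file looks normal on its own, but
# the data directory is owned by a different user in this environment.
target = augment({"user": "mysql", "datadir": "/var/lib/mysql"},
                 {"owners": {"/var/lib/mysql": "root"}})
violations = detect(target, rules)
print(violations)
```

In this sketch the target's configuration file is identical to a training sample, so a check that looks only at configuration values would pass it; the violation becomes visible only once the environment-derived owner attribute is correlated with the `user` entry.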
EnCore: exploiting system environment and correlation information for misconfiguration detection. ASPLOS '14.