
On SAT instance classes and a method for reliable performance experiments with SAT solvers

Annals of Mathematics and Artificial Intelligence

Abstract

A recent series of experiments with a group of state-of-the-art SAT solvers and several well-defined classes of problem instances reports statistically significant performance variability for the solvers. A systematic analysis of the observed performance data, all openly archived on the Web, reveals distributions which we classify into three broad categories: (1) readily characterized with a simple χ²-test, (2) requiring more in-depth analysis by a statistician, (3) incomplete, due to time-out limits reached by specific solvers. The first category includes two well-known distributions: normal and exponential; we use simple first-order criteria to decide the second category and label the distributions as near-normal, near-exponential and heavy-tail. We expect that good models for some, if not most, of these may be found with parameters that fit generalized gamma, Weibull, or Pareto distributions. Our experiments show that most SAT solvers exhibit either a normal or an exponential distribution of execution time (runtime) on many equivalence classes of problem instances. This finding suggests that the basic mathematical framework for these experiments may well be the same as the one used to test the reliability or lifetime of hardware components such as lightbulbs, A/C units, etc. A batch of N replicated hardware components represents an equivalence class of N problem instances in SAT, a controlled operating environment A represents a SAT solver A, and the survival function \(\mathcal{R}^A(x)\) (where x represents the lifetime) is the complement of the solvability function \(\mathcal{S}^A(x) = 1 - \mathcal{R}^A(x)\), where x may represent runtime, implications, backtracks, etc. As demonstrated in the paper, a set of unrelated benchmarks or randomly generated SAT instances available today cannot measure the performance of SAT solvers reliably: there is no control over their 'hardness'.
However, equivalence class instances as defined in this paper are, in effect, replicated instances of a specific reference instance. The proposed method not only provides a common platform for a systematic study and a reliable improvement of deterministic and stochastic SAT solvers alike but also supports the introduction and validation of new problem instance classes.
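The survival-function analogy above can be made concrete with a minimal sketch (not from the paper; the runtimes and function names are illustrative): for one solver on an equivalence class of instances, the empirical solvability function \(\mathcal{S}^A(x)\) is the fraction of instances solved within a cost bound x, and under an exponential runtime model with mean m it would follow 1 − exp(−x/m).

```python
# Hedged sketch, assuming a list of per-instance runtimes for one solver on
# one equivalence class. Computes the empirical solvability function S(x)
# and its counterpart under an exponential runtime model; S(x) = 1 - R(x),
# where R(x) is the survival function.
import math

def solvability(runtimes, x):
    """Empirical S(x): fraction of instances solved within cost bound x."""
    return sum(1 for t in runtimes if t <= x) / len(runtimes)

def exponential_solvability(x, mean):
    """Model S(x) = 1 - exp(-x/mean) for an exponential runtime model."""
    return 1.0 - math.exp(-x / mean)

# Illustrative runtimes (seconds); not data from the paper.
runtimes = [0.8, 1.1, 1.9, 2.4, 3.3, 4.0, 5.6, 7.2, 9.5, 12.1]
mean = sum(runtimes) / len(runtimes)
for x in (1.0, 4.0, 10.0):
    print(f"x={x}: empirical S={solvability(runtimes, x):.2f}, "
          f"exponential model S={exponential_solvability(x, mean):.2f}")
```

Comparing the empirical and model columns at a few cost bounds is the informal analogue of the goodness-of-fit tests the abstract mentions; a careful study would use a χ²- or Kolmogorov-Smirnov-style test instead.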




About this article

Cite this article

Brglez, F., Li, X.Y. & Stallmann, M.F. On SAT instance classes and a method for reliable performance experiments with SAT solvers. Ann Math Artif Intell 43, 1–34 (2005). https://doi.org/10.1007/s10472-005-0417-5
