Toward an Understanding of Long-tailed Runtimes of SLS Algorithms

Published: 13 December 2022

Abstract

The satisfiability problem (SAT) is one of the most famous problems in computer science. Traditionally, its NP-completeness has been used to argue that SAT is intractable. However, there have been tremendous practical advances in recent years that allow modern SAT solvers to solve instances with millions of variables and clauses. A particularly successful paradigm in this context is stochastic local search (SLS).
In most cases, there are different ways of formulating the underlying SAT problem. While it is known that the precise formulation of the problem has a significant impact on the runtime of solvers, finding a helpful formulation is generally non-trivial. The recently introduced GapSAT solver [Lorenz and Wörz 2020] demonstrated a successful way to improve the average performance of an SLS solver by learning additional information that is logically entailed by the original problem. Still, there were also cases in which the performance slightly deteriorated. This justifies in-depth investigations into how learning logical implications affects runtimes for SLS algorithms.
In this work, we propose a method for generating logically equivalent problem formulations, generalizing the ideas of GapSAT. This method allows a rigorous mathematical study of the effect on the runtime of SLS SAT solvers. Initially, we conduct empirical investigations. If the modification process is treated as random, then Johnson SB distributions provide a perfect characterization of the hardness. Since the observed Johnson SB distributions approach lognormal distributions, our analysis also suggests that the hardness is long-tailed.
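For reference, the four-parameter Johnson SB family mentioned above is usually written as follows. This is the textbook form of the distribution (see [37, 38]); the parameter names \(\gamma\), \(\delta\), \(\xi\), \(\lambda\) are the conventional ones and are an assumption here, since the abstract does not fix any notation.

```latex
% A random variable X is Johnson SB distributed with shape parameters
% \gamma \in \mathbb{R}, \delta > 0, location \xi and scale \lambda > 0
% if the following transform of X is standard normal:
\[
  Z \;=\; \gamma + \delta \,\ln\!\left(\frac{X - \xi}{\xi + \lambda - X}\right)
  \;\sim\; \mathcal{N}(0, 1),
  \qquad \xi < X < \xi + \lambda .
\]
% Differentiating the transform gives the density on (\xi, \xi + \lambda):
\[
  f(x) \;=\; \frac{\delta \,\lambda}{\sqrt{2\pi}\,(x - \xi)(\xi + \lambda - x)}
  \exp\!\left(-\frac{1}{2}\left[\gamma + \delta \,\ln\!\left(\frac{x - \xi}{\xi + \lambda - x}\right)\right]^{2}\right).
\]
```

The support is bounded by \(\xi\) and \(\xi + \lambda\), which is what the "B" in SB refers to.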
As a second contribution, we theoretically prove that restarts are useful for long-tailed distributions. This implies that incorporating additional restarts can further improve all algorithms that employ the above-mentioned modification technique.
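The effect of restarts on a long-tailed runtime distribution can be illustrated with a small simulation. The sketch below is not the proof from the article. It assumes lognormally distributed runtimes with arbitrarily chosen parameters and a fixed restart cutoff, and it uses the standard identity E[T] = E[min(X, t)] / P(X <= t) for the expected runtime under fixed-cutoff restarts. The function name and all parameter values are hypothetical.

```python
import random


def expected_runtime_with_restarts(samples, cutoff):
    """Estimate the expected total runtime when every run is aborted and
    restarted after `cutoff` time units: E[min(X, t)] / P(X <= t)."""
    n = len(samples)
    successes = sum(1 for x in samples if x <= cutoff)
    if successes == 0:
        raise ValueError("cutoff is below all observed runtimes")
    truncated_mean = sum(min(x, cutoff) for x in samples) / n
    return truncated_mean / (successes / n)


# Assumption: runtimes are lognormal with mu = 0 and sigma = 2 (illustrative only).
rng = random.Random(0)
samples = [rng.lognormvariate(0.0, 2.0) for _ in range(100_000)]

plain_mean = sum(samples) / len(samples)
restart_mean = expected_runtime_with_restarts(samples, cutoff=1.0)

print(f"mean runtime without restarts: {plain_mean:.2f}")
print(f"estimated mean with restarts at t = 1: {restart_mean:.2f}")
```

With the cutoff at the median of the assumed distribution, roughly half of all runs succeed before being aborted, so the rare very long runs that dominate the plain mean are never paid for in full.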
Since the empirical studies compellingly suggest that the runtime distributions follow Johnson SB distributions, we also investigate this property on a theoretical basis. We succeed in proving that the runtimes for the special case of Schöning’s random walk algorithm [Schöning 2002] are approximately Johnson SB distributed.
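For readers unfamiliar with it, Schöning's random walk algorithm can be sketched in a few lines: start from a uniformly random assignment, repeatedly pick an unsatisfied clause and flip one of its variables chosen at random, and restart after 3n flips. The code below is a minimal illustration under stated assumptions, not the implementation used in the article: clauses are lists of non-zero integers in DIMACS style, the unsatisfied clause is chosen uniformly at random, and the names `schoening`, `unsatisfied`, and `max_tries` are hypothetical.

```python
import random


def unsatisfied(clauses, assign):
    """Return the clauses not satisfied by `assign` (a list indexed from 1)."""
    return [c for c in clauses
            if not any((lit > 0) == assign[abs(lit)] for lit in c)]


def schoening(clauses, n_vars, max_tries=1000, rng=None):
    """Random walk with restarts; returns a satisfying assignment or None."""
    rng = rng or random.Random()
    for _ in range(max_tries):
        # Fresh uniformly random assignment; index 0 is unused padding.
        assign = [False] + [rng.random() < 0.5 for _ in range(n_vars)]
        for _ in range(3 * n_vars):
            unsat = unsatisfied(clauses, assign)
            if not unsat:
                return assign[1:]
            # Flip a random variable of a random unsatisfied clause.
            lit = rng.choice(rng.choice(unsat))
            assign[abs(lit)] = not assign[abs(lit)]
        if not unsatisfied(clauses, assign):  # check the result of the last flip
            return assign[1:]
    return None


# Small satisfiable 3-CNF over variables 1..3 (illustrative instance).
formula = [[1, 2, 3], [-1, 2, 3], [1, -2, 3], [1, 2, -3]]
model = schoening(formula, n_vars=3, rng=random.Random(1))
print(model)
```

The number of flips until a solution is found is the runtime whose distribution the abstract refers to. Returning `None` only means that no solution was found within `max_tries` tries, not that the formula is unsatisfiable.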

References

[1]
John Aitchison and J. A. C. Brown. 1963. The Lognormal Distribution—With Special Reference to Its Uses in Economics (2nd ed.). Cambridge University Press.
[2]
Alejandro Arbelaez, Charlotte Truchet, and Philippe Codognet. 2013. Using sequential runtime distributions for the parallel speedup prediction of SAT local search. Theory Pract. Logic Program. 13, 4–5 (2013), 625–639.
[3]
Gilles Audemard and Laurent Simon. 2009. Predicting learnt clauses quality in modern SAT solvers. In Proceedings of the 21st International Joint Conference on Artificial Intelligence (IJCAI’09). 399–404.
[4]
Fahiem Bacchus, Jeremias Berg, Matti Järvisalo, and Ruben Martins (Eds.). 2021. MaxSAT Evaluation 2021: Solver and Benchmark Descriptions. Department of Computer Science Report Series B, Vol. B-2021-2. University of Helsinki.
[5]
Adrian Balint. 2015. Original Implementation of probSAT. Retrieved from https://github.com/adrianopolus/probSAT.
[6]
Adrian Balint and Norbert Manthey. 2016. Dimetheus. In Proceedings of the SAT Competition: Solver and Benchmark Descriptions (Department of Computer Science Report Series B, Vol. B-2016-1). University of Helsinki, 37–38.
[7]
Adrian Balint and Uwe Schöning. 2012. Choosing probability distributions for stochastic local search and the role of make versus break. In Proceedings of the 15th International Conference on Theory and Applications of Satisfiability Testing (SAT’12) (Lecture Notes in Computer Science, Vol. 7317). Springer, 16–29.
[8]
Tomáš Balyo and Lukás Chrpa. 2018. Using algorithm configuration tools to generate hard SAT benchmarks. In Proceedings of the 11th International Symposium on Combinatorial Search (SOCS’18). AAAI Press, 133–137.
[9]
Wolfgang Barthel, Alexander K. Hartmann, Michele Leone, Federico Ricci-Tersenghi, Martin Weigt, and Riccardo Zecchina. 2002. Hiding solutions in random satisfiability problems: A statistical mechanics approach. Phys. Rev. Lett. 88, 188701 (2002), 1–4.
[10]
Robert G. Bartle and Donald R. Sherbert. 2000. Introduction to Real Analysis (3rd ed.). Wiley New York.
[11]
Armin Biere. 2014. Yet another local search solver and Lingeling and friends entering the SAT competition. In Proceedings of the SAT Competition: Solver and Benchmark Descriptions (Department of Computer Science Report Series B, Vol. B-2014-2). University of Helsinki, 39–40.
[12]
Armin Biere. 2017. CaDiCaL, Lingeling, Plingeling, Treengeling, YalSAT entering the SAT. In Proceedings of the SAT Competition: Solver and Benchmark Descriptions (Department of Computer Science Report Series B, Vol. B-2017-1). University of Helsinki, 14–15.
[13]
Armin Biere, Alessandro Cimatti, Edmund M. Clarke, Ofer Strichman, and Yunshan Zhu. 2003. Bounded model checking. Adv. Comput. 58 (2003), 117–148.
[14]
Armin Biere, Katalin Fazekas, Mathias Fleury, and Maximillian Heisinger. 2020. CaDiCaL, Kissat, Paracooba, Plingeling and Treengeling entering the SAT competition. In Proceedings of the SAT Competition: Solver and Benchmark Descriptions (Department of Computer Science Report Series B, Vol. B-2020-1), Tomas Balyo, Nils Froleyks, Marijn Heule, Markus Iser, Matti Järvisalo, and Martin Suda (Eds.). University of Helsinki, 51–53.
[15]
Armin Biere, Mathias Fleury, and Maximillian Heisinger. 2021. CaDiCaL, Kissat, Paracooba entering the SAT competition. In Proceedings of the SAT Competition: Solver and Benchmark Descriptions (Department of Computer Science Report Series B, Vol. B-2021-1), Tomas Balyo, Nils Froleyks, Marijn Heule, Markus Iser, Matti Järvisalo, and Martin Suda (Eds.). University of Helsinki, 10–13.
[16]
Armin Biere, Marijn Heule, Hans van Maaren, and Toby Walsh (Eds.). 2009. Handbook of Satisfiability. Frontiers in Artificial Intelligence and Applications, Vol. 185. IOS Press.
[17]
Shaowei Cai, Chuan Luo, and Kaile Su. 2015. CCAnr: A configuration checking-based local search solver for non-random satisfiability. In Proceedings of the International Conference on Theory and Applications of Satisfiability Testing (SAT’15) (Lecture Notes in Computer Science, Vol. 9340). Springer, 1–8.
[18]
Shaowei Cai and Xindi Zhang. 2018. ReasonLS. In Proceedings of the SAT Competition: Solver and Benchmark Descriptions (Department of Computer Science Report Series B, Vol. B-2018-1). University of Helsinki, 52–53.
[19]
Shaowei Cai and Xindi Zhang. 2019. Four relaxed CDCL solvers. In Proceedings of SAT Race: Solver and Benchmark Descriptions (Department of Computer Science Report Series B, Vol. B-2019-1), Marijn J. H. Heule, Matti Järvisalo, and Martin Suda (Eds.). University of Helsinki, 35–36.
[20]
Shaowei Cai and Xindi Zhang. 2021. Deep cooperation of CDCL and local search for SAT. In Proceedings of the 24th International Conference on Theory and Applications of Satisfiability Testing (SAT’21) (Lecture Notes in Computer Science, Vol. 12831). Springer, 64–81.
[21]
Shaowei Cai, Xindi Zhang, Mathias Fleury, and Armin Biere. 2022. Better decision heuristics in CDCL through local search and target phases. J. Artific. Intell. Res. 74 (2022), 1515–1563.
[22]
Russell Cheng. 2017. Nonstandard Parametric Statistical Inference. Oxford University Press.
[23]
Edmund M. Clarke, Armin Biere, Richard Raimi, and Yunshan Zhu. 2001. Bounded model checking using satisfiability solving. Formal Meth. Syst. Design 19, 1 (2001), 7–34.
[24]
Stephen A. Cook. 1971. The complexity of theorem-proving procedures. In Proceedings of the 3rd Annual ACM Symposium on Theory of Computing (STOC’71). 151–158.
[25]
Edwin L. Crow and Kunio Shimizu (Eds.). 1988. Lognormal Distributions: Theory and Applications. Statistics: A Series of Textbooks and Monographs, Vol. 88. Marcel Dekker.
[26]
Maximilian Diemer. 2021. Source Code of GenFactorSat. Retrieved from https://github.com/madiemer/gen-factor-sat/.
[27]
Niklas Eén and Niklas Sörensson. 2004. An extensible SAT-solver. In Proceedings of the 6th International Conference on Theory and Applications of Satisfiability Testing (SAT’03), Selected Revised Papers (Lecture Notes in Computer Science, Vol. 2919). Springer, 502–518.
[28]
Tobias Eibach, Enrico Pilz, and Gunnar Völkel. 2008. Attacking Bivium using SAT solvers. In Proceedings of the 11th International Conference on Theory and Applications of Satisfiability Testing (SAT’08) (Lecture Notes in Computer Science, Vol. 4966). Springer, 63–76.
[29]
Sergey Foss, Dmitry Korshunov, and Stan Zachary. 2011. An Introduction to Heavy-tailed and Subexponential Distributions. Vol. 6. Springer.
[30]
Daniel Frost, Irina Rish, and Lluís Vila. 1997. Summarizing CSP hardness with continuous probability distributions. In Proceedings of the 14th National Conference on Artificial Intelligence and 9th Innovative Applications of Artificial Intelligence Conference (AAAI/IAAI’97). 327–333.
[31]
Oliver Gableske. 2015. Source Code of kcnfgen (Version 1.0). Retrieved from https://www.gableske.net/downloads/kcnfgen_v1.0.tar.gz.
[32]
Carla P. Gomes and Bart Selman. 1997. Algorithm portfolio design: Theory vs. practice. In Proceedings of the 13th Conference on Uncertainty in Artificial Intelligence (UAI’97). 190–197.
[33]
Carla P. Gomes, Bart Selman, Nuno Crato, and Henry A. Kautz. 2000. Heavy-tailed phenomena in satisfiability and constraint satisfaction problems. J. Autom. Reason. 24 (2000), 67–100. Related version in CP’97.
[34]
Holger H. Hoos and Thomas Stützle. 1998. Evaluating Las Vegas algorithms: Pitfalls and remedies. In Proceedings of the 14th Conference on Uncertainty in Artificial Intelligence (UAI’98). 238–245.
[35]
Sextus Empiricus (https://stats.stackexchange.com/users/164061/sextus empiricus). 2018. Reciprocal of Shifted Lognormal Random Variable. Cross Validated (Stats Stack Exchange). Retrieved from https://stats.stackexchange.com/q/379626.
[36]
Norman Lloyd Johnson. 1949. Bivariate distributions based on simple translation systems. Biometrika 36, 3–4 (1949), 297–304.
[37]
Norman Lloyd Johnson. 1949. Systems of frequency curves generated by methods of translation. Biometrika 36, 1–2 (1949), 149–176.
[38]
Norman L. Johnson, Samuel Kotz, and Narayanaswamy Balakrishnan. 1994. Continuous Univariate Distributions, Volume 1 (2nd ed.). John Wiley & Sons.
[39]
Alexis C. Kaporis, Lefteris M. Kirousis, and Yannis C. Stamatiou. 2000. A note on the non-colorability threshold of a random graph. Electr. J. Combinat. 7, 1 (2000).
[40]
John M. Lachin. 2011. Biostatistical Methods: The Assessment of Relative Risks (2nd ed.). John Wiley & Sons.
[41]
Eduardo Lalla-Ruiz and Stefan Voss. 2016. Improving solver performance through redundancy. J. Syst. Sci. Syst. Eng. 25, 3 (2016), 303–325.
[42]
Massimo Lauria, Jan Elffers, Jakob Nordström, and Marc Vinyals. 2017. CNFgen: A generator of crafted benchmarks. In Proceedings of the 20th International Conference on Theory and Applications of Satisfiability Testing (SAT’17). 464–473.
[43]
Jan-Hendrik Lorenz and Florian Wörz. 2020. On the effect of learned clauses on stochastic local search. In Proceedings of the 23rd International Conference on Theory and Applications of Satisfiability Testing (SAT’20) (Lecture Notes in Computer Science, Vol. 12178). Springer, 89–106. Implementation and statistical tests of GapSAT available at Zenodo
[44]
Jan-Hendrik Lorenz. 2018. Runtime distributions and criteria for restarts. In Proceedings of the 44th International Conference on Current Trends in Theory and Practice of Computer Science (SOFSEM’18). Springer, 493–507.
[45]
Jan-Hendrik Lorenz and Florian Wörz. 2021. Source Code of concealSATgen. Retrieved from https://github.com/FlorianWoerz/concealSATgen/.
[46]
Inês Lynce and João Marques-Silva. 2006. SAT in bioinformatics: Making the case with haplotype inference. In Proceedings of the 9th International Conference on Theory and Applications of Satisfiability Testing (SAT’06) (Lecture Notes in Computer Science, Vol. 4121). Springer, 136–141.
[47]
Norbert Manthey. 2021. The MergeSat solver. In Proceedings of the International Conference on Theory and Applications of Satisfiability Testing (SAT’21) (Lecture Notes in Computer Science, Vol. 12831). Springer, 387–398.
[48]
João P. Marques-Silva and Karem A. Sakallah. 1996. GRASP—A new search algorithm for satisfiability. In Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (ICCAD’96). 220–227.
[49]
Stephan Mertens, Marc Mézard, and Riccardo Zecchina. 2006. Threshold values of random \(k\)-SAT from the cavity method. Random Struct. Algor. 28, 3 (2006), 340–373.
[50]
Michael Mitzenmacher and Eli Upfal. 2017. Probability and Computing: Randomization and Probabilistic Techniques in Algorithms and Data Analysis (2nd ed.). Cambridge University Press.
[51]
Matthew W. Moskewicz, Conor F. Madigan, Ying Zhao, Lintao Zhang, and Sharad Malik. 2001. Chaff: Engineering an efficient SAT solver. In Proceedings of the 38th Design Automation Conference (DAC’01). 530–535.
[52]
Jayakrishnan Nair, Adam Wierman, and Bert Zwart. 2020. The Fundamentals of Heavy Tails: Properties, Emergence, and Estimation. Preprint, California Institute of Technology.
[53]
Marvin Rausand, Anne Barros, and Arnljot Hoyland. 2003. System Reliability Theory: Models, Statistical Methods, and Applications (2nd ed.). John Wiley & Sons.
[54]
Irina Rish and Daniel Frost. 1997. Statistical analysis of backtracking on inconsistent CSPs. In Proceedings of the 3rd International Conference on Principles and Practice of Constraint Programming (CP’97). 150–162.
[55]
Yongshao Ruan, Eric Horvitz, and Henry A. Kautz. 2002. Restart policies with dependence among runs: A dynamic programming approach. In Proceedings of the 8th International Conference on Principles and Practice of Constraint Programming (CP’02) (Lecture Notes in Computer Science, Vol. 2470). Springer, 573–586.
[56]
Walter Rudin. 1964. Principles of Mathematical Analysis. Vol. 3. McGraw-Hill New York.
[57]
Uwe Schöning. 2002. A probabilistic algorithm for \(k\)-SAT based on limited local search and restart. Algorithmica 32, 4 (2002), 615–623. Preliminary version in Proceedings of FOCS’99.
[58]
Bart Selman, Henry A. Kautz, and Bram Cohen. 1994. Noise strategies for improving local search. In Proceedings of the 12th National Conference on Artificial Intelligence (AAAI’94). AAAI Press/MIT Press, 337–343.
[59]
Mate Soos, Karsten Nohl, and Claude Castelluccia. 2009. Extending SAT solvers to cryptographic problems. In Proceedings of the 12th International Conference on Theory and Applications of Satisfiability Testing (SAT’09) (Lecture Notes in Computer Science, Vol. 5584). Springer, 244–257.
[60]
Gunnar Völkel, Ludwig Lausser, Florian Schmid, Johann M. Kraus, and Hans A. Kestler. 2015. Sputnik: Ad hoc distributed computation. Bioinformatics 31, 8 (2015), 1298–1301.
[61]
Sven Dag Wicksell. 1917. On logarithmic correlation with an application to the distribution of ages at first marriage. Meddelanden från Lunds Astronomiska Observatorium 84 (1917), 1–21.
[62]
Florian Wörz and Jan-Hendrik Lorenz. 2021. Evidence for long-tails in SLS algorithms. In Proceedings of the 29th Annual European Symposium on Algorithms (ESA’21) (LIPIcs, Vol. 204), Petra Mutzel, Rasmus Pagh, and Grzegorz Herman (Eds.). Schloss Dagstuhl–Leibniz-Zentrum für Informatik, 82:1–82:16.
[63]
Florian Wörz and Jan-Hendrik Lorenz. 2022. Data Set for “Toward an Understanding of Long-tailed Runtimes.” We have provided all data of this article. All base instances, resolvents, and modifications can be found under 10.5281/zenodo.4715893. Visual and statistical evaluations can be found under https://github.com/FlorianWoerz/Towards-an-Understanding-of-Long-Tailed-Runtimes, where all evaluations take place in the files ./evaluation/jupyter_SB/evaluate_*.ipynb. A permanent version of this repository has been preserved under 10.5281/zenodo.6945926.

Information

Published In

ACM Journal of Experimental Algorithmics Volume 27, Issue
December 2022
776 pages
ISSN:1084-6654
EISSN:1084-6654
DOI:10.1145/3505192

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 13 December 2022
Online AM: 26 October 2022
Accepted: 10 October 2022
Revised: 23 August 2022
Received: 26 November 2021
Published in JEA Volume 27

Author Tags

  1. Stochastic local search
  2. runtime distribution
  3. statistical analysis
  4. Johnson SB distribution
  5. lognormal distribution
  6. long-tailed distribution
  7. restarts
  8. SAT solving
  9. learned clauses
  10. logical entailment

Qualifiers

  • Research-article
  • Refereed

Funding Sources

  • Deutsche Forschungsgemeinschaft (DFG, German Research Foundation)

