Sampling configurations from software product lines via probability-aware diversification and SAT solving

Automated Software Engineering

Abstract

Sampling a small, valid and representative set of configurations from software product lines (SPLs) is important, yet challenging because of the huge number of possible configurations to be explored. Recently, sampling strategies based on satisfiability (SAT) solving have enjoyed great popularity owing to their high efficiency and good scalability. However, such sampling offers no guarantees on diversity, especially in terms of the number of selected features, an important property for characterizing a configuration. In this paper, we propose a probability-aware diversification (PaD) strategy that cooperates with SAT solving to generate diverse configurations: valid configurations are generated efficiently by SAT solving, while PaD maintains their diversity. Experimental results on 51 public SPLs show that, when working cooperatively with PaD, off-the-shelf SAT solvers improve substantially in terms of diversity, with large effect sizes observed in more than 71% of all the cases. Furthermore, we propose a general search-based framework in which PaD and evolutionary algorithms can work together, and instantiate this framework in the context of search-based diverse sampling and search-based multi-objective SPL configuration (where there is a practical need to generate diverse configurations). The experimental results demonstrate that PaD also brings considerable performance gains to these search-based approaches. Finally, we apply PaD to a practical problem, i.e., machine learning based performance prediction of SPLs, and show that using PaD tends to improve the accuracy of performance prediction models.
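The abstract does not spell out how PaD and the SAT solver interact, so the following is only an illustrative sketch under stated assumptions, not the authors' implementation. It assumes that PaD draws a probability p uniformly from [0, 1], proposes each feature as selected with probability p, and that the solver tries the proposed value of each feature first. The backtracking search, the toy feature model `TOY_CLAUSES` and all function names are hypothetical stand-ins for an off-the-shelf SAT solver and a real SPL.

```python
import random

# Hypothetical toy feature model in DIMACS-style CNF (not from the paper):
# feature 1 is the root, features 2-4 require the root, 3 and 4 exclude each other.
TOY_CLAUSES = [[1], [-2, 1], [-3, 1], [-4, 1], [-3, -4]]


def pad_phases(num_vars, rng):
    """Assumed PaD step: draw p ~ U(0, 1), then propose each feature
    as selected with probability p."""
    p = rng.random()
    return [rng.random() < p for _ in range(num_vars)]


def solve(clauses, num_vars, phases):
    """Tiny backtracking SAT search that tries each variable's proposed
    value first. Returns a list of booleans, or None if unsatisfiable."""
    assignment = [None] * num_vars

    def consistent():
        # A clause is violated only when all its literals are assigned and false.
        for clause in clauses:
            if all(assignment[abs(lit) - 1] is not None
                   and assignment[abs(lit) - 1] != (lit > 0)
                   for lit in clause):
                return False
        return True

    def search(i):
        if i == num_vars:
            return True
        for value in (phases[i], not phases[i]):
            assignment[i] = value
            if consistent() and search(i + 1):
                return True
        assignment[i] = None
        return False

    return list(assignment) if search(0) else None


def sample_configurations(clauses, num_vars, n, seed=0):
    """Generate n valid configurations, each guided by a fresh PaD proposal."""
    rng = random.Random(seed)
    return [solve(clauses, num_vars, pad_phases(num_vars, rng)) for _ in range(n)]


if __name__ == "__main__":
    for config in sample_configurations(TOY_CLAUSES, 4, 5):
        print(config, "selected features:", sum(config))
```

Because p changes from one sample to the next, the proposals range from mostly deselected to mostly selected features, which is the kind of spread in the number of selected features that the abstract refers to as diversity. The solver only departs from a proposal where the constraints force it to.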

Notes

  1. In this paper, the diversity of configurations is measured by the number of selected features. Diversity can also be defined and interpreted in other ways, e.g., by the coverage of valid feature combinations (Henard et al. 2014).

  2. For real-world SPLs, it occurs extremely rarely that all variables assigned by PaD violate constraints.

  3. The original name of this algorithm is ProbSATVaEA. For simplicity, we name it PbSATVaEA in this paper.

  4. The \({\widehat{A}}_{12}\) statistic is computed using the effsize package in R.

  5. A concept from the novelty search algorithm (Lehman and Stanley 2008, 2011).

  6. https://github.com/jualvespereira/ICPE2020

  7. https://doi.org/10.5281/zenodo.6350098

  8. https://github.com/YiXiangScut/PaD-SATSolving
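Note 4 refers to the \({\widehat{A}}_{12}\) effect size of Vargha and Delaney (2000). As a minimal sketch of what that statistic measures, the function below estimates the probability that a value drawn from one sample exceeds a value drawn from the other, counting ties as one half. The function names are hypothetical, and the magnitude cut-offs 0.56, 0.64 and 0.71 are the values commonly attributed to Vargha and Delaney; this is an assumption, and the effsize package may label magnitudes with different thresholds.

```python
def a12(xs, ys):
    """Vargha-Delaney A12: estimated probability that a value from xs
    is larger than a value from ys, with ties counted as one half."""
    greater = sum(1 for x in xs for y in ys if x > y)
    equal = sum(1 for x in xs for y in ys if x == y)
    return (greater + 0.5 * equal) / (len(xs) * len(ys))


def magnitude(a):
    """Label the effect size using assumed cut-offs (0.56 / 0.64 / 0.71),
    applied symmetrically around 0.5."""
    strength = max(a, 1 - a)
    if strength >= 0.71:
        return "large"
    if strength >= 0.64:
        return "medium"
    if strength >= 0.56:
        return "small"
    return "negligible"


if __name__ == "__main__":
    with_pad = [3, 4, 5]      # hypothetical diversity scores
    without_pad = [1, 2, 3]   # hypothetical diversity scores
    value = a12(with_pad, without_pad)
    print(round(value, 3), magnitude(value))
```

A value of 0.5 means neither sample tends to be larger; values close to 1 (or 0) mean the first sample is almost always larger (or smaller).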

References

  • Achlioptas, D., Hammoudeh, Z.S., Theodoropoulos, P.: Fast sampling of perfectly uniform satisfying assignments. In: Beyersdorff, O., Wintersteiger, C.M. (eds.) Theory and applications of satisfiability testing - SAT 2018, pp. 135–147. Springer, Cham (2018)

  • Al-Hajjaji, M., Krieter, S., Thüm, T., et al: IncLing: Efficient Product-Line Testing Using Incremental Pairwise Sampling. In: Proceedings of the 2016 ACM SIGPLAN International Conference on generative programming: concepts and experiences. Association for Computing Machinery, New York, NY, USA, GPCE 2016, pp 144–155 (2016)

  • Alves Pereira, J., Acher, M., Martin, H., et al: Sampling effect on performance prediction of configurable systems: A case study. In: Proceedings of the ACM/SPEC International Conference on Performance Engineering. Association for Computing Machinery, New York, NY, USA, ICPE’20, pp 277–288, https://doi.org/10.1145/3358960.3379137 (2020)

  • Arcuri, A., Briand, L.: A practical guide for using statistical tests to assess randomized algorithms in software engineering. In: Proceedings of the 33rd International Conference on Software Engineering. Association for computing machinery, New York, NY, USA, ICSE’11, pp 1–10 (2011)

  • Arcuri, A., Briand, L.: A hitchhiker’s guide to statistical tests for assessing randomized algorithms in software engineering. Softw. Test. Verif. Reliab. 24(3), 219–250 (2014)

  • Balint, A., Schöning, U.: Choosing Probability Distributions for Stochastic Local Search and the Role of Make versus Break, International Conference on theory and applications of satisfiability testing, Berlin, Heidelberg, pp 16–29 (2012)

  • Baranov, E., Legay, A., Meel, K.S.: Baital: An adaptive weighted sampling approach for improved t-wise coverage. In: Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the foundations of software engineering. ACM, New York, NY, USA, ESEC/FSE 2020, pp 1114–1126 (2020)

  • Batory, D.: Feature models, grammars, and propositional formulas. In: Obbink H, Pohl K (eds) Proceedings of the 9th international conference software product lines, SPLC 2005. Springer Berlin Heidelberg, Berlin, Heidelberg, pp 7–20 (2005)

  • Berre, D.L., Parrain, A.: The Sat4j library, release 2.2, system description. J. Satisf. Bool. Model. Comput. 7, 59–64 (2010)

  • Biere, A., Brummayer, R.: Picosat essentials. Journal on Satisfiability Boolean Modeling & Computation pp 75–97 (2008)

  • Cai, S.: Faster implementation for walksat. Tech. rep, Queensland Research Lab, NICTA, Australia (2013)

  • Cai, S., Luo, C., Su, K.: Improving WalkSAT By effective tie-breaking and efficient implementation. Comput. J. 58(11), 2864–2875 (2014)

  • Chakraborty, S., Fremont, D.J., Meel, K.S., et al.: On parallel scalable uniform SAT witness generation. In: Baier, C., Tinelli, C. (eds.) Tools and Algorithms for the Construction and Analysis of Systems, pp. 304–319. Springer, Berlin, Heidelberg (2015)

  • Chen, J., Nair, V., Krishna, R., et al.: ‘Sampling’ as a baseline optimizer for search-based software engineering. IEEE Trans. Softw. Eng. 45(6), 597–614 (2019)

  • Clements, P., Northrop, L.: Software product lines: practices and patterns. Addison-Wesley Longman Publishing Co., Inc (2001)

  • Coello Coello, C.A., Lamont, G.B., Veldhuizen, D.A.V.: Evolutionary algorithms for solving multi-objective problems, 2nd edn. Springer, LLC, New York, NY (2007)

  • Cohen, M.B., Dwyer, M.B., Shi, J.: Interaction testing of highly-configurable systems in the presence of constraints. In: Proceedings of the 2007 International Symposium on Software Testing and Analysis. Association for Computing Machinery, New York, NY, USA, ISSTA’07, pp 129–139 (2007)

  • Cohen, M.B., Dwyer, M.B., Shi, J.: Constructing interaction test suites for highly-configurable systems in the presence of constraints: A greedy approach. IEEE Trans. Softw. Eng. 34(5), 633–650 (2008)

  • De Moura, L., Bjørner, N.: Z3: An efficient smt solver. In: Proceedings of the Theory and Practice of Software, 14th International Conference on Tools and Algorithms for the Construction and Analysis of Systems. Springer-Verlag, Berlin, Heidelberg, TACAS’08/ETAPS’08, pp 337–340 (2008)

  • Dutra, R., Laeufer, K., Bachrach, J., et al.: Efficient sampling of sat solutions for testing. In: 2018 IEEE/ACM 40th international conference on software engineering (icse), pp. 549–559. IEEE Computer Society, Los Alamitos, CA, USA (2018)

  • Garvin, B.J., Cohen, M.B., Dwyer, M.B.: Evaluating improvements to a meta-heuristic search for constrained interaction testing. Empirical Softw. Eng. 16(1), 61–102 (2011)

  • Gogate, V., Dechter, R.: A new algorithm for sampling csp solutions uniformly at random. In: Benhamou, F. (ed.) Principles and practice of constraint programming - CP 2006, pp. 711–715. Springer, Berlin Heidelberg, Berlin, Heidelberg (2006)

  • Guo, J., White, J., Wang, G., et al.: A genetic algorithm for optimized feature selection with resource constraints in software product lines. Journal of Systems & Software 84(12), 2208–2221 (2011)

  • Guo, J., Czarnecki, K., Apel, S., et al.: Variability-aware performance prediction: A statistical learning approach. In: 28th IEEE/ACM International Conference on Automated Software Engineering (ASE), pp 301–311 (2013)

  • Guo, J., Zulkoski, E., Olaechea, R., et al: Scaling exact multi-objective combinatorial optimization by parallelization. In: ACM/IEEE International Conference on automated software engineering, ASE ’14, Vasteras, Sweden - September 15 - 19, 2014, pp 409–420 (2014)

  • Guo, J., Liang, J.H., Shi, K., et al.: SMTIBEA: a hybrid multi-objective optimization algorithm for configuring large constrained software product lines. Softw. Syst. Model. 18, 1447–1466 (2019)

  • Halin, A., Nuttinck, A., Acher, M., et al.: Test them all, is it worth it? assessing configuration sampling on the jhipster web development stack. Empirical Softw. Eng. 24, 674–717 (2019)

  • Henard, C., Papadakis, M., Perrouin, G., et al: PLEDGE: A Product Line Editor and Test Generation Tool. In: Proceedings of the 17th International Software Product Line Conference Co-Located Workshops. Association for computing machinery, New York, NY, USA, SPLC’13 Workshops, pp 126–129 (2013)

  • Henard, C., Papadakis, M., Perrouin, G., et al.: Bypassing the combinatorial explosion: using similarity to generate and prioritize t-wise test configurations for software product lines. IEEE Trans. Softw. Eng. 40(7), 650–670 (2014)

  • Henard, C., Papadakis, M., Harman, M., et al: Combining multi-objective search and constraint solving for configuring large software product lines. In: The 37th International Conference on Software Engineering, pp 517–528 (2015)

  • Heradio, R., Fernandez-Amoros, D., Galindo, J.A., et al: Uniform and scalable sat-sampling for configurable systems. In: Proceedings of the 24th ACM Conference on Systems and Software Product Line: Volume A - Volume A. Association for computing machinery, New York, NY, USA, SPLC ’20 (2020)

  • Hierons, R.M., Li, M., Liu, X., et al.: SIP: optimal product selection from feature models using many-objective evolutionary optimization. ACM Trans. Softw. Eng. Methodol. 25(2), 17:1-17:39 (2016)

  • Johansen, M.F., Haugen, Ø., Fleurey, F.: Properties of realistic feature models make combinatorial testing of product lines feasible. In: Proceedings of the 14th International Conference on model driven engineering languages and systems. Springer-Verlag, Berlin, Heidelberg, MODELS’11, pp 638–652 (2011)

  • Johansen, M.F., Haugen, Ø., Fleurey, F.: An algorithm for generating t-wise covering arrays from large feature models. In: Proceedings of the 16th International Software Product Line Conference - Volume 1. Association for computing machinery, New York, NY, USA, SPLC’12, pp 46–55 (2012)

  • Kaltenecker, C., Grebhahn, A., Siegmund, N., et al: Distance-based sampling of software configuration spaces. In: Proceedings of the 41st International Conference on software engineering. IEEE Press, ICSE’19, pp 1084–1094 (2019)

  • Kang, K.C., Cohen, S.G., Hess, J.A., et al: Feature-oriented domain analysis (FODA) feasibility study. Tech. rep. CMU/SEI-90-TR-21, Software Engineering Institute, Carnegie Mellon University (1990)

  • Krieter, S., Thüm, T., Schulze, S., et al: YASA: yet another sampling algorithm. In: Cordy M, Acher M, Beuche D, et al (eds) VaMoS 20: 14th International Working Conference on variability modelling of software-intensive systems, Magdeburg Germany, February 5-7, 2020. ACM, pp 4:1–4:10 (2020)

  • Lehman, J., Stanley, K.O.: Abandoning objectives: evolution through the search for novelty alone. Evolut. Comput. 19(2), 189–223 (2011)

  • Lehman, J., Stanley, K.O.: Exploiting open-endedness to solve problems through the search for novelty. In: Proceedings of the Eleventh International Conference on Artificial Life (ALIFE XI), pp 329–336 (2008)

  • Li, M., Yao, X.: Quality evaluation of solution sets in multiobjective optimisation: a survey. ACM Comput. Surv. 52(2), 26:1-26:38 (2019)

  • Li, M., Yang, S., Liu, X.: Bi-goal evolution for many-objective optimization problems. Artif. Intell. 228, 45–65 (2015)

  • Li, M., Chen, T., Yao, X.: A critical review of: “a practical guide to select quality indicators for assessing pareto-based search algorithms in search-based software engineering”: Essay on quality indicator selection for sbse. In: Proceedings of the 40th International Conference on Software Engineering: New Ideas and Emerging Results. ACM, New York, NY, USA, ICSE-NIER ’18, pp 17–20, https://doi.org/10.1145/3183399.3183405 (2018)

  • Liang, J.H., Ganesh, V., Czarnecki, K., et al: SAT-based Analysis of Large Real-world Feature Models is Easy. In: Proceedings of the 19th International Conference on Software Product Line. ACM, New York, NY, USA, SPLC ’15, pp 91–100 (2015)

  • Liebig, J., von Rhein, A., Kästner, C., et al: Scalable analysis of variable software. In: Proceedings of the 2013 9th Joint Meeting on Foundations of Software Engineering. Association for Computing Machinery, New York, NY, USA, ESEC/FSE 2013, pp 81–91 (2013)

  • Lopez-Herrejon, R.E., Fischer, S., Ramler, R., et al: A first systematic mapping study on combinatorial interaction testing for software product lines. In: IEEE Eighth International Conference on Software testing, verification and validation workshops (ICSTW). IEEE, pp 1–10 (2015)

  • Luo, C., Sun, B., Qiao, B., et al: LS-Sampling: An effective local search based sampling approach for achieving high t-wise coverage. In: Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering. Association for computing machinery, New York, NY, USA, ESEC/FSE 2021, pp 1081–1092 (2021)

  • Marques-Silva, J.P., Sakallah, K.A.: GRASP: a search algorithm for propositional satisfiability. IEEE Trans. Comput. 48(5), 506–521 (1999)

  • Medeiros, F., Kästner, C., Ribeiro, M., et al: A comparison of 10 sampling algorithms for configurable systems. In: 2016 IEEE/ACM 38th International Conference on Software Engineering (ICSE), pp 643–654 (2016)

  • Melo, J., Flesborg, E., Brabrand, C., et al: A quantitative analysis of variability warnings in linux. In: Proceedings of the Tenth International Workshop on variability modelling of software-intensive systems. ACM, New York, NY, USA, VaMoS’16, pp 3–8 (2016)

  • Mendonca, M., Branco, M., Cowan, D.: SPLOT: software product lines online tools. In: Proceedings of the 24th ACM SIGPLAN conference companion on Object oriented programming systems languages and applications, ACM, pp 761–762 (2009)

  • Oh, J., Batory, D., Myers, M., et al: Finding near-optimal configurations in product lines by random sampling. In: Proceedings of the 2017 11th Joint meeting on foundations of software engineering. ACM, New York, NY, USA, ESEC/FSE 2017, pp 61–71 (2017)

  • Oh, J., Gazzillo, P., Batory, D.: T-wise coverage by uniform sampling. In: Proceedings of the 23rd International Systems and Software Product Line Conference - Volume A. Association for computing machinery, New York, NY, USA, SPLC’19, pp 84–87 (2019)

  • Oh, J., Gazzillo, P., Batory, D., et al: Scalable uniform sampling for real-world software product lines. Tech. Rep. TR-20-01, The University of Texas at Austin (2020)

  • Olaechea, R., Rayside, D., Guo, J., et al: Comparison of exact and approximate multi-objective optimization for software product lines. In: The International software product line conference, pp 92–101 (2014)

  • Pett, T., Thüm, T., Runge, T., et al: Product sampling for product lines: The scalability challenge. In: Proceedings of the 23rd International Systems and Software Product Line Conference - Volume A. Association for Computing Machinery, New York, NY, USA, SPLC’19, pp 78–83 (2019)

  • Plazar, Q., Acher, M., Perrouin, G., et al: Uniform sampling of SAT solutions for configurable systems: Are we there yet? In: 12th IEEE Conference on software testing, validation and verification, ICST 2019, Xi’an, China, April 22-27, 2019. IEEE, pp 240–251 (2019)

  • Pohl, R., Lauenroth, K., Pohl, K.: A performance comparison of contemporary algorithmic approaches for automated analysis operations on feature models. In: IEEE/ACM International Conference on automated software engineering, pp 313–322 (2011)

  • Sarkar, A., Guo, J., Siegmund, N., et al: Cost-efficient sampling for performance prediction of configurable systems. In: Proceedings of the 30th IEEE/ACM International conference on automated software engineering. IEEE Press, ASE’15, pp 342–352 (2015)

  • Sayyad, A.S., Goseva-Popstojanova, K., Menzies, T., et al: On parameter tuning in search based software engineering: A replicated empirical study. In: International Workshop on Replication in empirical software engineering research, pp 84–90 (2013a)

  • Sayyad, A.S., Ingram, J., Menzies, T., et al: Optimum feature selection in software product lines: let your model and values guide your search. In: International Workshop on combining modelling and search-based software engineering, pp 22–27 (2013b)

  • Sayyad, A.S., Ingram, J., Menzies, T., et al: Scalable product line configuration: A straw to break the camel’s back. In: 2013 28th IEEE/ACM International conference on automated software engineering (ASE), pp 465–474 (2013c)

  • Sayyad, A.S., Menzies, T., Ammar, H.: On the value of user preferences in search-based software engineering: A case study in software product lines. In: 2013 35th International Conference on Software Engineering (ICSE), pp 492–501, https://doi.org/10.1109/ICSE.2013.6606595 (2013d)

  • Selman, B., Levesque, H., Mitchell, D.: A new method for solving hard satisfiability problems. In: Proceedings of the Tenth National Conference on artificial intelligence. AAAI Press, AAAI’92, pp 440–446 (1992)

  • Siegmund, N., Grebhahn, A., Apel, S., et al.: Performance-influence models for highly configurable systems. Assoc. Comput. Mach. 2015, 284–294 (2015)

  • Simon, L., Audemard, G.: Predicting Learnt Clauses Quality in Modern SAT Solver. In: Twenty-first International Joint Conference on artificial intelligence (IJCAI’09), Pasadena, United States (2009)

  • Soos, M., Gocht, S., Meel, K.S.: Tinted, detached, and lazy cnf-xor solving and its applications to counting and sampling. In: Proceedings of International Conference on Computer-Aided Verification (CAV) (2020)

  • Sundermann, C., Thüm, T., Schaefer, I.: Evaluating #sat solvers on industrial feature models. In: Proceedings of the 14th International Working Conference on Variability Modelling of Software-Intensive Systems. Association for Computing Machinery, New York, NY, USA, VAMOS’20, https://doi.org/10.1145/3377024.3377025 (2020)

  • Tan, T.H., Xue, Y., Chen, M., et al: Optimizing selection of competing features via feedback-directed evolutionary algorithms. In: Proceedings of the 2015 International Symposium on software testing and analysis (ISSTA 2015), pp 246–256 (2015)

  • Thurley, M.: Sharpsat - counting models with advanced component caching and implicit bcp. In: Biere, A., Gomes, C.P. (eds.) Theor. Appl. Satisf. Test., pp. 424–429. Springer, Berlin Heidelberg, Berlin, Heidelberg (2006)

  • Vargha, A., Delaney, H.D.: A critique and improvement of the CL common language effect size statistics of McGraw and Wong. J. Educ. Behav. Stat. 25(2), 101–132 (2000)

  • Xiang, Y., Zhou, Y., Li, M., et al.: A vector angle based evolutionary algorithm for unconstrained many-objective problems. IEEE Trans. Evol. Comput. 21(1), 131–152 (2017)

  • Xiang, Y., Zhou, Y., Zheng, Z., et al.: Configuring software product lines by combining many-objective optimization and SAT solvers. ACM Trans. Softw. Eng. Methodol. 26(4), 14:1-14:46 (2018)

  • Xiang, Y., Yang, X., Zhou, Y., et al.: Going deeper with optimal software products selection using many-objective optimization and satisfiability solvers. Empir. Softw. Eng. 25, 591–626 (2020)

  • Xiang, Y., Huang, H., Li, M., et al.: Looking for novelty in search-based software product line testing. IEEE Trans. Softw. Eng. 48(7), 2317–2338 (2022). https://doi.org/10.1109/TSE.2021.3057853

  • Xiang, Y., Huang, H., Zhou, Y., et al: Search-based diverse sampling from real-world software product lines. In: Proceedings of the 44th International Conference on Software Engineering. ACM, New York, NY, USA, ICSE’22, pp 1945–1957, https://doi.org/10.1145/3510003.3510053 (2022)

  • Xue, Y., Li, Y.F.: Multi-objective Integer Programming Approaches for Solving Optimal Feature Selection Problem: A New Perspective on Multi-objective Optimization Problems in SBSE. In: Proceedings of the 40th International Conference on Software Engineering. ACM, New York, NY, USA, ICSE ’18, pp 1231–1242, https://doi.org/10.1145/3180155.3180257 (2018)

  • Xue, Y., Li, M., Shepperd, M., et al.: A novel aggregation-based dominance for pareto-based evolutionary algorithms to configure software product lines. Neurocomputing 364, 32–48 (2019)

  • Zhang, W., Sun, Z., Zhu, Q., et al: NLocalSAT: Boosting local search with solution prediction. In: Bessiere C (ed) Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, IJCAI 2020. ijcai.org, pp 1177–1183 (2020)

  • Zitzler, E., Künzli, S.: Indicator-based selection in multiobjective search. In: Proc. 8th International Conference on parallel problem solving from nature, PPSN VIII. Springer, pp 832–842 (2004)

  • Zitzler, E., Thiele, L.: Multiobjective evolutionary algorithms: a comparative case study and the strength pareto approach. IEEE Trans. Evolut. Comput. 3(4), 257–271 (1999)

Acknowledgements

This work was supported by Guangdong Basic and Applied Basic Research Foundation (2019A1515011700, 2019A1515011411), National Natural Science Foundation of China (61906069, 61773410, 61876207), Science and Technology Program of Guangzhou (202002030355, 201802010007), Guangdong Province Key Area R&D Program (2018B010109003), and Fundamental Research Funds for the Central Universities (2020ZYGXZR014).

Author information

Corresponding author

Correspondence to Xiaowei Yang.

About this article

Cite this article

Xiang, Y., Yang, X., Huang, H. et al. Sampling configurations from software product lines via probability-aware diversification and SAT solving. Autom Softw Eng 29, 54 (2022). https://doi.org/10.1007/s10515-022-00348-8

  • DOI: https://doi.org/10.1007/s10515-022-00348-8
