DOI: 10.1145/3646548.3672586
research-article
Open access

Pragmatic Random Sampling of the Linux Kernel: Enhancing the Randomness and Correctness of the conf Tool

Published: 02 September 2024

Abstract

The configuration space of some systems is so large that it cannot be computed. This is the case with the Linux Kernel, which provides almost 19,000 configurable options described across more than 1,600 files in the Kconfig language. As a result, many analyses of the Kernel rely on sampling its configuration space (e.g., debugging compilation errors, predicting configuration performance, or finding the configuration that optimizes specific performance metrics). The Kernel can be sampled pragmatically, with its built-in tool conf, or idealistically, by translating the Kconfig files into logic formulas. The idealistic approach provides statistical guarantees for the sampled configurations, but it poses many challenging problems that have not been solved yet, such as scalability issues. This paper introduces a new version of conf called randconfig+, which incorporates a series of improvements that increase the randomness and correctness of pragmatic sampling and also help validate the Boolean translation required for the idealistic approach. randconfig+ has been tested on 20,000 configurations generated for 10 different Kernel versions from 2003 to the present day. The experimental results show that randconfig+ is compatible with all tested Kernel versions, guarantees the correctness of the generated configurations, and increases conf’s randomness for numeric and string options.

    Published In

    SPLC '24: Proceedings of the 28th ACM International Systems and Software Product Line Conference
    September 2024
    103 pages
    This work is licensed under a Creative Commons Attribution International 4.0 License.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Kconfig
    2. SAT
    3. configurable systems
    4. randconfig
    5. random sampling
    6. software product lines
    7. variability modeling

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Funding Sources

    • FEDER/Spanish Ministry of Science, Innovation and Universities
    • Universidad Nacional de Educación a Distancia (UNED)
    • FEDER/Spanish Ministry of Science, Innovation and Universities
    • FEDER/Spanish Ministry of Science, Innovation and Universities
    • Spanish Ministry of Science, Innovation and Universities
    • Junta de Andalucía/State Research Agency/CDTI

    Conference

    SPLC '24
    Acceptance Rates

    Overall Acceptance Rate 167 of 463 submissions, 36%
