Open access

Pragmatic Random Sampling of the Linux Kernel: Enhancing the Randomness and Correctness of the conf Tool

Authors:

David Fernandez-Amoros,

Jose Miguel Horcas Aguilera,

José A. Galindo,

David Benavides,

Lidia Fuentes

SPLC '24: Proceedings of the 28th ACM International Systems and Software Product Line Conference

Pages 24 - 35

https://doi.org/10.1145/3646548.3672586

Published: 02 September 2024

Abstract

The configuration space of some systems is so large that it cannot be computed. This is the case with the Linux Kernel, which provides almost 19,000 configurable options described across more than 1,600 files in the Kconfig language. As a result, many analyses of the Kernel rely on sampling its configuration space (e.g., debugging compilation errors, predicting configuration performance, or finding the configuration that optimizes specific performance metrics). The Kernel can be sampled pragmatically, with its built-in tool conf, or idealistically, by translating the Kconfig files into logic formulas. The idealistic approach provides statistical guarantees for the sampled configurations, but it raises many challenging problems that have not yet been solved, such as scalability issues. This paper introduces a new version of conf called randconfig+, which incorporates a series of improvements that increase the randomness and correctness of pragmatic sampling and also help validate the Boolean translation required for the idealistic approach. randconfig+ has been tested on 20,000 configurations generated for 10 different Kernel versions from 2003 to the present day. The experimental results show that randconfig+ is compatible with all tested Kernel versions, guarantees the correctness of the generated configurations, and increases conf’s randomness for numeric and string options.
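
To give a rough sense of why such a configuration space cannot be enumerated, the sketch below computes how many decimal digits a naive upper bound on its size would have. It assumes, purely for illustration, that every option is an independent Boolean; real Kconfig options also include tristate, numeric, and string types and are tied together by dependencies, so this is not a figure from the paper. The function name is hypothetical.

```python
import math


def naive_space_size_digits(num_boolean_options: int) -> int:
    """Return the number of decimal digits of 2**num_boolean_options.

    Assumption (illustrative only): every option is an independent Boolean,
    so the naive upper bound on the number of configurations is 2**n.
    """
    # digits(2**n) = floor(n * log10(2)) + 1
    return math.floor(num_boolean_options * math.log10(2)) + 1


if __name__ == "__main__":
    # "Almost 19,000 configurable options" is the figure quoted in the abstract.
    print(naive_space_size_digits(19_000))
```

Under that simplifying assumption the bound has thousands of decimal digits, which is why analyses fall back on sampling rather than enumeration.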

Index Terms

Pragmatic Random Sampling of the Linux Kernel: Enhancing the Randomness and Correctness of the conf Tool
1. Software and its engineering
  1. Software creation and management
    1. Software development techniques
      1. Reusability
        Software product lines

Published In

SPLC '24: Proceedings of the 28th ACM International Systems and Software Product Line Conference

September 2024

103 pages

ISBN:9798400705939

DOI:10.1145/3646548

Copyright © 2024 Owner/Author.

This work is licensed under a Creative Commons Attribution International 4.0 License.

Sponsors

SIGSOFT: ACM Special Interest Group on Software Engineering

Publisher

Association for Computing Machinery

New York, NY, United States

Badges

Artifacts Available / v1.1

Qualifiers

Research-article
Research
Refereed limited

Funding Sources

FEDER/Spanish Ministry of Science, Innovation and Universities
Universidad Nacional de Educación a Distancia (UNED)
FEDER/Spanish Ministry of Science, Innovation and Universities
FEDER/Spanish Ministry of Science, Innovation and Universities
Spanish Ministry of Science, Innovation and Universities
Junta de Andalucía/State Research Agency/CDTI

Conference

SPLC '24

Sponsor:

SIGSOFT

SPLC '24: 28th ACM International Systems and Software Product Line Conference

September 2 - 6, 2024

Dommeldange, Luxembourg

Acceptance Rates

Overall Acceptance Rate 167 of 463 submissions, 36%
