DOI: 10.1145/3579027.3608973
Research article

Large Language Models to generate meaningful feature model instances

Published: 28 August 2023

Abstract

Feature models are the de facto standard for representing variability in software-intensive systems. Automated analysis of feature models is the computer-aided extraction of information from feature models and is used in testing, maintenance, configuration, and derivation, among other tasks. Testing these analyses often requires a large number of feature models that are as realistic as possible. Several proposals exist for generating synthetic feature models using random techniques or metamorphic relations; however, existing methods do not account for the semantics of the domain concepts being represented or the interrelations between them, which leads to less realistic feature models. In this paper, we propose a novel approach that uses Large Language Models (LLMs), such as Codex or GPT-3, to generate realistic feature models that preserve semantic coherence while maintaining syntactic validity. The approach automatically generates feature model instances for a given domain. Concretely, two language models are used: first, OpenAI's Codex generates new feature model instances in the Universal Variability Language (UVL) syntax; then, Cohere's semantic analysis verifies whether the newly introduced concepts belong to the same domain. With this approach, 90% of the generated instances were valid according to the UVL syntax. In addition, the valid models score well on model complexity metrics, and the generated features mirror the domain of the original UVL instance used as a prompt. With this work, we envision a new thread of research in which variability is generated and analyzed using LLMs, opening the door to a new generation of techniques and tools for variability management.
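
To make the two-stage pipeline from the abstract concrete, below is a minimal sketch, assuming the pre-1.0 openai Python client (the Completion API that served Codex) and Cohere's classic embed endpoint. The model name, prompt wording, seed UVL model, 0.5 similarity threshold, and the helper names generate_uvl and same_domain are illustrative assumptions, not the authors' exact setup.

# Sketch of the generate-then-verify pipeline described in the abstract.
# Assumptions: pre-1.0 openai client (Completion API for Codex) and the
# classic cohere client; model names, prompt, seed model, and the 0.5
# threshold are illustrative, not the paper's exact configuration.
import numpy as np
import openai
import cohere

openai.api_key = "YOUR_OPENAI_KEY"     # placeholder
co = cohere.Client("YOUR_COHERE_KEY")  # placeholder

# Toy seed feature model in UVL syntax, used as the prompt.
SEED_UVL = """features
    Smartwatch
        mandatory
            Screen
                alternative
                    Analog
                    "High Resolution"
        optional
            GPS
constraints
    GPS => "High Resolution"
"""

def generate_uvl(seed: str) -> str:
    """Step 1: ask a code model to extend a UVL feature model."""
    resp = openai.Completion.create(
        engine="code-davinci-002",  # Codex code model (assumption)
        prompt="Extend this UVL feature model with new features:\n" + seed,
        max_tokens=256,
        temperature=0.7,
    )
    return seed + resp["choices"][0]["text"]

def same_domain(seed_features, new_features, threshold=0.5):
    """Step 2: embed feature names and compare each new one against the
    centroid of the seed features by cosine similarity."""
    vecs = np.array(co.embed(texts=seed_features + new_features).embeddings)
    seed_vecs = vecs[: len(seed_features)]
    new_vecs = vecs[len(seed_features):]
    centroid = seed_vecs.mean(axis=0)
    sims = new_vecs @ centroid / (
        np.linalg.norm(new_vecs, axis=1) * np.linalg.norm(centroid)
    )
    return bool((sims >= threshold).all())

In the paper's pipeline, a generated model is additionally parsed against the UVL grammar to check syntactic validity (the reported 90% figure); the sketch above covers only generation and the domain-coherence check.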




Published In

SPLC '23: Proceedings of the 27th ACM International Systems and Software Product Line Conference - Volume A
August 2023
305 pages
ISBN: 9798400700910
DOI: 10.1145/3579027
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].


Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. deep learning
  2. large language models
  3. synthetic models
  4. universal variability language

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

  • Junta de Andalucía
  • FEDER/Ministry of Science and Innovation

Conference

SPLC '23

Acceptance Rates

Overall acceptance rate: 167 of 463 submissions (36%)


Article Metrics

  • Downloads (last 12 months): 186
  • Downloads (last 6 weeks): 14
Reflects downloads up to 05 Mar 2025.


Cited By

  • (2025) Large Language Models (LLMs) for Smart Manufacturing and Industry X.0. In: Artificial Intelligence for Smart Manufacturing and Industry X.0, 97-119. https://doi.org/10.1007/978-3-031-80154-9_5. Online publication date: 6-Mar-2025.
  • (2024) Generating Feature Models with UVL's Full Expressiveness. In: Proceedings of the 28th ACM International Systems and Software Product Line Conference, 61-65. https://doi.org/10.1145/3646548.3676602. Online publication date: 2-Sep-2024.
  • (2024) Not Quite There Yet: Remaining Challenges in Systems and Software Product Line Engineering as Perceived by Industry Practitioners. In: Proceedings of the 28th ACM International Systems and Software Product Line Conference, 179-190. https://doi.org/10.1145/3646548.3672587. Online publication date: 2-Sep-2024.
  • (2024) Variability Management for Large Language Model Tasks: Practical Insights from an Industrial Application. In: Proceedings of the 28th ACM International Systems and Software Product Line Conference, 148-152. https://doi.org/10.1145/3646548.3672581. Online publication date: 2-Sep-2024.
  • (2024) Automating Software Product Line Adoption Based on Feature Models Using Large Language Models. In: 2024 IEEE 29th International Conference on Emerging Technologies and Factory Automation (ETFA), 1-4. https://doi.org/10.1109/ETFA61755.2024.10710832. Online publication date: 10-Sep-2024.
  • (2024) FM Fact Label. Science of Computer Programming, 103214. https://doi.org/10.1016/j.scico.2024.103214. Online publication date: Sep-2024.
  • (2024) Exploring LLMs' Ability to Detect Variability in Requirements. In: Requirements Engineering: Foundation for Software Quality, 178-188. https://doi.org/10.1007/978-3-031-57327-9_11. Online publication date: 8-Apr-2024.
