DOI: 10.1145/3579027.3608973
Research article

Large Language Models to generate meaningful feature model instances

Published: 28 August 2023

Abstract

Feature models are the de facto standard for representing variability in software-intensive systems. Automated analysis of feature models is the computer-aided extraction of information from feature models and is used in testing, maintenance, configuration, and derivation, among other tasks. Testing these analyses often requires a large number of feature models that are as realistic as possible. Several proposals exist for generating synthetic feature models using random techniques or metamorphic relations; however, existing methods do not account for the semantics of the domain concepts being represented or the interrelations between them, which leads to less realistic feature models. In this paper, we propose a novel approach that uses Large Language Models (LLMs), such as Codex or GPT-3, to generate realistic feature models that preserve semantic coherence while maintaining syntactic validity. The approach automatically generates feature model instances for a given domain. Concretely, two language models are used: first, OpenAI's Codex generates new feature model instances in the Universal Variability Language (UVL) syntax; then, Cohere's semantic analysis verifies whether the newly introduced concepts belong to the same domain. With this approach, 90% of the generated instances were valid according to the UVL syntax. In addition, the valid models score well on model complexity metrics, and the generated features mirror the domain of the original UVL instance used as a prompt. With this work, we envision a new thread of research in which variability is generated and analyzed using LLMs, opening the door to a new generation of techniques and tools for variability management.
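
To make the two-stage pipeline from the abstract concrete, below is a minimal sketch, assuming the pre-1.0 openai Python client (the Completion API that served Codex) and Cohere's classic embed endpoint. The model name, prompt wording, seed UVL model, 0.5 similarity threshold, and the helper names generate_uvl and same_domain are illustrative assumptions, not the authors' exact setup.

# Sketch of the generate-then-verify pipeline described in the abstract.
# Assumptions: pre-1.0 openai client (Completion API for Codex) and the
# classic cohere client; model names, prompt, seed model, and the 0.5
# threshold are illustrative, not the paper's exact configuration.
import numpy as np
import openai
import cohere

openai.api_key = "YOUR_OPENAI_KEY"     # placeholder
co = cohere.Client("YOUR_COHERE_KEY")  # placeholder

# Toy seed feature model in UVL syntax, used as the prompt.
SEED_UVL = """features
    Smartwatch
        mandatory
            Screen
                alternative
                    Analog
                    "High Resolution"
        optional
            GPS
constraints
    GPS => "High Resolution"
"""

def generate_uvl(seed: str) -> str:
    """Step 1: ask a code model to extend a UVL feature model."""
    resp = openai.Completion.create(
        engine="code-davinci-002",  # Codex code model (assumption)
        prompt="Extend this UVL feature model with new features:\n" + seed,
        max_tokens=256,
        temperature=0.7,
    )
    return seed + resp["choices"][0]["text"]

def same_domain(seed_features, new_features, threshold=0.5):
    """Step 2: embed feature names and compare each new one against the
    centroid of the seed features by cosine similarity."""
    vecs = np.array(co.embed(texts=seed_features + new_features).embeddings)
    seed_vecs = vecs[: len(seed_features)]
    new_vecs = vecs[len(seed_features):]
    centroid = seed_vecs.mean(axis=0)
    sims = new_vecs @ centroid / (
        np.linalg.norm(new_vecs, axis=1) * np.linalg.norm(centroid)
    )
    return bool((sims >= threshold).all())

In the paper's pipeline, a generated model is additionally parsed against the UVL grammar to check syntactic validity (the reported 90% figure); the sketch above covers only generation and the domain-coherence check.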




Published In

SPLC '23: Proceedings of the 27th ACM International Systems and Software Product Line Conference - Volume A
August 2023
305 pages
ISBN: 9798400700910
DOI: 10.1145/3579027
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].


Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. deep learning
  2. large language models
  3. synthetic models
  4. universal variability language

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

  • Junta de Andalucía
  • FEDER/Ministry of Science and Innovation

Conference

SPLC '23

Acceptance Rates

Overall acceptance rate: 167 of 463 submissions (36%)


Article Metrics

  • Downloads (last 12 months): 186
  • Downloads (last 6 weeks): 14
Reflects downloads up to 05 Mar 2025.


Cited By

  • (2025) Large Language Models (LLMs) for Smart Manufacturing and Industry X.0. In: Artificial Intelligence for Smart Manufacturing and Industry X.0, 97-119. https://doi.org/10.1007/978-3-031-80154-9_5. Online publication date: 6-Mar-2025.
  • (2024) Generating Feature Models with UVL's Full Expressiveness. In: Proceedings of the 28th ACM International Systems and Software Product Line Conference, 61-65. https://doi.org/10.1145/3646548.3676602. Online publication date: 2-Sep-2024.
  • (2024) Not Quite There Yet: Remaining Challenges in Systems and Software Product Line Engineering as Perceived by Industry Practitioners. In: Proceedings of the 28th ACM International Systems and Software Product Line Conference, 179-190. https://doi.org/10.1145/3646548.3672587. Online publication date: 2-Sep-2024.
  • (2024) Variability Management for Large Language Model Tasks: Practical Insights from an Industrial Application. In: Proceedings of the 28th ACM International Systems and Software Product Line Conference, 148-152. https://doi.org/10.1145/3646548.3672581. Online publication date: 2-Sep-2024.
  • (2024) Automating Software Product Line Adoption Based on Feature Models Using Large Language Models. In: 2024 IEEE 29th International Conference on Emerging Technologies and Factory Automation (ETFA), 1-4. https://doi.org/10.1109/ETFA61755.2024.10710832. Online publication date: 10-Sep-2024.
  • (2024) FM Fact Label. Science of Computer Programming, 103214. https://doi.org/10.1016/j.scico.2024.103214. Online publication date: Sep-2024.
  • (2024) Exploring LLMs' Ability to Detect Variability in Requirements. In: Requirements Engineering: Foundation for Software Quality, 178-188. https://doi.org/10.1007/978-3-031-57327-9_11. Online publication date: 8-Apr-2024.
