research-article

A machine learning model to classify the feature model maintainability

Authors:
Publio Silva, Federal University of Ceara, Quixada, Brazil
Carla I. M. Bezerra, Federal University of Ceara, Quixada, Brazil
Ivan Machado, Federal University of Bahia, Salvador, Brazil

SPLC '21: Proceedings of the 25th ACM International Systems and Software Product Line Conference - Volume A
September 2021, Pages 35–45
https://doi.org/10.1145/3461001.3471152

Published: 06 September 2021

ABSTRACT

Software Product Lines (SPLs) are generally specified using a Feature Model (FM), an artifact designed in the early stages of the SPL development life cycle. This artifact can quickly become too complex, which makes the SPL challenging to maintain. Therefore, it is essential to evaluate the artifact's maintainability continuously. The literature offers some approaches that evaluate FM maintainability by aggregating maintainability measures. Machine Learning (ML) models can be used to build such approaches, because they aggregate the values of independent variables into a single target value, also called the dependent variable. Moreover, white-box ML models make it possible to interpret and explain the model's results. This work proposes white-box ML models that classify FM maintainability based on 15 measures. To build the models, we performed the following steps: (i) we compared two approaches for evaluating FM maintainability using a human-based oracle of FM maintainability classifications; (ii) we used the best approach to pre-classify the ML training dataset; (iii) we generated three ML models and compared them in terms of classification accuracy, precision, recall, F1, and AUC-ROC; and (iv) we used the best model to create a mechanism capable of providing improvement indicators to domain engineers. The best model used the decision tree algorithm and obtained an accuracy, precision, and recall of 0.81, an F1-score of 0.79, and an AUC-ROC of 0.91. Using this model, we could reduce the number of measures needed to evaluate FM maintainability from 15 to 9.
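To make the evaluation vocabulary in the abstract concrete, the following is a minimal sketch, not the authors' model. It assumes a hand-written white-box rule set over three hypothetical measures (num_features, num_constraints, tree_depth), made-up thresholds, and made-up oracle labels, and it assumes macro averaging for precision, recall, and F1 (the abstract does not say which averaging was used). AUC-ROC is omitted because it needs class scores rather than hard labels.

```python
from collections import Counter


def classify(fm):
    """Hypothetical white-box rules; measure names and thresholds are made up."""
    if fm["num_features"] <= 20:
        return "good"
    if fm["num_constraints"] <= 5:
        return "moderate"
    return "poor"


def evaluate(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F1 (assumed averaging)."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed

    def safe_div(a, b):
        return a / b if b else 0.0

    precisions = [safe_div(tp[c], tp[c] + fp[c]) for c in labels]
    recalls = [safe_div(tp[c], tp[c] + fn[c]) for c in labels]
    f1s = [safe_div(2 * p * r, p + r) for p, r in zip(precisions, recalls)]
    n = len(labels)
    return {
        "accuracy": safe_div(sum(tp.values()), len(y_true)),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy feature models: made-up measure values paired with made-up oracle labels.
models = [
    ({"num_features": 10, "num_constraints": 2, "tree_depth": 3}, "good"),
    ({"num_features": 15, "num_constraints": 9, "tree_depth": 4}, "good"),
    ({"num_features": 40, "num_constraints": 3, "tree_depth": 5}, "moderate"),
    ({"num_features": 60, "num_constraints": 12, "tree_depth": 7}, "poor"),
    ({"num_features": 35, "num_constraints": 8, "tree_depth": 6}, "moderate"),
]
y_true = [label for _, label in models]
y_pred = [classify(fm) for fm, _ in models]
scores = evaluate(y_true, y_pred)
print(scores)
```

Note that the toy rule set never reads tree_depth. A tree-shaped model only needs the measures that appear in its decision nodes, which is one plausible reading of how a decision tree can lower the number of measures required for classification.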



Published in
SPLC '21: Proceedings of the 25th ACM International Systems and Software Product Line Conference - Volume A
September 2021
239 pages
ISBN: 9781450384698
DOI: 10.1145/3461001
General Chairs:
Mohammad Reza Mousavi, King's College London, UK
Pierre-Yves Schobbens, University of Namur, Belgium
Copyright © 2021 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
feature model
machine learning
quality evaluation
software product line
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 167 of 463 submissions, 36%