Research Article
DOI: 10.1145/3530019.3531333

Empirical Investigation of the Role of Meta-learning Approaches for the Improvement of Software Development Process via Software Fault Prediction

Published: 13 June 2022

Abstract

Context: The Software Engineering (SE) community has empirically investigated software defect prediction as a process-improvement activity for assuring software quality. In software fault prediction, the performance of classification algorithms is strongly degraded by residual noise arising from feature irrelevance and data redundancy. Problem: Meta-learning-based ensemble methods are commonly applied to mitigate these noise effects and boost fault prediction performance. However, the performance of meta-learning ensemble methods as fault predictors still needs to be benchmarked to support software quality control and aid developers in their decision making. Method: We conducted an empirical, comparative study to evaluate and benchmark the improvement in fault prediction performance achieved by meta-learning ensemble methods over their component base-level fault predictors. We performed a series of experiments with four well-known meta-level ensemble methods, Vote, StackingC (i.e., Stacking), MultiScheme, and Grading, and five high-performing base-level fault predictors: Logistic (i.e., Logistic Regression), J48 (i.e., Decision Tree), IBk (i.e., k-nearest neighbor), NaiveBayes, and DecisionTable (DT). The experiments were run on public defect datasets with k-fold (k = 10) cross-validation, performance was measured with the F-measure and ROC-AUC (Receiver Operating Characteristic, Area Under the Curve), and four non-parametric statistical tests were applied to benchmark the fault prediction results of the meta-learning ensemble methods. Results and Conclusion: We conclude that meta-learning ensemble methods, especially Vote, can outperform their base-level fault predictors and mitigate feature irrelevance and redundancy issues in software fault prediction. However, their performance depends strongly on the number of base-level classifiers and on the set of software fault prediction metrics used.
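The classifier names in the abstract (Vote, StackingC, MultiScheme, Grading, Logistic, J48, IBk, NaiveBayes, DecisionTable) match those in the WEKA toolkit, so the evaluation pipeline can be sketched with the WEKA Java API. The sketch below is illustrative rather than the authors' code: the dataset path `datasets/defects.arff` is a placeholder, the class attribute is assumed to be the last one, all learners are left at WEKA defaults, and only the Vote ensemble is shown, whereas the paper compares four meta-level ensembles across several public defect datasets.

```java
import java.util.Random;
import weka.classifiers.Classifier;
import weka.classifiers.Evaluation;
import weka.classifiers.bayes.NaiveBayes;
import weka.classifiers.functions.Logistic;
import weka.classifiers.lazy.IBk;
import weka.classifiers.meta.Vote;
import weka.classifiers.rules.DecisionTable;
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class VoteFaultPredictionSketch {
    public static void main(String[] args) throws Exception {
        // Load a public defect dataset in ARFF format (placeholder path).
        Instances data = DataSource.read("datasets/defects.arff");
        // Assume the fault/no-fault label is the last attribute.
        data.setClassIndex(data.numAttributes() - 1);

        // The five base-level fault predictors named in the study.
        Classifier[] baseLearners = {
            new Logistic(), new J48(), new IBk(),
            new NaiveBayes(), new DecisionTable()
        };

        // Meta-level Vote ensemble over the base learners
        // (WEKA's default combination rule averages class probabilities).
        Vote vote = new Vote();
        vote.setClassifiers(baseLearners);

        // 10-fold cross-validation, as in the study.
        Evaluation eval = new Evaluation(data);
        eval.crossValidateModel(vote, data, 10, new Random(1));

        // Report the two performance measures used in the paper,
        // for class index 1 (assumed to be the faulty class).
        System.out.printf("F-measure: %.3f%n", eval.fMeasure(1));
        System.out.printf("ROC-AUC:   %.3f%n", eval.areaUnderROC(1));
    }
}
```

For reference, `fMeasure(1)` reports the harmonic mean of precision and recall for the faulty class, F = 2PR / (P + R), while `areaUnderROC(1)` reports the ROC-AUC; comparing these values against each base learner evaluated alone reproduces the kind of base-level-versus-meta-level comparison the study performs.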


Cited By

An extensive study of the effects of different deep learning models on code vulnerability detection in Python code. Automated Software Engineering 31:1 (31 January 2024). https://doi.org/10.1007/s10515-024-00413-4


Published In

EASE '22: Proceedings of the 26th International Conference on Evaluation and Assessment in Software Engineering
June 2022, 466 pages
ISBN: 9781450396134
DOI: 10.1145/3530019

Publisher

Association for Computing Machinery, New York, NY, United States


Author Tags

  1. Classification
  2. Ensemble method
  3. Fault Prediction
  4. Metrics
  5. Performance

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

EASE 2022

Acceptance Rates

Overall Acceptance Rate 71 of 232 submissions, 31%


