Heterogeneous stacked ensemble classifier for software defect prediction

Goyal, Somya; Bhatia, Pradeep Kumar

doi:10.1007/s11042-021-11488-6

1211: AIoT Support and Applications with Multimedia
Published: 12 September 2021

Volume 81, pages 37033–37055, (2022)
Multimedia Tools and Applications

Somya Goyal^1,2 & Pradeep Kumar Bhatia^2

Abstract

Software defect prediction (SDP) helps ensure that software meets quality standards by highlighting the modules that are prone to errors, so that testing effort can be focused on them. The class-imbalanced nature of defect datasets makes it hard for defect predictors to classify the buggy modules correctly. Here, we introduce a novel heterogeneous ensemble classifier, built with the stacking methodology, to overcome this problem of imbalanced datasets and thereby improve prediction power significantly. The stacked ensemble uses the best-known classifiers from the SDP literature as base classifiers: an artificial neural network, a nearest-neighbor classifier, a tree-based classifier, a Bayesian classifier and a support vector machine. Five public datasets from the NASA corpus are used for the experimental work. The proposed heterogeneous stacking-based ensemble method is compared with the base classifiers and with state-of-the-art ensemble-based SDP models on the evaluation criteria of ROC, AUC and accuracy. The proposed classifier outperforms the base classifiers by 12% in terms of AUC score and by 8% in terms of accuracy. It improves on state-of-the-art ensemble methods by 4% in terms of AUC score and by 9% in terms of accuracy. The comparative analysis indicates that the proposed SDP classifier is statistically the best performer among the candidate SDP classifiers.
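The stacking arrangement described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes scikit-learn, a synthetic imbalanced dataset standing in for the NASA data, a logistic-regression meta-learner, and arbitrary hyperparameters, none of which are specified in the abstract.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Assumption: synthetic stand-in for a defect dataset, about 15% "buggy" modules.
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=8,
    weights=[0.85, 0.15], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# The five heterogeneous base-classifier families named in the abstract.
# Hyperparameters are illustrative assumptions, not values from the paper.
base_learners = [
    ("ann", make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000, random_state=0),
    )),
    ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))),
    ("tree", DecisionTreeClassifier(max_depth=5, random_state=0)),
    ("bayes", GaussianNB()),
    ("svm", make_pipeline(StandardScaler(), SVC(probability=True, random_state=0))),
]

# Assumption: logistic regression as the meta-learner. It is trained on
# out-of-fold class probabilities produced by the base learners (cv=5).
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
    stack_method="predict_proba",
)
stack.fit(X_train, y_train)

# Evaluate with two of the criteria the abstract mentions: AUC and accuracy.
proba = stack.predict_proba(X_test)[:, 1]
auc = roc_auc_score(y_test, proba)
acc = accuracy_score(y_test, stack.predict(X_test))
print(f"AUC = {auc:.3f}, accuracy = {acc:.3f}")
```

The split is stratified so that the minority (defective) class is represented in both the training and test sets. On imbalanced data, AUC is generally more informative than accuracy, since a classifier that predicts "not buggy" for every module can still reach high accuracy.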

Author information

Authors and Affiliations

Manipal University Jaipur, Jaipur, Rajasthan, 303007, India
Somya Goyal
Guru Jambheshwar University of Science & Technology, Hisar, Haryana, 125001, India
Somya Goyal & Pradeep Kumar Bhatia

Corresponding author

Correspondence to Somya Goyal.

Ethics declarations

Conflict of interest

The authors declare no conflicts of interest or competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Goyal, S., Bhatia, P.K. Heterogeneous stacked ensemble classifier for software defect prediction. Multimed Tools Appl 81, 37033–37055 (2022). https://doi.org/10.1007/s11042-021-11488-6

Received: 23 September 2020
Revised: 01 June 2021
Accepted: 19 August 2021
Published: 12 September 2021
Issue Date: November 2022
DOI: https://doi.org/10.1007/s11042-021-11488-6
