Abstract
Software defects are a persistent challenge in software development, and managing and resolving them effectively is vital for software reliability, a crucial quality attribute of any software system. Software defect prediction supported by machine learning (ML) methods is a promising way to address this problem. However, a common challenge in ML-based software defect prediction is class imbalance in the data. In this paper, we present an empirical study that assesses the impact of various class balancing methods on class imbalance in software defect prediction. We conducted a set of experiments involving nine class balancing methods and seven classifiers. We used datasets from the PROMISE repository that were provided by NASA software projects. We evaluated the effectiveness of the class balancing methods with several metrics: AUC, Accuracy, Precision, Recall, and the F1 measure. Furthermore, we applied hypothesis testing to determine whether the metric results differ significantly between datasets with balanced and unbalanced classes. Based on our findings, we conclude that balancing the classes in software defect prediction yields significant improvements in overall performance. Therefore, we strongly advocate including class balancing as a preprocessing step in this domain.
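As a rough illustration of the kind of preprocessing step the abstract argues for, the sketch below balances a toy dataset by random oversampling and shows why Accuracy alone can mislead on imbalanced data. It is a minimal sketch under stated assumptions: random oversampling is used only because it is simple and is not claimed to be one of the nine methods evaluated in the study, and the function names (`random_oversample`, `precision_recall_f1`) and the toy data are hypothetical, not taken from the paper.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen rows of each smaller class until every
    class has as many rows as the largest one (illustrative helper)."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        pool = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            X_out.append(rng.choice(pool))
            y_out.append(label)
    return X_out, y_out


def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, Recall and F1 for the positive (defective) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Toy data (made up): 8 non-defective modules (0), 2 defective ones (1).
X = [[i] for i in range(10)]
y = [0] * 8 + [1] * 2

# Balance the classes; in practice this would be applied to training data only.
X_bal, y_bal = random_oversample(X, y)
balanced_counts = Counter(y_bal)

# A trivial model that always predicts "non-defective" on the original data
# reaches high Accuracy while finding no defects at all.
y_pred_majority = [0] * len(y)
accuracy = sum(t == p for t, p in zip(y, y_pred_majority)) / len(y)
precision, recall, f1 = precision_recall_f1(y, y_pred_majority)
```

On this toy data the always-negative predictor scores well on Accuracy but has zero Recall for the defective class, which is one reason to report Precision, Recall, and F1 alongside Accuracy when classes are imbalanced.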




Funding
This work was supported by ongoing institutional funding. No additional grants to carry out or direct this particular research were obtained.
Ethics declarations
The authors of this work declare that they have no conflicts of interest.
Additional information
Publisher’s Note.
Pleiades Publishing remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
AI tools may have been used in the translation or editing of this article.
About this article
Cite this article
Sánchez-García, Á.J., Limón, X., Domínguez-Isidro, S. et al. Class Balancing Approaches to Improve for Software Defect Prediction Estimations: A Comparative Study. Program Comput Soft 50, 621–647 (2024). https://doi.org/10.1134/S036176882470066X
DOI: https://doi.org/10.1134/S036176882470066X