Machine learning approach for software defect prediction using multi-core parallel computing

Automated Software Engineering

Abstract

Defect prediction is an active research topic in software development. Software defect prediction (SDP) identifies defect-prone source code artefacts, enabling quality assurance teams to allocate limited resources efficiently when validating software products. To support developers and reduce the time to market for more dependable software products, software defect prediction tools will play an increasingly significant role. Many machine learning approaches for SDP exist in the literature, aiming to improve the performance of software development teams. However, very little work has been reported on SDP using multi-core parallel computing. In this paper, a multi-core parallel machine learning approach for software defect prediction is proposed to classify a component as defective or non-defective. The proposed model has been built, trained and tested while varying the number of CPU cores involved in the processing. Extensive empirical studies have been conducted by applying the proposed approach to 11 software systems from NASA/PROMISE and other relevant repositories. The proposed approach has been compared with various state-of-the-art machine learning models to assess how it performs relative to them. The experimental results indicate that the predictive performance of the proposed model improves, and execution time decreases, as more CPU cores are involved. Evaluation of the results shows that the multi-core parallel Random Forest approach gives the best predictive performance, with metric values of nearly 99 or 100%. Moreover, the proposed approach performs significantly better in accuracy, precision, recall, F-measure, and AUC than the other machine learning models.
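The experimental setup described in the abstract (a Random Forest trained while varying the number of CPU cores, then scored on accuracy, precision, recall, F-measure, and AUC) can be sketched as follows. This is a minimal illustration and not the authors' implementation. It assumes scikit-learn, whose `n_jobs` parameter controls how many cores build the trees, and it uses a synthetic dataset from `make_classification` as a stand-in for a defect dataset. The helper name `evaluate_with_cores`, the forest size, and the core counts (1, 2, 4) are hypothetical choices made for the example.

```python
import time

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split


def evaluate_with_cores(n_jobs, X_train, X_test, y_train, y_test):
    """Train a Random Forest on n_jobs cores; return fit time and metrics."""
    # Hypothetical settings: 200 trees and a fixed seed for repeatability.
    model = RandomForestClassifier(n_estimators=200, n_jobs=n_jobs,
                                   random_state=0)
    start = time.perf_counter()
    model.fit(X_train, y_train)
    elapsed = time.perf_counter() - start

    y_pred = model.predict(X_test)
    # Probability of the positive (defective) class, needed for AUC.
    y_score = model.predict_proba(X_test)[:, 1]
    return {
        "n_jobs": n_jobs,
        "fit_seconds": elapsed,
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred, zero_division=0),
        "recall": recall_score(y_test, y_pred),
        "f_measure": f1_score(y_test, y_pred),
        "auc": roc_auc_score(y_test, y_score),
    }


# Synthetic stand-in for a defect dataset (NOT NASA/PROMISE data):
# rows play the role of components, columns of code metrics, and
# label 1 means "defective". Roughly 20% of rows are defective.
X, y = make_classification(n_samples=2000, n_features=20, n_informative=8,
                           weights=[0.8, 0.2], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# Repeat the same experiment with 1, 2 and 4 cores.
results = [evaluate_with_cores(n, X_train, X_test, y_train, y_test)
           for n in (1, 2, 4)]
for r in results:
    print(r)
```

On a dataset this small the timing differences may be negligible or noisy, so a larger dataset or more trees would be needed to see a clear speed-up. Note also that with a fixed `random_state` scikit-learn builds the same forest regardless of `n_jobs`, so this sketch would be expected to show differences in fit time only, not in the predictive metrics.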

Author information

Corresponding author

Correspondence to Sudip Kumar Sahana.

About this article

Cite this article

Parashar, A., Kumar Goyal, R., Kaushal, S. et al. Machine learning approach for software defect prediction using multi-core parallel computing. Autom Softw Eng 29, 44 (2022). https://doi.org/10.1007/s10515-022-00340-2

  • DOI: https://doi.org/10.1007/s10515-022-00340-2
