Machine learning approach for software defect prediction using multi-core parallel computing

Parashar, Anshu; Kumar Goyal, Raman; Kaushal, Sakshi; Kumar Sahana, Sudip

doi:10.1007/s10515-022-00340-2

Published: 14 June 2022

Automated Software Engineering, volume 29, article number 44 (2022)

Anshu Parashar¹,
Raman Kumar Goyal¹,
Sakshi Kaushal² &
…
Sudip Kumar Sahana ORCID: orcid.org/0000-0002-2493-3695³

Abstract

Defect prediction in software development is a very active topic of study. Software defect prediction (SDP) identifies defect-prone source code artefacts, enabling quality assurance teams to allocate limited resources efficiently when validating software products. SDP tools will play an increasingly significant role in supporting developers and reducing the time to market for more dependable software products. Many machine learning approaches for SDP have been reported in the literature with the aim of improving the performance of software development teams. However, very little work has been reported on SDP using multi-core parallel computing. In this paper, a multi-core parallel machine learning approach for software defect prediction is proposed to classify a component as defective or non-defective. The proposed model has been built, trained and tested while varying the number of CPU cores involved in the processing. Extensive empirical studies have been conducted by applying the proposed approach to 11 software systems from NASA/PROMISE and other relevant repositories. The proposed approach has been compared with various state-of-the-art machine learning models to investigate whether it outperforms them. The experimental results indicate that the predictive performance of the proposed model improves, and execution time decreases, as a greater number of CPU cores is involved. Evaluation of the results shows that the multi-core parallel processing Random Forest approach gives the best predictive performance, with metric values of nearly 99 or 100%. Moreover, the proposed approach performs significantly better in accuracy, precision, recall, F-measure, and AUC than the other machine learning models.
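
The idea of training an ensemble classifier while varying the number of CPU cores can be illustrated with a minimal sketch. It is not the authors' implementation: it uses only the Python standard library, replaces full Random Forest trees with bagged decision stumps for brevity, and runs on synthetic data rather than the NASA/PROMISE datasets. All function names (`make_data`, `train_stump`, `train_ensemble`, `predict`) and parameter values are assumptions made for illustration.

```python
import random
import time
from concurrent.futures import ProcessPoolExecutor


def make_data(n_rows, n_features, seed):
    """Synthetic stand-in for a defect dataset (assumption, not real data).

    A row is labelled 1 ("defective") when the sum of its features is large.
    """
    rng = random.Random(seed)
    rows = [[rng.random() for _ in range(n_features)] for _ in range(n_rows)]
    labels = [1 if sum(r) > n_features / 2 else 0 for r in rows]
    return rows, labels


def train_stump(task):
    """Fit one decision stump on a bootstrap sample (one ensemble member)."""
    rows, labels, seed = task
    rng = random.Random(seed)
    n = len(rows)
    idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample of row indices
    feature = rng.randrange(len(rows[0]))       # one random feature per stump
    best_correct, best_t = -1, 0.5
    for t in [i / 20 for i in range(1, 20)]:
        # Simplification: the stump always predicts 1 when the feature exceeds t.
        correct = sum((rows[i][feature] > t) == bool(labels[i]) for i in idx)
        if correct > best_correct:
            best_correct, best_t = correct, t
    return feature, best_t


def train_ensemble(rows, labels, n_trees, n_workers):
    """Train n_trees stumps, spreading the work over n_workers processes."""
    tasks = [(rows, labels, seed) for seed in range(n_trees)]
    if n_workers == 1:
        return [train_stump(task) for task in tasks]  # sequential baseline
    # Each task is pickled and sent to a worker process, which adds overhead.
    with ProcessPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(train_stump, tasks))


def predict(stumps, row):
    """Majority vote over all stumps; returns 1 (defective) or 0."""
    votes = sum(1 for feature, t in stumps if row[feature] > t)
    return 1 if 2 * votes > len(stumps) else 0


if __name__ == "__main__":
    rows, labels = make_data(2000, 8, seed=0)
    for workers in (1, 2, 4):
        start = time.perf_counter()
        stumps = train_ensemble(rows, labels, n_trees=64, n_workers=workers)
        elapsed = time.perf_counter() - start
        hits = sum(predict(stumps, r) == y for r, y in zip(rows, labels))
        print(f"workers={workers} time={elapsed:.2f}s "
              f"train_accuracy={hits / len(rows):.3f}")
```

Because every ensemble member is trained independently, the work splits cleanly across processes. Measured timings depend on the machine, and for learners this small the cost of starting processes and copying data can outweigh the gain from extra cores.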

Author information

Authors and Affiliations

Computer Science and Engineering, Thapar Institute of Engineering and Technology, Patiala, India
Anshu Parashar & Raman Kumar Goyal
Computer Science and Engineering, University Institute of Engineering and Technology, Panjab University, Chandigarh, India
Sakshi Kaushal
Computer Science and Engineering, Birla Institute of Technology Mesra, Ranchi, India
Sudip Kumar Sahana

Corresponding author

Correspondence to Sudip Kumar Sahana.

About this article

Cite this article

Parashar, A., Kumar Goyal, R., Kaushal, S. et al. Machine learning approach for software defect prediction using multi-core parallel computing. Autom Softw Eng 29, 44 (2022). https://doi.org/10.1007/s10515-022-00340-2

Received: 25 August 2021
Accepted: 01 April 2022
