Abstract
Machine learning has advanced rapidly in recent years, and research on software defect prediction in software engineering increasingly adopts machine learning algorithms. This article presents a systematic literature review of defect prediction. First, it traces the development of defect prediction from correlation to prediction models. Then it examines the development of cross-project defect prediction based on machine learning algorithms (naive Bayes, decision tree, random forest, neural network, etc.). Finally, it discusses the research difficulties and future directions of software defect prediction, such as class imbalance, the cost of data labeling, and differences in data distribution across projects.
Acknowledgement
This work was supported by the Macau Foundation (project number MF2012).
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Wang, B., Wang, W., Zhu, L., Liu, W. (2021). Research on Cross-Project Software Defect Prediction Based on Machine Learning. In: Zhou, W., Mu, Y. (eds) Advances in Web-Based Learning – ICWL 2021. ICWL 2021. Lecture Notes in Computer Science, vol 13103. Springer, Cham. https://doi.org/10.1007/978-3-030-90785-3_16
DOI: https://doi.org/10.1007/978-3-030-90785-3_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-90784-6
Online ISBN: 978-3-030-90785-3
eBook Packages: Computer Science, Computer Science (R0)