Abstract
In recent years, software defect prediction has been recognized as a cost-sensitive learning problem. To deal with the unequal misclassification losses caused by different classification errors, several cost-sensitive dictionary learning methods have been proposed. These methods typically define misclassification costs to measure the unequal losses and then minimize a cost-sensitive reconstruction loss by embedding the cost information into the reconstruction function of dictionary learning. Although they achieve promising performance, their cost-sensitive reconstruction functions are not well designed. In addition, insufficient attention has been paid to the coding coefficients, which can also help reduce the reconstruction loss. To address these issues, this paper proposes a new cost-sensitive reconstruction loss function and introduces an additional cost-sensitive discrimination regularization for the coding coefficients. The two terms are jointly optimized in a unified cost-sensitive dictionary learning framework. By doing so, we can achieve the minimum reconstruction loss and thus obtain a more cost-sensitive dictionary for feature encoding of test data. We conducted extensive experiments on twenty-five software projects from four benchmark datasets: NASA, AEEEM, ReLink and Jureczko. The results, compared with those of ten state-of-the-art software defect prediction methods, demonstrate the effectiveness of the learned cost-sensitive dictionary for software defect prediction.
References
Menzies T, Greenwald J, Frank A (2006) Data mining static code attributes to learn defect predictors. IEEE Trans Softw Eng 33(1):2–13
Shepperd M, Bowes D, Hall T (2014) Researcher bias: the use of machine learning in software defect prediction. IEEE Trans Softw Eng 40(6):603–616
Li ZQ, Jing XY, Zhu XK (2018) Progress on approaches to software defect prediction. IET Softw 12(3):161–175
Boehm BW, Basili VR (2005) Foundations of empirical software engineering: the legacy of Victor R. Basili. Springer, Berlin
Boehm BW, Papaccio PN (1988) Understanding and controlling software costs. IEEE Trans Softw Eng 14(10):1462–1477
McCabe TJ (1976) A complexity measure. IEEE Trans Softw Eng 2(4):308–320
Halstead MH (1977) Elements of software science. Elsevier, North-Holland
Chidamber SR, Kemerer CF (1994) A metrics suite for object oriented design. IEEE Trans Softw Eng 20(6):476–493
Ma Y, Zhu S, Qin K, Luo G (2014) Combining the requirement information for software defect estimation in design time. Inf Process Lett 114(9):469–474
Jiang Y, Cukic B, Menzies T, Bartlow N (2008) Comparing design and code metrics for software quality prediction. In: Proceedings of the 4th international workshop on predictor models in software engineering, pp 11–18
Gray D, Bowes D, Davey N, Sun Y, Christianson B (2009) Using the support vector machine as a classification method for software defect prediction with static code metrics. In: Proceedings of international conference on engineering applications of neural networks, pp 223–234
Elish KO, Elish MO (2008) Predicting defect-prone software modules using support vector machines. J Syst Softw 81(5):649–660
Wang J, Shen B, Chen Y (2012) Compressed C4.5 models for software defect prediction. In: Proceedings of 12th international conference on quality software, pp 13–16
Khoshgoftaar TM, Seliya N (2002) Tree-based software quality estimation models for fault prediction. In: Proceedings of eighth IEEE symposium on software metrics, pp 203–214
Wang T, Li WH (2010) Naive Bayes software defect prediction model. In: Proceedings of 2010 international conference on computational intelligence and software engineering, pp 1–4
Amasaki S, Takagi Y, Mizuno O, Kikuno T (2003) A Bayesian belief network for assessing the likelihood of fault content. In: Proceedings of 14th international symposium on software reliability engineering, pp 215–226
Khoshgoftaar TM, Allen EB, Hudepohl JP, Aud SJ (1997) Application of neural networks to software quality modeling of a very large telecommunications system. IEEE Trans Neural Netw 8(4):902–909
Singh Y, Kaur A, Malhotra R (2008) Predicting software fault proneness model using neural network. In: Proceedings of international conference on product focused software process improvement, pp 204–214
Liu MX, Miao LS, Zhang DQ (2014) Two-stage cost-sensitive learning for software defect prediction. IEEE Trans Reliab 63(2):676–686
Yang M, Zhang L, Feng X, Zhang D (2011) Fisher discrimination dictionary learning for sparse representation. In: Proceedings of 2011 international conference on computer vision, pp 543–550
Liu HD, Yang M, Gao Y, Yin YL, Chen L (2014) Bilinear discriminative dictionary learning for face recognition. Pattern Recognit 47(5):1835–1845
Özakıncı R, Tarhan A (2018) Early software defect prediction: a systematic map and review. J Syst Softw 144:216–239
Rathore SS, Kumar S (2019) A study on software fault prediction techniques. Artif Intell Rev 51(2):255–327
Xu Z, Liu J, Luo X, Yang Z, Zhang Y, Yuan P, Zhang T (2019) Software defect prediction based on kernel PCA and weighted extreme learning machine. Inf Softw Technol 106:182–200
Zhang ZW, Jing XY, Wang TJ (2017) Label propagation based semi-supervised learning for software defect prediction. Automat Softw Eng 24(1):47–69
Kondo M, Bezemer CP, Kamei Y, Hassan AE, Mizuno O (2019) The impact of feature reduction techniques on defect prediction models. Empir Softw Eng 24(4):1925–1963
Yang X, Lo D, Xia X, Sun J (2017) TLEL: a two-layer ensemble learning approach for just-in-time defect prediction. Inf Softw Technol 87:206–220
Xu Z, Liu J, Luo X, Zhang T (2018) Cross-version defect prediction via hybrid active learning with kernel principal component analysis. In: Proceedings of IEEE 25th international conference on software analysis, evolution and reengineering (SANER), pp 209–220
Jing XY, Wu F, Dong X, Qi F, Xu B (2015) Heterogeneous cross-company defect prediction by unified metric representation and CCA-based transfer learning. In: Proceedings of the 10th joint meeting on foundations of software engineering, pp 496–507
Bennin KE, Keung JW, Monden A (2019) On the relative value of data resampling approaches for software defect prediction. Empir Softw Eng 24(2):602–636
He H, Garcia EA (2009) Learning from imbalanced data. IEEE Trans Knowl Data Eng 21(9):1263–1284
Wan JW, Yang M, Chen YJ (2015) Discriminative cost sensitive Laplacian score for face recognition. Neurocomputing 152(25):333–344
Wan JW, Wang HY, Yang M (2017) Cost sensitive semi-supervised canonical correlation analysis for multi-view dimensionality reduction. Neural Process Lett 45(2):411–430
Khoshgoftaar TM, Geleyn E, Nguyen L, Bullard L (2002) Cost-sensitive boosting in software quality modeling. In: Proceedings of international symposium on high assurance systems engineering, pp 51–60
Zheng J (2010) Cost-sensitive boosting neural networks for software defect prediction. Expert Syst Appl 37(6):4537–4543
Moser R, Pedrycz W, Succi G (2008) A comparative analysis of the efficiency of change metrics and static code attributes for defect prediction. In: Proceedings of 30th international conference on software engineering, pp 181–190
Yu J, Rui Y, Tao DC (2014) Click prediction for web image reranking using multimodal sparse coding. IEEE Trans Image Process 23(5):2019–2032
Yu J, Tan M, Zhang H, Tao DC, Rui Y (2019) Hierarchical deep click feature prediction for fine-grained image recognition. IEEE Trans Pattern Anal Mach Intell. https://doi.org/10.1109/TPAMI.2019.2932058
Wright J, Yang AY, Ganesh A, Sastry SS, Ma Y (2008) Robust face recognition via sparse representation. IEEE Trans Pattern Anal Mach Intell 31(2):210–227
Zhang GQ, Sun HJ, Ji ZX, Yuan YH, Sun QS (2016) Cost-sensitive dictionary learning for face recognition. Pattern Recognit 60:613–629
Wu F, Jing XY, Yue D (2017) Multi-view discriminant dictionary learning via learning view-specific and shared structured dictionaries for image classification. Neural Process Lett 45(2):649–666
Zhang Z, Sun Y, Wang Y, Zha Z, Yan SC, Wang M (2019) Convolutional dictionary pair learning network for image representation learning. arXiv:1912.12138
Liu H, Guo D, Sun F (2016) Object recognition using tactile measurements: kernel sparse coding methods. IEEE Trans Instrum Meas 65(3):656–665
Li Z, Zhang Z, Qin J, Zhang Z, Shao L (2019) Discriminative fisher embedding dictionary learning algorithm for object recognition. IEEE Trans Neural Netw Learn Syst 31(3):786–800
Shrivastava A, Patel VM, Chellappa R (2015) Non-linear dictionary learning with partially labeled data. Pattern Recognit 48(11):3283–3292
Shawe-Taylor J, Cristianini N (2004) Kernel methods for pattern analysis. Cambridge University Press, Cambridge
Jing XY, Ying S, Zhang ZW, Wu SS, Liu J (2014) Dictionary learning based software defect prediction. In: Proceedings of the 36th international conference on software engineering, pp 414–423
Wu F, Jing XY, Sun Y, Sun J, Huang L, Cui F, Sun Y (2018) Cross-project and within-project semisupervised software defect prediction: a unified approach. IEEE Trans Reliab 67(2):581–597
Wan JW, Yang M, Wang HY (2017) Cost sensitive matrix factorization for face recognition. In: Proceedings of intelligence data engineering and automated learning, pp 136–145
Wan JW, Yang M, Gao Y, Chen YJ (2014) Pairwise costs in semisupervised discriminant analysis for face recognition. IEEE Trans Inf Forensic Secur 9(10):1569–1580
Ting KM (2002) An instance-weighting method to induce cost-sensitive trees. IEEE Trans Knowl Data Eng 14(3):659–665
Duda RO, Hart PE, Stork DG (2012) Pattern classification. Wiley, Hoboken
Rosasco L, Verri A, Santoro M, Mosci S, Villa S (2009) Iterative projection methods for structured sparsity regularization. MIT Technical Reports, MIT-CSAIL-TR-2009-050, CBCL-282
Yang M, Zhang L, Yang J, Zhang D (2010) Metaface learning for sparse representation based face recognition. In: Proceedings of IEEE international conference on image processing, pp 1601–1604
Shepperd M, Song QB, Sun Z, Mair C (2013) Data quality: some comments on the NASA software defect datasets. IEEE Trans Softw Eng 39(9):1208–1215
D’Ambros M, Lanza M, Robbes R (2012) An extensive comparison of bug prediction approaches. In: Proceedings of IEEE working conference on mining software repositories, pp 31–41
Wu R, Zhang H, Kim S, Cheung SC (2011) ReLink: recovering links between bugs and changes. In: Proceedings of the 19th ACM SIGSOFT symposium and the 13th European conference on foundations of software engineering, pp 15–25
Jureczko M, Madeyski L (2010) Towards identifying software project clusters with regard to defect prediction. In: Proceedings of the 6th international conference on predictive models in software engineering, pp 1–10
Ji H, Huang S, Wu Y, Hui Z, Zheng C (2019) A new weighted naive Bayes method based on information diffusion for software defect prediction. Softw Qual J 27(3):923–968
Wan JW, Wang Y (2019) Cost-sensitive label propagation for semi-supervised face recognition. IEEE Trans Inf Forensic Secur 14(7):1729–1743
Xu Z, Li S, Luo X, Liu J, Zhang T, Tang Y, Keung J (2019) TSTSS: a two-stage training subset selection framework for cross version defect prediction. J Syst Softw 154:59–78
Elkan C (2001) The foundations of cost-sensitive learning. In: Proceedings of international joint conference on artificial intelligence, pp 973–978
Iman RL, Davenport JM (1980) Approximations of the critical region of the Friedman statistic. Commun Stat Theory Methods 9(6):571–595
Demsar J (2006) Statistical comparisons of classifiers over multiple data sets. J Mach Learn Res 7(1):1–30
Friedman M (1937) The use of ranks to avoid the assumption of normality implicit in the analysis of variance. J Am Stat Assoc 32(200):675–701
Nemenyi PB (1963) Distribution-free multiple comparisons. PhD Thesis, Princeton University, Princeton
Boyd S, Vandenberghe L (2004) Convex optimization. Cambridge University Press, Cambridge
Acknowledgements
The authors would like to thank the anonymous referees and the editors for their helpful comments and suggestions.
Additional information
This work was supported in part by National Natural Science Foundation of China under Grants 61502058, 61572085 and 61976028.
Appendices
Appendix A
In this section, we discuss the convexity of \(R(A^{(i)},C)\) with respect to \(A^{(i)}\). First, we define two constants: the matrix \(Z_i=I-\frac{1}{n_i}E_i^i\), where \(E_i^i\in {\mathbf {R}}^{n_i\times n_i}\) is the matrix with all entries equal to 1 and \(I\in {\mathbf {R}}^{n_i\times n_i}\) is the identity matrix, and the vector \({\mathbf {p}}_i=\frac{1}{n_i}{\mathbf {v}}_i\), where \({\mathbf {v}}_i=[1,\ldots ,1]^T\in {\mathbf {R}}^{n_i}\) is the all-ones vector.
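The role of these two constants can be made explicit with a short worked identity. It is a sketch under one assumption that is not stated in this appendix: that \(A^{(i)}\) has \(n_i\) columns, one coding-coefficient vector per sample of class \(i\). The symbols \({\mathbf {a}}_k^{(i)}\) (the \(k\)-th column of \(A^{(i)}\)) and \({\mathbf {m}}_i\) (the column mean) are names introduced here for illustration only.

```latex
% Assumption: A^{(i)} has n_i columns a_1^{(i)}, ..., a_{n_i}^{(i)}.
% m_i is a hypothetical name for their mean.
\[
E_i^i = \mathbf{v}_i \mathbf{v}_i^T, \qquad
A^{(i)} \mathbf{p}_i = \frac{1}{n_i} \sum_{k=1}^{n_i} \mathbf{a}_k^{(i)} = \mathbf{m}_i,
\]
\[
A^{(i)} Z_i
  = A^{(i)} - \frac{1}{n_i} A^{(i)} \mathbf{v}_i \mathbf{v}_i^T
  = A^{(i)} - \mathbf{m}_i \mathbf{v}_i^T .
\]
```

Under that assumption, multiplying by \({\mathbf {p}}_i\) yields the class mean of the coding coefficients, and multiplying by \(Z_i\) subtracts that mean from every column.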
With these two constants, we can rewrite \(R(A^{(i)},C)\) in Eq. (7) as follows
where
By the second-order condition for convexity, \(R(A^{(i)},C)\) is convex with respect to \(A^{(i)}\) if its Hessian matrix \(\nabla ^2R(A^{(i)},C)\) is positive definite [67]. Specifically, differentiating \(R(A^{(i)},C)\) in Eq. (15) twice with respect to \(A^{(i)}\) gives the Hessian matrix \(\nabla ^2R(A^{(i)},C)\) as follows
To prove that the matrix \(\nabla ^2R(A^{(i)},C)\) is positive definite, we substitute \(Z_i=I-\frac{1}{n_i}E_i^i\) and \({\mathbf {p}}_i=\frac{1}{n_i}{\mathbf {v}}_i\) into Eq. (17), and thus obtain
If the matrix \(\nabla ^2R(A^{(i)},C)\) is positive definite, all of its eigenvalues are greater than 0. Since the maximal eigenvalue of the matrix \(E_i^i\) is \(n_i\) [20], the positive definiteness of \(\nabla ^2R(A^{(i)},C)\) is guaranteed if
By simple derivation, we obtain the condition \(\eta >\frac{1}{n_i}\sum _{j=1}^c(C_{ij}+C_{ji})\), which guarantees that the matrix \(\nabla ^2R(A^{(i)},C)\) is positive definite and thus that \(R(A^{(i)},C)\) is convex with respect to \({A}^{(i)}\).
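The eigenvalue fact used in this argument can be checked numerically. The following is a minimal sketch, not part of the original derivation: it builds the all-ones matrix \(E_i^i\) and \(Z_i=I-\frac{1}{n_i}E_i^i\) for an arbitrarily chosen class size and inspects their spectra with NumPy. The function name `ones_and_centering` and the value `n = 5` are assumptions made purely for illustration.

```python
import numpy as np


def ones_and_centering(n):
    """Return E (n x n all-ones matrix) and Z = I - E / n.

    Hypothetical helper for illustration; n stands in for the class size n_i.
    """
    E = np.ones((n, n))
    Z = np.eye(n) - E / n
    return E, Z


n = 5  # illustrative class size, chosen arbitrarily
E, Z = ones_and_centering(n)

# Both matrices are symmetric, so eigvalsh applies; it returns the
# eigenvalues in ascending order.
eig_E = np.linalg.eigvalsh(E)
eig_Z = np.linalg.eigvalsh(Z)

# E = v v^T has rank one: its largest eigenvalue equals n, the rest are zero.
print("largest eigenvalue of E:", eig_E[-1])
# Z therefore has one zero eigenvalue and n - 1 eigenvalues equal to one,
# so it is positive semidefinite but not positive definite on its own.
print("eigenvalues of Z:", np.round(eig_Z, 6))
```

Because \(Z_i\) by itself has a zero eigenvalue, a sufficiently large \(\eta\) is what pushes all eigenvalues of the Hessian above zero, which is consistent with the lower bound on \(\eta\) stated above.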
Appendix B
Cite this article
Niu, L., Wan, J., Wang, H. et al. Cost-sensitive Dictionary Learning for Software Defect Prediction. Neural Process Lett 52, 2415–2449 (2020). https://doi.org/10.1007/s11063-020-10355-z
DOI: https://doi.org/10.1007/s11063-020-10355-z