Software defect prediction model based on improved twin support vector machines

  • Application of soft computing
  • Soft Computing

Abstract

Software defect prediction helps ensure the quality of software development and reduce software maintenance costs. However, class imbalance can degrade the accuracy of defect classification and remains an open problem. We propose a software defect prediction model based on a twin support vector machine to address imbalanced data classification and improve prediction performance. The model obtains class structure information through clustering and embeds the within-class structure of the training samples into the objective function as a regularization term, thereby taking into account the structural information hidden in the data. Moreover, by introducing within-class structure information to maximize the within-class distances and one class intervals, the model produces a superior classification hyperplane and enhances the generalization ability of the support vector machine. The experimental results demonstrate that, compared with existing algorithms, the proposed algorithm achieves higher prediction accuracy, more robust adaptability, and better performance in classifying imbalanced data.
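
The general idea of a twin support vector machine with a clustering-derived structural regularizer can be sketched as follows. This is a minimal illustration under stated assumptions, not the formulation used in the article: it swaps in the least-squares twin SVM (closed-form solve instead of a quadratic program), uses a tiny k-means as the clustering step, penalizes within-cluster scatter in the usual structural-SVM manner, and normalizes each loss term by class size as a simple imbalance heuristic. All function names and parameters (`fit_plane`, `c_struct`, `ridge`, and so on) are hypothetical.

```python
import numpy as np

def kmeans_labels(X, k, iters=20, seed=0):
    """Tiny k-means, used only to recover within-class clusters."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def within_class_scatter(X, k):
    """Sum of the covariance matrices of the k clusters found in one class."""
    labels = kmeans_labels(X, k)
    S = np.zeros((X.shape[1], X.shape[1]))
    for j in range(k):
        C = X[labels == j]
        if len(C) > 1:
            D = C - C.mean(axis=0)
            S += D.T @ D / len(C)
    return S

def fit_plane(own, other, scatter, c=1.0, c_struct=0.1, ridge=1e-6):
    """Least-squares twin-SVM plane: close to `own`, at value -1 on `other`.

    Minimizes ||H z||^2 / (2 n_own) + (c_struct / 2) w' S w
              + c ||G z + e||^2 / (2 n_other) + (ridge / 2) ||z||^2,
    with z = [w; b]. Size normalization is an assumption of this sketch.
    """
    H = np.hstack([own, np.ones((len(own), 1))])
    G = np.hstack([other, np.ones((len(other), 1))])
    J = np.zeros((H.shape[1], H.shape[1]))
    J[:-1, :-1] = scatter  # the structural term acts on w only, not on b
    M = (H.T @ H / len(own) + c_struct * J
         + c * G.T @ G / len(other) + ridge * np.eye(H.shape[1]))
    z = -c * np.linalg.solve(M, G.T @ np.ones(len(other)) / len(other))
    return z[:-1], z[-1]

def predict(X, planes):
    """Assign each sample to the class whose plane is nearer."""
    (w1, b1), (w2, b2) = planes
    d1 = np.abs(X @ w1 + b1) / np.linalg.norm(w1)
    d2 = np.abs(X @ w2 + b2) / np.linalg.norm(w2)
    return np.where(d1 <= d2, 0, 1)

# Synthetic imbalanced data: 200 "clean" modules (0) vs 20 "defective" (1).
rng = np.random.default_rng(0)
A = rng.normal(0.0, 1.0, size=(200, 2))
B = rng.normal(4.0, 1.0, size=(20, 2))
X = np.vstack([A, B])
y = np.array([0] * len(A) + [1] * len(B))

planes = (fit_plane(A, B, within_class_scatter(A, 2)),
          fit_plane(B, A, within_class_scatter(B, 2)))
pred = predict(X, planes)
acc = float((pred == y).mean())
minority_recall = float((pred[y == 1] == 1).mean())
print(f"accuracy={acc:.3f}, minority recall={minority_recall:.3f}")
```

Each class gets its own hyperplane, and a sample is assigned to the class whose plane is nearer. The data here are synthetic Gaussian blobs chosen only to make the script runnable; they say nothing about the performance reported in the article.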

Data availability

Inquiries about data availability should be directed to the authors.

Funding

This work was supported by the National Natural Science Foundation of Guangxi (No. 2022GXNSFAA035552, 2021GXNSFAA220114), the Guangxi University Young Teachers Foundation Competence Improvement Project (No. 2021KY0592), Natural Science Foundation of China (No. 12261096), Guangxi Natural Science Foundation (No. 2020GXNSFAA159155), and the Natural Science Foundation of Yulin City of China (No. 202125001).

Author information

Corresponding author

Correspondence to Jie Lei.

Ethics declarations

Conflicts of interest

All authors declare no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Liu, J., Lei, J., Liao, Z. et al. Software defect prediction model based on improved twin support vector machines. Soft Comput 27, 16101–16110 (2023). https://doi.org/10.1007/s00500-023-07984-6

  • DOI: https://doi.org/10.1007/s00500-023-07984-6
