
Multi-objective cross-version defect prediction

  • Methodologies and Application
  • Soft Computing

Abstract

Defect prediction models help software project teams spot defect-prone source files in software systems. Teams can then prioritize rigorous quality assurance (QA) activities on these predicted defect-prone files to minimize post-release defects and deliver quality software. Cross-version defect prediction builds a prediction model from a previous version of a software project to predict defects in the current version. This is more practical than the two other ways of building models, i.e., cross-project and cross-validation prediction models, because a previous version of the same project tends to have a similar metric distribution across files. In this paper, we formulate cross-version defect prediction as a multi-objective optimization problem with two objective functions: (a) maximizing recall while minimizing misclassification cost and (b) maximizing recall while minimizing the cost of QA activities on defect-prone files. The two multi-objective defect prediction models are compared with four traditional machine learning algorithms, namely logistic regression, naïve Bayes, decision tree and random forest. We use 11 projects from the PROMISE repository, comprising 41 different versions in total. Our findings show that multi-objective logistic regression is more cost-effective than the single-objective algorithms.
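
To make the bi-objective formulation concrete, the following minimal Python sketch (not the authors' implementation) scores candidate logistic-regression weight vectors on the two competing objectives, recall to maximize and QA cost to minimize, and keeps the non-dominated candidates as a Pareto front from which a model can be picked for the next version. The use of lines of code as the QA-cost proxy, the random candidate search standing in for a genetic algorithm, and the synthetic data are all assumptions made for illustration.

```python
# Minimal sketch, not the authors' implementation: the bi-objective
# defect-prediction formulation from the abstract, scored over candidate
# logistic-regression models. LOC as the QA-cost proxy and the synthetic
# data are assumptions made for illustration.
import numpy as np

def predict(weights, X):
    """Logistic model; True where a file is predicted defect-prone."""
    z = X @ weights[:-1] + weights[-1]        # last weight acts as the bias
    return 1.0 / (1.0 + np.exp(-z)) >= 0.5

def objectives(weights, X, y, loc):
    """(recall, qa_cost): recall is maximized, QA cost minimized.
    qa_cost sums the LOC of flagged files, a proxy for inspection effort."""
    flagged = predict(weights, X)
    recall = (flagged & (y == 1)).sum() / max((y == 1).sum(), 1)
    return recall, loc[flagged].sum()

def pareto_front(candidates, X, y, loc):
    """Candidates not dominated in (higher recall, lower cost)."""
    scored = [(w,) + objectives(w, X, y, loc) for w in candidates]
    return [s for s in scored
            if not any(o[1] >= s[1] and o[2] <= s[2]
                       and (o[1] > s[1] or o[2] < s[2]) for o in scored)]

# Cross-version use: search on version N-1, pick a model from the front,
# apply it to version N. Random search stands in for a genetic algorithm.
rng = np.random.default_rng(0)
X_prev = rng.normal(size=(100, 5))            # file metrics, previous version
y_prev = (X_prev[:, 0] > 0.5).astype(int)     # synthetic defect labels
loc_prev = rng.integers(50, 500, size=100)    # lines of code per file
candidates = [rng.normal(size=6) for _ in range(200)]
front = pareto_front(candidates, X_prev, y_prev, loc_prev)
for w, recall, cost in sorted(front, key=lambda s: s[1]):
    print(f"recall={recall:.2f}  qa_cost={cost}")
```

Each point on the printed front trades recall against inspection cost; a single-objective learner commits to one fixed trade-off, which is the gap the multi-objective formulation is designed to close.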



Author information


Corresponding author

Correspondence to Lalita Bhanu Murthy Neti.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Human and animal rights

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Communicated by V. Loia.


About this article


Cite this article

Shukla, S., Radhakrishnan, T., Muthukumaran, K. et al. Multi-objective cross-version defect prediction. Soft Comput 22, 1959–1980 (2018). https://doi.org/10.1007/s00500-016-2456-8

