
Global vs. local models for cross-project defect prediction

A replication study

Published in: Empirical Software Engineering

Abstract

Although researchers have invested significant effort, the performance of defect prediction in a cross-project setting, i.e., with training data that does not come from the same project, is still unsatisfactory. A recent proposal for improving defect prediction is the use of local models. With local models, the available data is first clustered into homogeneous regions and afterwards a separate classifier is trained for each region. Since the main problem of cross-project defect prediction is data heterogeneity, the idea of local models is promising. Therefore, we perform a conceptual replication of the previous studies on local models with a focus on cross-project defect prediction. In a large case study, we evaluate the performance of local models and investigate their advantages and drawbacks for cross-project predictions. To this aim, we also compare their performance with that of a global model and a transfer learning technique designed for cross-project defect prediction. Our findings show that local models make only a minor difference in comparison to global models and transfer learning for cross-project defect prediction. While these results are negative, they provide valuable knowledge about the limitations of local models and increase the validity of previously gained research results.
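The local-model workflow described above, cluster first, then train one classifier per region, can be summarized in a few lines of code. The following is a minimal sketch under assumptions that are not part of the study itself: it uses scikit-learn, EM-based Gaussian mixture clustering as a stand-in for the clustering algorithms of the replicated studies, and logistic regression as an arbitrary example classifier.

```python
# Minimal sketch of the local-model idea: cluster the training data into
# homogeneous regions and train one classifier per region; test instances are
# routed to the classifier of their region. Illustrative only: scikit-learn,
# Gaussian mixture (EM-style) clustering, and logistic regression stand in for
# the algorithms used in the replicated studies.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.mixture import GaussianMixture


def train_local_model(X_train, y_train, n_clusters=3):
    clustering = GaussianMixture(n_components=n_clusters, random_state=0)
    regions = clustering.fit_predict(X_train)  # EM-based clustering of the training data
    classifiers = {}
    for region in np.unique(regions):
        mask = regions == region
        # Note: clusters that contain only one class would need special handling
        # (e.g., a constant prediction) in a real implementation.
        clf = LogisticRegression(max_iter=1000)
        clf.fit(X_train[mask], y_train[mask])  # one classifier per homogeneous region
        classifiers[region] = clf
    return clustering, classifiers


def predict_local_model(clustering, classifiers, X_test):
    regions = clustering.predict(X_test)  # assign each test instance to a region
    return np.array([classifiers[r].predict(x.reshape(1, -1))[0]
                     for r, x in zip(regions, X_test)])
```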


Notes

  1. With the data used in the study and the success criterion of having both recall and precision of at least 0.75, they achieved a success rate of about 3 %.

  2. The studies by Menzies et al. and Bettenburg et al. were first published in an initial version at a conference and then in greater detail in a journal publication, leading to five publications for the three studies.

  3. Recall, precision, and accuracy all at least 0.75.

  4. The tera-PROMISE repository is the successor of the PROMISE repository, which was previously located at http://promisedata.googlecode.com.

  5. http://bug.inf.usi.ch/

  6. Instead of recall, the terms PD or tpr are sometimes used in the literature. PD stands for probability of defect and tpr for true positive rate; the definitions of these measures are restated after these notes.

  7. This problem is still very relevant. For example, during the 37th International Conference on Software Engineering held in May 2015, there were five papers on defect prediction (Caglayan et al. 2015; Ghotra et al. 2015; Peters et al. 2015; Tan et al. 2015; Tantithamthavorn et al. 2015). None of them used exactly the same performance measures.

  8. GitHub: https://github.com/sherbold/replication-kit-emse-2016-local-models/tree/master/replication-kit; zipped archive: http://hdl.handle.net/21.11101/0000-0001-3C55-D
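Since notes 1, 3, and 6 refer to recall, precision, accuracy, and the 0.75 success criterion, the snippet below restates the standard confusion-matrix definitions. It is a plain illustration, not code from the study, and the function names are our own.

```python
# Standard confusion-matrix measures referenced in notes 1, 3, and 6, together
# with the "at least 0.75" success criterion. tp, fp, tn, fn are the counts of
# true/false positives and negatives.
def recall(tp, fn):        # also called PD or tpr in the literature
    return tp / (tp + fn)

def precision(tp, fp):
    return tp / (tp + fp)

def accuracy(tp, fp, tn, fn):
    return (tp + tn) / (tp + fp + tn + fn)

def meets_success_criterion(tp, fp, tn, fn, threshold=0.75):
    return (recall(tp, fn) >= threshold
            and precision(tp, fp) >= threshold
            and accuracy(tp, fp, tn, fn) >= threshold)
```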

References

  • Amasaki S, Kawata K, Yokogawa T (2015) Improving cross-project defect prediction methods with data simplification. In: 41st Euromicro conference on software engineering and advanced applications (SEAA)

  • Bettenburg N, Nagappan M, Hassan A (2012) Think locally, act globally: improving defect and effort prediction models. In: Proceedings of the 9th IEEE working conference on mining software repositories (MSR). IEEE Computer Society

  • Bettenburg N, Nagappan M, Hassan A (2014) Towards improving statistical modeling of software engineering data: think locally, act globally! Empir Softw Eng:1–42

  • Caglayan B, Turhan B, Bener A, Habayeb M, Miranskyy A, Cialini E (2015) Merits of organizational metrics in defect prediction: an industrial replication. In: Proceedings of the 37th international conference on software engineering (ICSE)

  • Camargo Cruz AE, Ochimizu K (2009) Towards logistic regression models for predicting fault-prone code across software projects. In: Proceedings of the 3rd international symposium on empirical software engineering and measurement (ESEM). IEEE Computer Society

  • Carver JC (2010) Towards reporting guidelines for experimental replications: a proposal. In: Proceedings of the international workshop on replication in empirical software engineering

  • Chidamber S, Kemerer C (1994) A metrics suite for object oriented design. IEEE Trans Softw Eng 20(6):476–493

  • D’Ambros M, Lanza M, Robbes R (2010) An Extensive Comparison of Bug Prediction Approaches. In: Proceedings of the 7th IEEE working conference on mining software repositories (MSR). IEEE Computer Society

  • Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the EM algorithm. J of the Royal Statistical Society Series B (Methodological) 39(1):1–38

  • Drummond C, Holte RC (2003) C4.5, class imbalance and cost sensitivity: why under-sampling beats over-sampling. In: Workshop on learning from imbalanced datasets II

  • Faloutsos C, Lin KI (1995) Fastmap: a fast algorithm for indexing, data-mining and visualization of traditional and multimedia datasets. SIGMOD Rec 24(2):163–174

  • Fraley C, Raftery AE (1999) MCLUST: software for model-based cluster analysis. J Classif 16(2):297–306

  • Ghotra B, McIntosh S, Hassan AE (2015) Revisiting the impact of classification techniques on the performance of defect prediction models. In: Proceedings of the 37th international conference on software engineering (ICSE)

  • Gray D, Bowes D, Davey N, Sun Y, Christianson B (2011) The misuse of the NASA metrics data program data sets for automated software defect prediction. In: Proceedings of the 15th annual conference on evaluation & assessment in software engineering (EASE). IET

  • Hall M, Frank E, Holmes G, Pfahringer B, Reutemann P, Witten IH (2009) The WEKA data mining software: an update. ACM SIGKDD Explorations Newsletter 11(1):10–18

  • Halstead MH (1977) Elements of software science (operating and programming systems series). Elsevier Science Inc.

  • Han J, Kamber M (2011) Data mining: concepts and techniques. Morgan Kaufmann

  • Hassan A (2009) Predicting faults using the complexity of code changes. In: Proceedings of the 31st international conference on software engineering (ICSE), pp 78–88. doi:10.1109/ICSE.2009.5070510

  • He P, Li B, Ma Y (2014) Towards cross-project defect prediction with imbalanced feature sets. CoRR arXiv:1411.4228

  • He Z, Shu F, Yang Y, Li M, Wang Q (2012) An investigation on the feasibility of cross-project defect prediction. Autom Softw Eng 19:167–199

  • He Z, Peters F, Menzies T, Yang Y (2013) Learning from Open-Source projects: an empirical study on defect prediction. In: Proceedings of the 7th international symposium on empirical software engineering and measurement (ESEM)

  • Henderson-Sellers B (1996) Object-oriented metrics; measures of complexity. Prentice-Hall

  • Herbold S (2013) Training data selection for cross-project defect prediction. In: Proceedings of the 9th international conference on predictive models in software engineering (PROMISE), ACM

  • Herbold S (2015) Crosspare: a tool for benchmarking cross-project defect predictions. In: Proceedings of the 4th international workshop on software mining (SoftMine)

  • Huang L, Port D, Wang L, Xie T, Menzies T (2010) Text mining in supporting software systems risk assurance. In: Proceedings of the 25th IEEE/ACM international conference on automated software engineering(ASE), ACM

  • Jelihovschi E, Faria J, Allaman I (2014) Scottknott: a package for performing the Scott-Knott clustering algorithm in R. TEMA (São Carlos) 15:3–17

  • Jiang Y, Cukic B, Ma Y (2008) Techniques for evaluating fault prediction models. Empir Softw Eng 13(5):561–595

  • Jureczko M, Madeyski L (2010) Towards identifying software project clusters with regard to defect prediction. In: Proceedings of the 6th international conference on predictive models in software engineering (PROMISE), ACM

  • Kawata K, Amasaki S, Yokogawa T (2015) Improving relevancy filter methods for cross-project defect prediction. In: 3rd international conference on applied computing and information technology/2nd international conference on computational science and intelligence (ACIT-CSI)

  • Kitchenham B (2008) The role of replications in empirical software engineering word of warning. Empir Softw Eng 13(2):219–221

  • Kocaguneli E, Menzies T, Keung J, Cok D, Madachy R (2013) Active learning and effort estimation: Finding the essential content of software effort estimation data. IEEE Trans Softw Eng 39(8):1040–1053. doi:10.1109/TSE.2012.88

  • Kotsiantis S, Kanellopoulos D, Pintelas P (2006) Data preprocessing for supervised leaning. Int J Comp Sci 1(2):111–117

  • Ma Y, Luo G, Zeng X, Chen A (2012) Transfer learning for cross-company software defect prediction. Inf Softw Technol 54(3):248–256

  • Madeyski L, Jureczko M (2015) Which process metrics can significantly improve defect prediction models? an empirical study. Softw Qual J 23(3):393–422. doi:10.1007/s11219-014-9241-7

  • Mann HB, Whitney DR (1947) On a test of whether one of two random variables is stochastically larger than the other. Ann Math Stat 18(1):50–60

  • McCabe TJ (1976) A complexity measure. IEEE Trans Softw Eng 2(4):308–320

  • Meneely A, Williams L, Snipes W, Osborne J (2008) Predicting failures with developer networks and social network analysis. In: Proceedings of the 16th ACM SIGSOFT international symposium on foundations of software engineering, ACM, New York, NY, USA, SIGSOFT ’08/FSE-16, pp 13–23. doi:10.1145/1453101.1453106

  • Menzies T, Turhan B, Bener A, Gay G, Cukic B, Jiang Y (2008) Implications of ceiling effects in defect predictors. In: Proceedings of the 4th international workshop on predictor models in software engineering (PROMISE), ACM

  • Menzies T, Butcher A, Marcus A, Zimmermann T, Cok D (2011) Local vs. global models for effort estimation and defect prediction. In: Proceedings of the 26th IEEE/ACM international conference on automated software engineering (ASE), IEEE Computer Society

  • Menzies T, Butcher A, Cok D, Marcus A, Layman L, Shull F, Turhan B, Zimmermann T (2013) Local versus global lessons for defect prediction and effort estimation. IEEE Trans Softw Eng 39(6):822–834

  • Menzies T, Pape C, Steele C (2014) The tera-PROMISE repository. http://openscience.us/repo/

  • Nam J, Kim S (2015) Heterogeneous defect prediction. In: Proceedings of the 10th joint meeting of the european software engineering conference (ESEC) and the ACM SIGSOFT symposium on the foundations of software engineering (FSE). doi:10.1145/2786805.2786814

  • Nam J, Pan SJ, Kim S (2013) Transfer defect learning. In: Proceedings of the 35th international conference on software engineering (ICSE)

  • Ngomo ACN (2009) Low-bias extraction of domain-specific concepts. Ph.D. Thesis

  • Pan SJ, Yang Q (2010) A survey on transfer learning. IEEE Trans Knowl Data Eng 22(10):1345–1359

  • Peters F, Menzies T, Gong L, Zhang H (2013) Balancing privacy and utility in cross-company defect prediction. IEEE Trans Softw Eng 39(8):1054–1068

  • Peters F, Menzies T, Layman L (2015) LACE2: better privacy-preserving data sharing for cross project defect prediction. In: Proceedings of the 37th international conference on software engineering (ICSE)

  • Premraj R, Herzig K (2011) Network versus code metrics to predict defects: a replication study. In: Proceedings of the international symposium on empirical software engineering and measurement (ESEM)

  • Rahman F, Posnett D, Devanbu P (2012) Recalling the “imprecision” of cross-project defect prediction. In: Proceedings of the ACM SIGSOFT 20th international symposium on the foundations of software engineering (FSE). ACM

  • Runeson P, Höst M (2009) Guidelines for conducting and reporting case study research in software engineering. Empir Softw Eng 14(2):131–164

  • Scanniello G, Gravino C, Marcus A, Menzies T (2013) Class level fault prediction using software clustering. In: Proceedings of the 28th IEEE/ACM international conference on automated software engineering (ASE). IEEE Computer Society

  • Schikuta E (1993) Grid-clustering: a hierarchical clustering method for very large data sets. In: Proceedings of the 15th international conference on pattern recognition

  • Schölkopf B, Smola AJ (2002) Learning with Kernels. MIT Press

  • Scott AJ, Knott M (1974) A cluster analysis method for grouping means in the analysis of variance. Biometrics 30(3):507–512

  • Shepperd M, Song Q, Sun Z, Mair C (2013) Data quality: some comments on the NASA software defect datasets. IEEE Trans Softw Eng 39(9):1208–1215

  • Shull F, Carver J, Vegas S, Juristo N (2008) The role of replications in empirical software engineering. Empir Softw Eng 13(2):211–218

  • Siegmund J, Siegmund N, Apel S (2015) Views on internal and external validity in empirical software engineering. In: 37th International conference on software engineering

  • Tan M, Tan L, Dara S, Mayeux C (2015) Online defect prediction for imbalanced data. In: Proceedings of the 37th international conference on software engineering (ICSE)

  • Tantithamthavorn C, McIntosh S, Hassan AE, Ihara A, Matsumoto Ki (2015) The impact of mislabelling on the performance and interpretation of defect prediction models. In: Proceedings of the 37th international conference on software engineering (ICSE)

  • Tantithamthavorn C, McIntosh S, Hassan AE, Matsumoto K (2016) Automated parameter optimization of classification techniques for defect prediction models. In: Proceedings of the 38th international conference on software engineering. doi:10.1145/2884781.2884857. ACM

  • Turhan B, Menzies T, Bener A, Di Stefano J (2009) On the relative value of cross-company and within-company data for defect prediction. Empir Softw Eng 14:540–578

  • van Gestel T, Suykens J, Baesens B, Viaene S, Vanthienen J, Dedene G, de Moor B, Vandewalle J (2004) Benchmarking least squares support vector machine classifiers. Mach Learn 54(1):5–32

  • Watanabe S, Kaiya H, Kaijiri K (2008) Adapting a fault prediction model to allow inter language reuse. In: Proceedings of the 4th international workshop on predictor models in software engineering (PROMISE). ACM

  • Xu R, Wunsch D (2005) Survey of clustering algorithms. IEEE Trans Neural Netw 16(3):645–678

  • Zhang F, Mockus A, Keivanloo I, Zou Y (2014) Towards building a universal defect prediction model. In: Proceedings of the 11th working conference on mining software repositories (MSR). ACM

  • Zhang F, Mockus A, Keivanloo I, Zou Y (2015) Towards building a universal defect prediction model with rank transformed predictors. Empir Softw Eng:1–39. doi:10.1007/s10664-015-9396-2

  • Zimmermann T, Nagappan N, Gall H, Giger E, Murphy B (2009) Cross-project defect prediction: a large scale experiment on data vs. domain vs. process. In: Proceedings of the 7th joint meeting of the european software engineering conference (ESEC) and the ACM SIGSOFT symposium on the foundations of software engineering (FSE). ACM, pp 91–100

Author information

Corresponding author

Correspondence to Steffen Herbold.

Additional information

Communicated by: Burak Turhan

Appendix A: Metrics

A.1 JSTAT Data

The following metrics are part of the JSTAT data:

  • WMC: weighted method count, number of methods in a class

  • DIT: depth of inheritance tree

  • NOC: number of children

  • CBO: coupling between objects, number of classes coupled to a class

  • RFC: response for class, number of different methods that can be executed if the class receives a message

  • LCOM: lack of cohesion in methods, number of methods not related through the sharing of some of the class fields

  • LCOM3: lack of cohesion in methods after Henderson-Sellers (1996)

  • NPM: number of public methods

  • DAM: data access metric, ratio of private (protected) attributes to total number of attributes in the class

  • MOA: measure of aggregation, number of class fields whose types are user defined classes

  • MFA: measure of functional abstraction, ratio of the number of methods inherited by a class to the total number of methods accessible by the member methods of the class

  • CAM: cohesion among methods of class, relatedness of methods based upon the parameter list of the methods

  • IC: inheritance coupling, number of parent classes to which the class is coupled

  • CBM: coupling between methods, number of new/redefined methods to which all the inherited methods are coupled

  • AMC: average method complexity

  • Ca: afferent couplings

  • Ce: efferent couplings

  • CC: cyclomatic complexity

  • Max(CC): maximum cyclomatic complexity among methods

  • Avg(CC): average cyclomatic complexity among methods

For a detailed explanation see Jureczko and Madeyski (2010).
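As an illustration of how data with these metrics is typically consumed, the sketch below loads a metrics file into a feature matrix and a binary defect label. The CSV layout, the lower-case column names, and the "bug" count column are assumptions for illustration, not guaranteed by the study; see Jureczko and Madeyski (2010) for the authoritative description of the data.

```python
# Hedged sketch: load a JSTAT-style metrics file into a feature matrix X and a
# binary defect label y. The file layout, the lower-case column names, and the
# "bug" count column are assumptions for illustration.
import pandas as pd

METRIC_COLUMNS = ["wmc", "dit", "noc", "cbo", "rfc", "lcom", "lcom3", "npm",
                  "dam", "moa", "mfa", "cam", "ic", "cbm", "amc", "ca", "ce",
                  "cc", "max_cc", "avg_cc"]

def load_metrics_dataset(path):
    data = pd.read_csv(path)
    present = [c for c in METRIC_COLUMNS if c in data.columns]
    X = data[present].to_numpy()
    y = (data["bug"] > 0).astype(int).to_numpy()  # defective iff at least one bug
    return X, y
```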

A.2 MDP Data

The following metrics are part of the MDP data. This is the common subset of metrics that is available for all projects within the MDP data set:

  • LOC_TOTAL: total lines of code

  • LOC_EXECUTABLE: executable lines of code

  • LOC_COMMENTS: lines of comments

  • LOC_CODE_AND_COMMENT: lines with comments or code

  • NUM_UNIQUE_OPERATORS: number of unique operators

  • NUM_UNIQUE_OPERANDS: number of unique operands

  • NUM_OPERATORS: total number of operators

  • NUM_OPERANDS: total number of operands

  • HALSTEAD_VOLUME: Halstead volume (see Halstead 1977)

  • HALSTEAD_LENGTH: Halstead length (see Halstead 1977)

  • HALSTEAD_DIFFICULTY: Halstead difficulty (see Halstead 1977)

  • HALSTEAD_EFFORT: Halstead effort (see Halstead 1977)

  • HALSTEAD_ERROR_EST: Halstead Error, also known as Halstead Bug (see Halstead 1977)

  • HALSTEAD_PROG_TIME: Halstead programming time (see Halstead 1977); the standard formulas behind these Halstead measures are sketched after this list

  • BRANCH_COUNT: Number of branches

  • CYCLOMATIC_COMPLEXITY: Cyclomatic complexity (same as CC in the JSTAT data)

  • DESIGN_COMPLEXITY: design complexity
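
The Halstead measures above are all derived from the four basic operator and operand counts. The sketch below restates the textbook formulas (Halstead 1977); the constant of 18 seconds per unit of effort and the volume-based error estimate are common conventions and may differ from the exact definitions used by the MDP tooling.

```python
# Textbook Halstead measures (Halstead 1977) derived from the four basic counts
# in the MDP data: n1 = NUM_UNIQUE_OPERATORS, n2 = NUM_UNIQUE_OPERANDS,
# N1 = NUM_OPERATORS, N2 = NUM_OPERANDS.
import math

def halstead_measures(n1, n2, N1, N2):
    vocabulary = n1 + n2
    length = N1 + N2                           # HALSTEAD_LENGTH
    volume = length * math.log2(vocabulary)    # HALSTEAD_VOLUME
    difficulty = (n1 / 2) * (N2 / n2)          # HALSTEAD_DIFFICULTY
    effort = difficulty * volume               # HALSTEAD_EFFORT
    prog_time = effort / 18                    # HALSTEAD_PROG_TIME in seconds (assumed constant)
    error_est = volume / 3000                  # HALSTEAD_ERROR_EST (E**(2/3)/3000 is also common)
    return {"length": length, "volume": volume, "difficulty": difficulty,
            "effort": effort, "prog_time": prog_time, "error_est": error_est}
```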

A.3 JPROC Data

The following metrics are part of the JPROC data:

  • CBO: coupling between objects

  • DIT: depth of inheritance tree

  • fanIn: number of other classes that reference the class

  • fanOut: number of other classes referenced by the class

  • LCOM: lack of cohesion in methods

  • NOC: number of children

  • RFC: response for class

  • WMC: weighted method count

  • NOA: number of attributes

  • NOAI: number of attributes inherited

  • LOC: lines of code

  • NOM: number of methods

  • NOMI: number of methods inherited

  • NOPRA: number of private attributes

  • NOPRM: number of private methods

  • NOPA: number of public attributes

  • NOPM: number of public methods

  • NR: number of revisions

  • NREF: number of times the file has been refactored

  • NAUTH: number of authors

  • LADD: sum of lines added

  • max(LADD): maximum lines added

  • avg(LADD): average lines added

  • LDEL: sum of lines removed

  • max(LDEL): maximum lines deleted

  • avg(LDEL): average lines deleted

  • CHURN: sum of code churn

  • max(CHURN): maximum code churn

  • avg(CHURN): average code churn

  • AGE: age of the file

  • WAGE: weighted age of the file

For a detailed explanation see D’Ambros et al. (2010).
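The change-based metrics in this list (NR, NAUTH, LADD, LDEL, CHURN, and their max/avg variants) are aggregations over a file's revision history. The sketch below illustrates that aggregation pattern; the churn definition used here (lines added plus lines deleted per revision) is an assumption for illustration, see D’Ambros et al. (2010) for the definitions actually used.

```python
# Hedged sketch of aggregating a file's revision history into JPROC-style
# process metrics. Each revision is (author, lines_added, lines_deleted);
# churn = lines added + lines deleted is an assumed definition for illustration.
def process_metrics(revisions):
    added = [a for _, a, _ in revisions]
    deleted = [d for _, _, d in revisions]
    churn = [a + d for a, d in zip(added, deleted)]

    def agg(values, name):
        return {name: sum(values),
                "max(%s)" % name: max(values),
                "avg(%s)" % name: sum(values) / len(values)}

    metrics = {"NR": len(revisions),                                  # number of revisions
               "NAUTH": len({author for author, _, _ in revisions})}  # number of authors
    metrics.update(agg(added, "LADD"))
    metrics.update(agg(deleted, "LDEL"))
    metrics.update(agg(churn, "CHURN"))
    return metrics
```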

About this article

Cite this article

Herbold, S., Trautsch, A. & Grabowski, J. Global vs. local models for cross-project defect prediction. Empir Software Eng 22, 1866–1902 (2017). https://doi.org/10.1007/s10664-016-9468-y
