Design evolution metrics for defect prediction in object oriented systems


Abstract

Testing is the most widely adopted practice to ensure software quality. However, this activity is often a compromise between the available resources and software quality. In object-oriented development, testing effort should be focused on defective classes. Unfortunately, identifying those classes is a challenging activity for which many metrics, techniques, and models have been proposed. In this paper, we investigate the usefulness of elementary design evolution metrics to identify defective classes. The metrics include the numbers of added, deleted, and modified attributes, methods, and relations. The metrics are used to recommend a ranked list of classes likely to contain defects for a system. They are compared to Chidamber and Kemerer’s metrics on several versions of Rhino and ArgoUML. Further comparison is conducted with the complexity metrics computed by Zimmermann et al. on several releases of Eclipse. The comparisons are made according to three criteria: presence of defects, number of defects, and defect density in the top-ranked classes. They show that the design evolution metrics, when used in conjunction with known metrics, improve the identification of defective classes. In addition, they show that the design evolution metrics make significantly better predictions of defect density than other metrics and, thus, can help in reducing the testing effort by focusing test activity on a reduced volume of code.
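To make the elementary design evolution metrics concrete, the sketch below counts added, deleted, and modified attributes and methods between two versions of a class, following the matching rules of note 2 below (attributes match on name and type, methods on signature). The ClassVersion structure, the example data, and the choice to count a same-name attribute with a changed type as modified are illustrative assumptions, not the instrumentation used in the study; relation metrics would be derived analogously from the recovered class diagrams.

```python
from dataclasses import dataclass, field
from typing import Dict, Set

@dataclass
class ClassVersion:
    """Simplified view of one class in one release (illustrative only)."""
    attributes: Dict[str, str] = field(default_factory=dict)  # attribute name -> declared type
    methods: Set[str] = field(default_factory=set)            # method signatures

def design_evolution_metrics(old: ClassVersion, new: ClassVersion) -> Dict[str, int]:
    """Count added, deleted, and modified attributes and methods between two
    versions of the same class (cf. note 2 for the matching rules)."""
    added_attrs = {n for n in new.attributes if n not in old.attributes}
    deleted_attrs = {n for n in old.attributes if n not in new.attributes}
    # Assumption: a same-name attribute whose declared type changed counts as modified.
    modified_attrs = {n for n in new.attributes
                      if n in old.attributes and new.attributes[n] != old.attributes[n]}
    return {
        "added_attributes": len(added_attrs),
        "deleted_attributes": len(deleted_attrs),
        "modified_attributes": len(modified_attrs),
        "added_methods": len(new.methods - old.methods),
        "deleted_methods": len(old.methods - new.methods),
    }

# Usage: per-class counts such as these would then feed a prediction model
# (e.g., logistic regression) whose scores rank the classes most likely to be defective.
old = ClassVersion({"cache": "Map"}, {"get(String)", "put(String,Object)"})
new = ClassVersion({"cache": "HashMap", "size": "int"}, {"get(String)", "clear()"})
print(design_evolution_metrics(old, new))
```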


Notes

  1. Researchers discussed the limits of current predictive models at the 6th Working Conference on Mining Software Repositories (MSR’09).

  2. An attribute is matched to another if they share the same name and type, while a method is matched to another if they share the same signature.

  3. In the study, we fix l_w = 0.6, m_w = 0.2, and a_w = 0.2.

  4. The metric values are available on-line at http://www.st.cs.uni-saarland.de/softevo/bug-data/eclipse/.

  5. http://www.mozilla.org/rhino/.

  6. Rhino 1.4R3 is excluded since it is the initial release.

  7. http://argouml-downloads.tigris.org/.

  8. http://www.eclipse.org/.

  9. The bigger |C_i|, the more X_i impacts the outcome. In particular, if C_i > 0, the probability of the outcome increases with the value of X_i (the logistic model behind these coefficients is sketched after these notes).

  10. http://cran.r-project.org/.

  11. We use Akaike’s information criterion to select the “best” model.

  12. We consider that a random prediction model would, on average, yield X% of the defective classes or of the defects in any X% partition of the system.

  13. We compute the Cohen’s d statistic using the pooled standard deviation (a short computation sketch follows these notes).
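Regarding note 9: assuming the standard binary logistic regression form of Hosmer and Lemeshow (2000), the probability that a class is defective given metric values X_1, ..., X_k is P(defective | X_1, ..., X_k) = 1 / (1 + exp(-(C_0 + C_1 X_1 + ... + C_k X_k))). A one-unit increase in X_i therefore multiplies the odds of a defect by exp(C_i), which is why a larger |C_i| indicates a stronger impact and a positive C_i means the probability grows with X_i.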
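Regarding note 13, the following minimal Python sketch computes Cohen’s d with the pooled standard deviation; the sample values in the usage line are made up for illustration.

```python
import math
from typing import Sequence

def cohens_d(a: Sequence[float], b: Sequence[float]) -> float:
    """Cohen's d effect size using the pooled standard deviation (note 13)."""
    n_a, n_b = len(a), len(b)
    mean_a, mean_b = sum(a) / n_a, sum(b) / n_b
    var_a = sum((x - mean_a) ** 2 for x in a) / (n_a - 1)  # unbiased sample variance
    var_b = sum((x - mean_b) ** 2 for x in b) / (n_b - 1)
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (mean_a - mean_b) / pooled_sd

# Illustrative use: compare, say, defect densities obtained under two rankings.
print(cohens_d([0.12, 0.30, 0.25, 0.18], [0.05, 0.10, 0.08, 0.12]))
```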

References

  • Ayari K, Meshkinfam P, Antoniol G, Di Penta M (2007) Threats on building models from CVS and Bugzilla repositories: the Mozilla case study. In: IBM centers for advanced studies conference, Toronto, CA, 23–25 Oct 2007. ACM, pp 215–228

  • Basili V, Caldiera G, Rombach DH (1994) The goal question metric paradigm. In: Encyclopedia of software engineering. Wiley, New York

  • Basili VR, Briand LC, Melo WL (1996) A validation of object-oriented design metrics as quality indicators. IEEE Trans Softw Eng 22:751–761

  • Bird C, Bachmann A, Aune E, Duffy J, Bernstein A, Filkov V, Devanbu P (2009) Fair and balanced? bias in bug-fix datasets. In: ESEC/SIGSOFT FSE, pp 121–130

  • Briand LC, Daly JW, Wüst J (1998) A unified framework for cohesion measurement in object-oriented systems. Empir Softw Eng 3(1):65–117

  • Briand LC, Melo WL, Wüst J (2002) Assessing the applicability of fault-proneness models across object-oriented software projects. IEEE Trans Softw Eng 28(7):706–720

  • Briand LC, Labiche Y, Wang Y (2003) An investigation of graph-based class integration test order strategies. IEEE Trans Softw Eng 29(7):594–607

  • Cartwright M, Shepperd M (2000) An empirical investigation of an object-oriented software system. IEEE Trans Softw Eng 26(8):786–796

  • Chidamber SR, Kemerer CF (1994) A metrics suite for object oriented design. IEEE Trans Softw Eng 20(6):476–493

  • Cohen J (1988) Statistical power analysis for the behavioral sciences, 2nd edn. Lawrence Earlbaum Associates, Hillsdale, NJ

  • Eaddy M, Zimmermann T, Sherwood KD, Garg V, Murphy GC, Nagappan N, Aho AV (2008) Do crosscutting concerns cause defects? IEEE Trans Softw Eng 34(4):497–515

  • ECMA (2007) ECMAScript standard—ECMA-262 v3. ISO/IEC 16262

  • El Emam K, Benlarbi S, Goel N, Rai SN (2001) The confounding effect of class size on the validity of object-oriented metrics. IEEE Trans Softw Eng 27(7):630–650

  • Evanco WM (1997) Poisson analyses of defects for small software components. J Syst Softw 38(1):27–35

  • Fenton NE, Pfleeger SL (1997) Software metrics: a rigorous and practical approach, 2nd edn. Thomson Computer Press, Boston

  • Glover F (1986) Future paths for integer programming and links to artificial intelligence. Comput Oper Res 13(5):533–549

  • Graves TL, Karr AF, Marron JS, Siy H (2000) Predicting fault incidence using software change history. IEEE Trans Softw Eng 26(7):653–661

  • Grindal M, Offutt J, Mellin J (2006) On the testing maturity of software producing organizations. In: Proceedings of the testing: academic & industrial conference on practice and research techniques, pp 171–180

  • Guéhéneuc Y-G, Antoniol G (2008) DeMIMA: a multilayered approach for design pattern identification. IEEE Trans Softw Eng 34(5):667–684

  • Gyimóthy T, Ferenc R, Siket I (2005) Empirical validation of object-oriented metrics on open source software for fault prediction. IEEE Trans Softw Eng 31(10):897–910

  • Hassan AE (2009) Predicting faults using the complexity of code changes. In: ICSE, pp 78–88

  • Hosmer D, Lemeshow S (2000) Applied logistic regression, 2nd edn. Wiley, New York

  • Kpodjedo S, Ricca F, Galinier P, Antoniol G (2008) Error correcting graph matching application to software evolution. In: Proc. of the working conference on reverse engineering, pp 289–293

  • Kpodjedo S, Ricca F, Antoniol G, Galinier P (2009) Evolution and search based metrics to improve defects prediction. In: Proc. of the international symposium on search based software engineering, pp 23–32

  • Kpodjedo S, Ricca F, Galinier P, Antoniol G, Gueheneuc Y-G (2010) Studying software evolution of large object-oriented software systems using an ETGM algorithm. J Softw Maint Evol. doi:10.1002/smr.519

  • Moser R, Pedrycz W, Succi G (2008) A comparative analysis of the efficiency of change metrics and static code attributes for defect prediction. In: ICSE, pp 181–190

  • Munson J, Elbaum S (1998) Code churn: a measure for estimating the impact of code change. In: Proceedings of the international conference on software maintenance, pp 24–31

  • Nagappan N, Ball T (2005) Use of relative code churn measures to predict system defect density. In: Proc. of the international conference on software engineering (ICSE), pp 284–292

  • Ostrand TJ, Weyuker EJ, Bell RM (2005) Predicting the location and number of faults in large software systems. IEEE Trans Softw Eng 31:340–355

  • Page L, Brin S, Motwani R, Winograd T (1998) The pagerank citation ranking: bringing order to the web. Technical report, Stanford Digital Library Technologies Project

  • Tsai W, Fu K-S (1979) Error-correcting isomorphism of attributed relational graphs for pattern analysis. IEEE Trans Syst Man Cybern 9:757–768

  • Wohlin C, Runeson P, Höst M, Ohlsson MC, Regnell B, Wesslén A (2000) Experimentation in software engineering—an introduction. Kluwer Academic Publishers

  • Zimmermann T, Premraj R, Zeller A (2007) Predicting defects for Eclipse. In: Proceedings of the third international workshop on predictor models in software engineering

Acknowledgements

This research was partially supported by the Natural Sciences and Engineering Research Council of Canada (Research Chairs in Software Evolution and in Software Patterns and Patterns of Software) and by G. Antoniol’s Individual Discovery Grant.

Author information

Correspondence to Segla Kpodjedo.

Additional information

Editors: Massimiliano Di Penta and Simon Poulding

All artifacts (releases, class diagrams, graph representations) used in this work can be downloaded from the SOCCER laboratory Web server, under the Software Evolution Repository (SER) page, accessible at http://web.soccerlab.polymtl.ca/SER/.

About this article

Cite this article

Kpodjedo, S., Ricca, F., Galinier, P. et al. Design evolution metrics for defect prediction in object oriented systems. Empir Software Eng 16, 141–175 (2011). https://doi.org/10.1007/s10664-010-9151-7
