On the use of calling structure information to improve fault prediction

Empirical Software Engineering

Abstract

Previous studies have shown that software code attributes, such as lines of source code, and history information, such as the number of code changes and the number of faults in prior releases of software, are useful for predicting where faults will occur. In this study of two large industrial software systems, we investigate the effectiveness of adding information about calling structure to fault prediction models. Adding calling structure information to a model based solely on non-calling structure code attributes modestly improved prediction accuracy. However, the addition of calling structure information to a model that included both history and non-calling structure code attributes produced no improvement.
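To make "calling structure information" concrete, here is a minimal sketch that derives two per-file counts from a call graph: the number of distinct other files that call into a file, and the number of distinct other files it calls. The abstract does not name the attributes used in the study, so these two counts, the function name `calling_structure_counts`, and the example file names are all illustrative assumptions rather than the authors' method.

```python
from collections import defaultdict


def calling_structure_counts(call_edges):
    """Return {file: (num_callers, num_callees)} from (caller, callee) file pairs.

    Hypothetical helper: the attribute definitions are assumptions made for
    illustration, not the attribute set used in the paper.
    """
    callers = defaultdict(set)  # file -> distinct files that call it
    callees = defaultdict(set)  # file -> distinct files it calls
    for src, dst in call_edges:
        if src == dst:
            continue  # calls within the same file are not counted here
        callees[src].add(dst)
        callers[dst].add(src)
    files = set(callers) | set(callees)
    return {f: (len(callers[f]), len(callees[f])) for f in sorted(files)}


# Hypothetical call graph over three source files.
edges = [("a.c", "b.c"), ("a.c", "c.c"), ("b.c", "c.c"), ("c.c", "c.c")]
counts = calling_structure_counts(edges)
for name, (n_callers, n_callees) in counts.items():
    print(name, n_callers, n_callees)
```

Counts like these could sit alongside code attributes (such as lines of code) and history attributes (such as prior changes and faults) as candidate predictors in a fault prediction model.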

Notes

  1. We also corrected a previous data error affecting four of the calling structure attributes in this paper. However, the overall conclusions remain the same as those observed in our earlier study.

References

  • Andersson C, Runeson P (2007) A replicated quantitative analysis of fault distributions in complex software systems. IEEE Trans Software Eng 33(5):273–286

  • Arisholm E, Briand LC (Sep. 21–22 2006) Predicting fault-prone components in a Java legacy system. In: the 2006 ACM/IEEE International Symposium on Empirical Software Engineering, Rio de Janeiro, Brazil, pp 8–17.

  • Basili VR, Perricone BR (1984) Software errors and complexity: an empirical investigation. Comm ACM 27:42–52

  • Basili VR, Briand LC, Melo WL (1996) A validation of object-oriented design metrics as quality indicators. IEEE Trans Software Eng 22(10):751–761

  • Boehm BW (1981) Software engineering economics. Prentice-Hall, Englewood Cliffs

  • Briand LC, Wust J, Ikonomovski SV, Lounis H (16–22 May 1999) Investigating quality factors in object-oriented designs: an industrial case study. In: the 1999 International Conference on Software Engineering (ICSE’99), Los Angeles, CA, USA, pp 345–354.

  • Chidamber SR, Kemerer CF (1994) A metrics suite for object-oriented design. IEEE Trans Software Eng 20(6):476–493

  • Fenton NE, Ohlsson N (2000) Quantitative analysis of faults and failures in a complex software system. IEEE Trans Software Eng 26(8):797–814

  • Graves TL, Karr AF, Marron JS, Siy H (2000) Predicting fault incidence using software change history. IEEE Trans Software Eng 26(7):653–661

  • Guyon I, Elisseeff A (2003) An introduction to variable and feature selection. J Mach Learn Res 3:1157–1182

  • Hassan AE (2009) Predicting faults using the complexity of code changes. In: the 31st International Conference on Software Engineering, pp 78–88.

  • Kamiya T, Kusumoto S, Inoue K (1999) Prediction of fault-proneness at early phase in object-oriented development. In: 2nd IEEE International Symposium Object-Oriented Real-Time Distributed Computing, pp 253–258.

  • Khoshgoftaar TM, Allen EB, Kalaichelvan KS, Goel N (1996) Early quality prediction: a case study in telecommunications. IEEE Software 13(1):65–71

  • Khoshgoftaar TM, Allen EB, Deng J (2002) Using regression trees to classify fault-prone software modules. IEEE Trans Reliab 51(4):455–462

  • Kim S, Zimmermann T, Whitehead EJ Jr, Zeller A (2007) Predicting faults from cached history. In: the 29th International Conference on Software Engineering, pp 489–498.

  • McFadden D (1974) Conditional logit analysis of qualitative choice behavior. Frontiers in Econometrics 1(2):105–142

  • Menzies T, Greenwald J, Frank A (2007) Data mining static code attributes to learn defect predictors. IEEE Trans Software Eng 33(1):2–13

  • Nagappan N, Ball T (20–21 Sept. 2007) Using software dependencies and churn metrics to predict field failures: an empirical case study. In: First International Symposium on Empirical Software Engineering and Measurement, Madrid, Spain, pp 364–373.

  • Nagappan N, Ball T, Zeller A (May 20–28 2006) Mining metrics to predict component failures. In: the 28th International Conference on Software Engineering, Shanghai, China, pp 452–461.

  • Nguyen THD, Adams B, Hassan AE (2010) Studying the impact of dependency network measures on software quality. In: 26th IEEE International Conference on Software Maintenance, Timisoara, Romania, pp 1–10.

  • NIST (2002) The economic impacts of inadequate infrastructure for software testing. National Institute of Standards & Technology.

  • Ohlsson N, Alberg H (1996) Predicting fault-prone software modules in telephone switches. IEEE Trans Software Eng 22(12):886–894

  • Ostrand TJ, Weyuker EJ (July 22–24 2002) The distribution of faults in a large industrial software system. In: the 2002 ACM SIGSOFT International Symposium on Software Testing and Analysis, Roma, Italy, pp 55–64.

  • Ostrand TJ, Weyuker EJ, Bell RM (2005) Predicting the location and number of faults in large software systems. IEEE Trans Software Eng 31(4):340–355

  • Shin Y, Bell R, Ostrand T, Weyuker E (May 16–17 2009) Does calling structure information improve the accuracy of fault prediction? In: 6th IEEE International Working Conference on Mining Software Repositories, Vancouver, BC, Canada, pp 61–70.

  • Stevens WP, Myers GJ, Constantine LL (1974) Structured design. IBM Systems Journal 13(2):115–139

  • Tosun A, Turhan B, Bener A (May 18–19 2009) Validation of network measures as indicators of defective modules in software systems. In: Proceedings of the 5th International Conference on Predictor Models in Software Engineering (PROMISE ’09), Vancouver, Canada.

  • UCLA (2011) FAQ: What are pseudo R-squareds? Statistical Consulting Group. http://www.ats.ucla.edu/stat/mult_pkg/faq/general/psuedo_rsquareds.htm.

  • van Heesch D. Doxygen. http://www.stack.nl/~dimitri/doxygen/.

  • Weyuker EJ, Ostrand TJ (July 20 2008) Comparing methods to identify defect reports in a change management database. In: International Workshop on Defects in Large Software Systems (DEFECTS’08), Seattle, WA.

  • Weyuker EJ, Ostrand TJ, Bell RM (20 May 2007) Using developer information as a factor for fault prediction. In: International Workshop on Predictor Models in Software Engineering (PROMISE ’07), Minneapolis, MN.

  • Weyuker EJ, Ostrand TJ, Bell RM (May 12–13 2008a) Comparing negative binomial and recursive partitioning models for fault prediction. In: the 4th International Workshop on Predictor Models in Software Engineering (PROMISE’08), Leipzig, Germany, pp 3–10.

  • Weyuker EJ, Ostrand TJ, Bell RM (2008b) Do too many cooks spoil the broth? Using the number of developers to enhance defect prediction models. Empir Software Eng 13(5):539–559

  • Weyuker EJ, Ostrand TJ, Bell RM (2010) Comparing the effectiveness of several modeling methods for fault prediction. Empir Software Eng 15(3).

  • Zhou Y, Leung H (2006) Empirical analysis of object-oriented design metrics for predicting high and low severity faults. IEEE Trans Software Eng 32(10):771–789

  • Zimmermann T, Nagappan N (10–18 May 2008) Predicting defects using network analysis on dependency graphs. In: the 30th International Conference on Software Engineering, pp 531–540.

  • Zimmermann T, Weißgerber P (May 2004) Preprocessing CVS data for fine-grained analysis. In: 1st International Workshop on Mining Software Repositories (MSR’04), Edinburgh, UK.

Acknowledgment

This work is supported in part by the National Science Foundation Grant No. 0716176 and the CAREER Grant No. 0346903. Any opinions expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation. The comments by several reviewers helped us greatly to clarify and strengthen the results reported in the paper.

Author information

Corresponding author

Correspondence to Yonghee Shin.

Additional information

Editors: Jim Whitehead and Michael Godfrey

Appendix

Table 16 Range of values for System S attributes
Table 17 Range of values for System W attributes
Table 18 Highest pairwise Pearson correlations between attributes

About this article

Cite this article

Shin, Y., Bell, R.M., Ostrand, T.J. et al. On the use of calling structure information to improve fault prediction. Empir Software Eng 17, 390–423 (2012). https://doi.org/10.1007/s10664-011-9165-9

  • DOI: https://doi.org/10.1007/s10664-011-9165-9
