ABSTRACT
We have developed an interactive tool that predicts fault likelihood for the individual files of successive releases of large, long-lived, multi-developer software systems. Predictions are the result of a two-stage process: first, the extraction of current and historical properties of the system, and second, the application of a negative binomial regression model to the extracted data. The prediction model is presented as a GUI-based tool that requires minimal user input and delivers its output as an ordered list of the system's files, together with the expected percentage of faults each file will contain in the release about to undergo system test. The predictions can be used to prioritize testing efforts, to plan code or design reviews, to allocate human and computer resources, and to decide whether files should be rewritten.
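The second stage and the form of the output can be illustrated with a minimal sketch. It assumes the regression has already been fitted and uses the standard log link of negative binomial regression, where the expected fault count is the exponential of a linear combination of the predictors. The predictor names, coefficient values, and file names below are hypothetical and are not taken from the tool.

```python
import math

# Hypothetical coefficients for illustration only; the tool's actual
# predictors and fitted values are not given in this abstract.
COEFFS = {"intercept": -2.0, "log_kloc": 0.5, "prior_faults": 0.3, "is_new": 1.0}


def expected_faults(features, coeffs=COEFFS):
    """Mean of a negative binomial regression with log link:
    mu = exp(b0 + sum(b_i * x_i))."""
    linear = coeffs["intercept"] + sum(
        coeffs[name] * value for name, value in features.items()
    )
    return math.exp(linear)


def rank_files(files, coeffs=COEFFS):
    """Return (filename, percent of all predicted faults) pairs,
    ordered from most to least fault-prone."""
    predicted = {name: expected_faults(feats, coeffs) for name, feats in files.items()}
    total = sum(predicted.values())
    ranked = sorted(predicted.items(), key=lambda item: item[1], reverse=True)
    return [(name, 100.0 * mu / total) for name, mu in ranked]


# Hypothetical extracted properties for three files (stage one output).
files = {
    "parser.c": {"log_kloc": 1.2, "prior_faults": 3, "is_new": 0},
    "ui.c": {"log_kloc": 0.4, "prior_faults": 0, "is_new": 1},
    "util.c": {"log_kloc": -0.5, "prior_faults": 1, "is_new": 0},
}

for name, pct in rank_files(files):
    print(f"{name}: {pct:.1f}%")
```

Expressing each file's prediction as a share of the total, rather than as a raw count, matches the ordered-list output described above and makes it straightforward to decide how far down the list testing effort should extend.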
Index Terms
- Software fault prediction tool