ABSTRACT
Background. Using a measure X of an internal attribute (e.g., size, structural complexity, cohesion, coupling) of a software module in practice often requires setting a threshold on X to decide which modules are likely to be faulty. To keep quality under control, practitioners may want to set a threshold on X that identifies "early symptoms" of possible faultiness in a module, which should then be closely monitored and possibly modified.
Objective. We propose and evaluate an approach to setting a threshold on X to identify "early symptoms" of possible faultiness of software modules.
Method. Our proposal is based on the existence of a statistically significant model that relates X to fault-proneness, defined as the probability that a module contains at least one fault. The curve representing a fault-proneness model is usually fairly "flat" for relatively small values of X and becomes steeper and steeper for larger values of X. We define two ways in which values of X can be used as "early symptoms" of possible faultiness. First, we use the value of X at which the fault-proneness model curve changes direction the most, i.e., has maximum convexity. Second, we use the value of X at which the slope of the curve reaches a proportion (e.g., one half) of the maximum slope that is relevant for the developers.
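As a rough illustration of the two thresholds, the sketch below assumes a univariate Binary Logistic model fp(x) = 1 / (1 + exp(-(b0 + b1 x))) with b1 > 0. It also assumes that "maximum convexity" means the point where the second derivative of fp is largest, and that the reference slope is the global maximum slope of the curve (b1/4, reached at fp = 0.5). These readings, the function names, and the coefficients in the usage note are assumptions made for illustration, not definitions taken from the paper. Under these assumptions both thresholds have closed forms: the second derivative b1^2 p(1-p)(1-2p) peaks at p = (3 - sqrt(3))/6, and the slope b1 p(1-p) equals a proportion r of b1/4 at p = (1 - sqrt(1 - r))/2 on the rising part of the curve.

```python
import math


def logistic(x, b0, b1):
    """Fault-proneness predicted by a univariate logistic model."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))


def x_at_probability(p, b0, b1):
    """Invert the logistic model: value of X at which fault-proneness is p."""
    return (math.log(p / (1.0 - p)) - b0) / b1


def max_convexity_threshold(b0, b1):
    """X where the second derivative of the logistic curve is largest.

    Assumes b1 > 0. The second derivative is b1^2 * p * (1 - p) * (1 - 2p),
    which is maximal at p = (3 - sqrt(3)) / 6 (about 0.211).
    """
    p = (3.0 - math.sqrt(3.0)) / 6.0
    return x_at_probability(p, b0, b1)


def slope_proportion_threshold(b0, b1, proportion=0.5):
    """X where the slope first reaches `proportion` of the maximum slope.

    Assumes b1 > 0 and takes the maximum slope to be b1 / 4 (at p = 0.5).
    The slope b1 * p * (1 - p) equals proportion * b1 / 4 at
    p = (1 - sqrt(1 - proportion)) / 2 on the rising side of the curve.
    """
    if not 0.0 < proportion <= 1.0:
        raise ValueError("proportion must be in (0, 1]")
    p = (1.0 - math.sqrt(1.0 - proportion)) / 2.0
    return x_at_probability(p, b0, b1)


if __name__ == "__main__":
    # Hypothetical coefficients, not estimated from any real data set.
    b0, b1 = -3.0, 0.05
    print("max-convexity threshold:", max_convexity_threshold(b0, b1))
    print("half-max-slope threshold:", slope_proportion_threshold(b0, b1, 0.5))
```

In this sketch the maximum-convexity threshold depends only on the fitted coefficients, whereas the slope-based threshold also takes the proportion as an input, so a more risk-averse user could pass a smaller proportion to obtain a lower threshold on X.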
Results. First, we provide the theoretical underpinnings for our approach. Second, we show the empirical results obtained by applying our approach to data from the PROMISE repository, using fault-proneness models built via Binary Logistic and Probit regressions. Our results show that the proposed thresholds are effective in showing "early symptoms" of possible faultiness of a module, while achieving a level of accuracy in classifying faulty modules that is fairly close to that of other typical fault-proneness thresholds.
Conclusions. Our method can be used in practice for setting "early symptom" thresholds based on evidence captured by statistically significant models. In particular, the threshold based on maximum convexity depends on characteristics of the models alone, so software project managers do not need to devise the thresholds themselves. If they instead use the threshold based on a proportion of the maximum slope, software project managers can choose a different proportion based on the level of risk aversion they need when recognizing early symptoms of faultiness.
Index Terms
- Slope-based fault-proneness thresholds for software engineering measures