
A permutation test based on regression error characteristic curves for software cost estimation models

Published in: Empirical Software Engineering

Abstract

Background

Regression Error Characteristic (REC) curves are a visualization tool that graphically characterizes the predictive power of alternative models. Because such a visualization describes the whole distribution of error, REC analysis was recently introduced into software cost estimation to aid the choice of the most appropriate cost estimation model during the management of a forthcoming project.
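As background, an REC curve is essentially the empirical cumulative distribution function of a model's prediction errors: for each error tolerance it plots the fraction of predictions whose error does not exceed that tolerance. A minimal sketch in Python (the actual and predicted effort values below are illustrative, not from the paper's datasets):

```python
import numpy as np

def rec_curve(errors):
    """REC curve as the empirical CDF of prediction errors:
    for each tolerance e, the fraction of predictions whose
    error does not exceed e."""
    e = np.sort(np.asarray(errors, dtype=float))
    accuracy = np.arange(1, len(e) + 1) / len(e)
    return e, accuracy

# Hypothetical actual vs. predicted effort for five projects
actual = np.array([100, 90, 50, 250, 20])
predicted = np.array([120, 80, 45, 300, 15])
tol, acc = rec_curve(np.abs(actual - predicted))  # absolute errors
```

Plotting `acc` against `tol` gives the REC curve; a model whose curve rises faster toward 1.0 achieves high accuracy at smaller error tolerances.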

Aims

Although significant information can be retrieved from a readable graph, REC curves alone cannot assess whether the divergence between alternative error distributions constitutes evidence of a statistically significant difference.

Method

In this paper, we propose a graphical procedure that utilizes (a) repetitive permutations and (b) the maximum vertical deviation between two comparative Regression Error Characteristic curves in order to conduct a hypothesis test assessing the statistical significance of the difference between error functions.
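The procedure just described — repeatedly relabelling the pooled errors and recomputing the maximum vertical deviation between the two REC curves — can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the authors' code, and the sample errors are synthetic:

```python
import numpy as np

def max_vertical_deviation(err_a, err_b):
    """Maximum vertical gap between the two empirical error CDFs
    (i.e. between the two REC curves): a KS-type statistic."""
    grid = np.sort(np.concatenate([err_a, err_b]))
    cdf_a = np.searchsorted(np.sort(err_a), grid, side="right") / len(err_a)
    cdf_b = np.searchsorted(np.sort(err_b), grid, side="right") / len(err_b)
    return float(np.max(np.abs(cdf_a - cdf_b)))

def rec_permutation_test(err_a, err_b, n_perm=1000, seed=0):
    """Permutation test: under H0 the two error samples are exchangeable,
    so random relabellings of the pooled errors approximate the null
    distribution of the maximum vertical deviation."""
    rng = np.random.default_rng(seed)
    observed = max_vertical_deviation(err_a, err_b)
    pooled = np.concatenate([err_a, err_b])
    n_a = len(err_a)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if max_vertical_deviation(pooled[:n_a], pooled[n_a:]) >= observed:
            exceed += 1
    p_value = (exceed + 1) / (n_perm + 1)  # add-one correction
    return observed, p_value

# Hypothetical absolute errors from two competing cost models
rng = np.random.default_rng(42)
err_model_a = rng.lognormal(mean=2.0, sigma=1.0, size=30)
err_model_b = rng.lognormal(mean=4.0, sigma=1.0, size=30)
d, p = rec_permutation_test(err_model_a, err_model_b)
```

A small `p` indicates that the observed gap between the two REC curves is unlikely under the hypothesis that both models produce errors from the same distribution; the exact paper uses this statistic on software cost datasets rather than synthetic data.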

Results

In our case studies, the data come from software projects and the models compared are cost prediction models. The results clearly show that the proposed statistical test is needed to assess whether the superiority of a prediction model is significant, since it provides an objective criterion for the distances between REC curves. Moreover, the procedure can easily be applied to any dataset where the objective is to predict a response variable of interest and to compare alternative prediction techniques in order to select the best strategy.

Conclusions

The proposed hypothesis test, accompanied by an informative graphical tool, is more easily interpretable than conventional parametric and non-parametric statistical procedures. Moreover, it is free of normality assumptions on the error distributions, which matters when the samples are small and highly skewed. Finally, the proposed graphical test can be applied to comparisons of any alternative prediction methods and models, and within any other validation procedure.




Abbreviations

REC:

Regression Error Characteristic

SCE:

Software Cost Estimation

CDF:

Cumulative Distribution Function

KS:

Kolmogorov-Smirnov

MRE:

Magnitude of Relative Error

AE:

Absolute Error

MMRE:

Mean of MRE

MdMRE:

Median of MRE

MAE:

Mean of AE

MdAE:

Median of AE

LS:

Least Squares

EbA:

Estimation by Analogy

LSEbA:

Combination of LS and EbA

ISBSG:

International Software Benchmarking Standards Group


Author information

Corresponding author

Correspondence to Lefteris Angelis.

Additional information

Editor: Martin Shepperd and Tim Menzies


About this article

Cite this article

Mittas, N., Angelis, L. A permutation test based on regression error characteristic curves for software cost estimation models. Empir Software Eng 17, 34–61 (2012). https://doi.org/10.1007/s10664-011-9177-5

