
Finding conclusion stability for selecting the best effort predictor in software effort estimation

Published in Automated Software Engineering

Abstract

Background: Conclusion instability in software effort estimation (SEE) refers to the inconsistent results produced by a diversity of predictors across different datasets. It stems largely from the “ranking instability” problem, which is closely tied to the evaluation criteria and to the subset of the data being used.

Aim: To determine stable rankings of different predictors.

Method: 90 predictors are used with 20 datasets and evaluated using 7 performance measures; the results are subjected to a Wilcoxon rank test at 95% confidence. These results are called the “aggregate results”. The aggregate results are then challenged by a sanity check, which focuses on a single error measure (the magnitude of relative error, MRE) and uses a newly developed evaluation algorithm called CLUSTER. These results are called the “specific results”.
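To make the comparison step concrete, here is a minimal sketch of how two predictors might be compared on a single dataset, using MRE as the error measure and a paired Wilcoxon signed-rank test at 95% confidence (one reasonable reading of the paper's "Wilcoxon rank test"; the effort values below are hypothetical, not from the study's datasets):

```python
# Minimal sketch (not the paper's exact pipeline): compare two effort
# predictors on one dataset via MRE and a Wilcoxon signed-rank test.
import numpy as np
from scipy.stats import wilcoxon

def mre(actual, predicted):
    """Magnitude of relative error: |actual - predicted| / actual."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return np.abs(actual - predicted) / actual

# Hypothetical actual efforts (person-months) and two predictors' estimates.
actual = np.array([120.0, 45.0, 300.0, 80.0, 15.0, 62.0])
pred_a = np.array([100.0, 50.0, 280.0, 95.0, 12.0, 70.0])  # e.g. a regression tree
pred_b = np.array([180.0, 30.0, 410.0, 60.0, 25.0, 40.0])  # e.g. a linear model

mre_a, mre_b = mre(actual, pred_a), mre(actual, pred_b)

# Paired test on per-project MREs; reject equality when p < 0.05.
stat, p = wilcoxon(mre_a, mre_b)
if p < 0.05:
    winner = "A" if mre_a.mean() < mre_b.mean() else "B"
    print(f"Predictor {winner} ranks better (p = {p:.3f})")
else:
    print(f"No significant difference (p = {p:.3f})")
```

Over 90 predictors, such pairwise wins and losses can be aggregated into a ranking per dataset and per error measure.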

Results: The aggregate results show that: (1) it is now possible to draw stable conclusions about the relative performance of SEE predictors; (2) regression trees or analogy-based methods are the best performers. The aggregate results are also confirmed by the specific results of the sanity check.

Conclusion: This study offers a means to address the conclusion instability issue in SEE, which is an important finding for empirical software engineering.


Notes

  1. http://promisedata.org/data.

References

  • Albrecht, A., Gaffney, J.: Software function, source lines of code and development effort prediction: a software science validation. IEEE Trans. Softw. Eng. 9, 639–648 (1983)

  • Alpaydin, E.: Introduction to Machine Learning. MIT Press, Cambridge (2004)

  • Auer, M., Trendowicz, A., Graser, B., Haunschmid, E., Biffl, S.: Optimal project feature weights in analogy-based cost estimation: improvement and limitations. IEEE Trans. Softw. Eng. 32, 83–92 (2006)

  • Baker, D.: A hybrid approach to expert and model-based effort estimation. Master’s thesis, Lane Department of Computer Science and Electrical Engineering, West Virginia University (2007). Available from https://eidr.wvu.edu/etd/documentdata.eTD?documentid=5443

  • Bakir, A., Turhan, B., Bener, A.B.: A new perspective on data homogeneity in software cost estimation: a study in the embedded systems domain. Softw. Qual. Control 18, 57–80 (2010)

  • Boehm, B.W.: Software Engineering Economics. Prentice Hall PTR, Upper Saddle River (1981)

  • Brady, A., Menzies, T.: Case-based reasoning vs parametric models for software quality optimization. In: International Conference on Predictive Models in Software Engineering (PROMISE’10). IEEE, New York (2010)

  • Breiman, L.: Technical note: some properties of splitting criteria. Mach. Learn. 24, 41–47 (1996). doi:10.1023/A:1018094028462

  • Breiman, L., Friedman, J., Olshen, R., Stone, C.: Classification and Regression Trees. Wadsworth and Brooks, Monterey (1984)

  • Chang, C.-L.: Finding prototypes for nearest neighbor classifiers. IEEE Trans. Comput. C-23(11), 1179–1184 (1974)

  • Fayyad, U.M., Irani, K.B.: Multi-interval discretization of continuous-valued attributes for classification learning. In: Proceedings of the International Joint Conference on Artificial Intelligence, pp. 1022–1027 (1993)

  • Foss, T., Stensrud, E., Kitchenham, B., Myrtveit, I.: A simulation study of the model evaluation criterion MMRE. IEEE Trans. Softw. Eng. 29(11), 985–995 (2003)

  • Gama, J., Pinto, C.: Discretization from data streams: applications to histograms and data mining. In: SAC ’06: Proceedings of the 2006 ACM Symposium on Applied Computing, pp. 662–667. ACM Press, New York (2006). Available from http://www.liacc.up.pt/~jgama/IWKDDS/Papers/p6.pdf

  • Hall, M., Holmes, G.: Benchmarking attribute selection techniques for discrete class data mining. IEEE Trans. Knowl. Data Eng. 15(6), 1437–1447 (2003)

  • Jørgensen, M.: A review of studies on expert estimation of software development effort. J. Syst. Softw. 70(1–2), 37–60 (2004)

  • Jørgensen, M.: Practical guidelines for expert-judgment-based software effort estimation. IEEE Softw. 22(3), 57–63 (2005)

  • Kadoda, G., Cartwright, M., Shepperd, M.: On configuring a case-based reasoning software project prediction system. In: UK CBR Workshop, Cambridge, UK, pp. 1–10 (2000)

  • Kemerer, C.: An empirical validation of software cost estimation models. Commun. ACM 30(5), 416–429 (1987)

  • Keung, J.: Empirical evaluation of analogy-x for software cost estimation. In: ESEM ’08: Proceedings of the Second International Symposium on Empirical Software Engineering and Measurement, pp. 294–296. ACM, New York (2008)

  • Keung, J., Kitchenham, B.: Experiments with analogy-x for software cost estimation. In: ASWEC ’08: Proceedings of the 19th Australian Conference on Software Engineering, pp. 229–238. IEEE Computer Society, Washington (2008)

  • Keung, J.W., Kitchenham, B.A., Jeffery, D.R.: Analogy-x: providing statistical inference to analogy-based software cost estimation. IEEE Trans. Softw. Eng. 34(4), 471–484 (2008)

  • Kirsopp, C., Shepperd, M., Premraj, R.: Case and feature subset selection in case-based software project effort prediction. In: Research and Development in Intelligent Systems XIX: Proceedings of ES2002, the Twenty-Second SGAI International Conference on Knowledge Based Systems and Applied Artificial Intelligence, p. 61 (2003)

  • Kirsopp, C., Shepperd, M.J.: Making inferences with small numbers of training sets. IEE Proc., Softw. 149(5), 123–130 (2002)

  • Kitchenham, B., Känsälä, K.: Inter-item correlations among function points. In: ICSE ’93: Proceedings of the 15th International Conference on Software Engineering, pp. 477–480. IEEE Computer Society Press, Los Alamitos (1993)

  • Kitchenham, B., Mendes, E., Travassos, G.H.: Cross versus within-company cost estimation studies: a systematic review. IEEE Trans. Softw. Eng. 33(5), 316–329 (2007)

  • Kleijnen, J.: Sensitivity analysis and related analyses: a survey of statistical techniques. J. Stat. Comput. Simul. 57(1–4), 111–142 (1997)

  • Li, J., Ruhe, G.: A comparative study of attribute weighting heuristics for effort estimation by analogy. In: Proceedings of the 2006 ACM/IEEE International Symposium on Empirical Software Engineering, p. 74 (2006)

  • Li, J., Ruhe, G.: Decision support analysis for software effort estimation by analogy. In: International Conference on Predictive Models in Software Engineering PROMISE’07, May (2007)

  • Li, Y., Xie, M., Goh, T.: A study of project selection and feature weighting for analogy based software cost estimation. J. Syst. Softw. 82, 241–252 (2009)

  • Lipowezky, U.: Selection of the optimal prototype subset for 1-nn classification. Pattern Recognit. Lett. 19, 907–918 (1998)

  • Maxwell, K.D.: Applied Statistics for Software Managers. Prentice Hall PTR, Upper Saddle River (2002)

  • Mendes, E., Watson, I.D., Triggs, C., Mosley, N., Counsell, S.: A comparative study of cost estimation models for web hypermedia applications. Empir. Softw. Eng. 8(2), 163–196 (2003)

  • Menzies, T., Jalali, O., Hihn, J., Baker, D., Lum, K.: Stable rankings for different effort models. Autom. Softw. Eng. 17, 409–437 (2010)

  • Milicic, D., Wohlin, C.: Distribution patterns of effort estimations. In: EUROMICRO, pp. 422–429 (2004)

  • Miyazaki, Y., Terakado, M., Ozaki, K., Nozaki, H.: Robust regression for developing software estimation models. J. Syst. Softw. 27(1), 3–16 (1994)

  • Myrtveit, I., Stensrud, E., Shepperd, M.: Reliability and validity in comparative studies of software prediction models. IEEE Trans. Softw. Eng. 31, 380–391 (2005)

  • Robson, C.: Real World Research: A Resource for Social Scientists and Practitioner-Researchers. Blackwell Publisher Ltd, Oxford (2002)

  • Shepperd, M., Kadoda, G.: Comparing software prediction techniques using simulation. IEEE Trans. Softw. Eng. 27(11), 1014–1022 (2001)

  • Shepperd, M., Schofield, C.: Estimating software project effort using analogies. IEEE Trans. Softw. Eng. 23(11), 736–743 (1997)

  • Shepperd, M., Schofield, C., Kitchenham, B.: Effort estimation using analogy. In: Proceedings of the 18th International Conference on Software Engineering, pp. 170–178 (1996)

  • Walkerden, F., Jeffery, R.: An empirical study of analogy-based software effort estimation. Empir. Softw. Eng. 4(2), 135–158 (1999)

  • Yang, Y., Webb, G.I.: A comparative study of discretization methods for naive-Bayes classifiers. In: Proceedings of PKAW 2002: The 2002 Pacific Rim Knowledge Acquisition Workshop, pp. 159–173 (2002)


Acknowledgements

This research has been funded by the Qatar/West Virginia University research grant NPRP 09-12-5-2-470.

Author information

Corresponding author

Correspondence to Ekrem Kocaguneli.

Appendix: Data Used in This Study

All the data used in this study are available either at http://promisedata.org/data or from the authors. As shown in Fig. 1, we use a variety of different data sets in this research:

  • cocomo*, nasa*: the standard COCOMO data sets, collected with the COCOMO approach (Boehm 1981).
  • desharnais: software projects from Canada, collected with the function points approach.
  • SDR: data from projects of various software companies in Turkey, collected by Softlab, the Bogazici University Software Engineering Research Laboratory (Bakir et al. 2010).
  • albrecht: projects completed at IBM in the 1970s; details are given in Albrecht and Gaffney (1983).
  • finnish: originally 40 projects from different companies, with the data collected by a single person. The two projects with missing values are omitted here, hence we use 38 instances. More details can be found in Kitchenham and Känsälä (1993).
  • kemerer: a relatively small data set with 15 instances, whose details can be found in Kemerer (1987).
  • maxwell: Finnish banking software projects from the finance domain. Details of this data set are given in Maxwell (2002).
  • miyazaki: projects developed in COBOL. For details see Miyazaki et al. (1994).
  • telecom: enhancements to a U.K. telecommunication product; details are provided in Shepperd and Schofield (1997).
  • china: various software projects from multiple companies in China.

A short preprocessing sketch for the finnish data follows this list.
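As a concrete illustration of the preprocessing described for the finnish data (40 projects reduced to 38 complete instances), the sketch below loads a dataset and drops incomplete projects. It is illustrative only: the file name finnish.csv and the use of pandas are assumptions, not part of the original study.

```python
# Illustrative sketch, not the study's actual tooling: load a PROMISE-style
# effort dataset and omit projects with missing values.
# "finnish.csv" is a hypothetical local export of the dataset.
import pandas as pd

df = pd.read_csv("finnish.csv")
complete = df.dropna()  # drop any project with a missing value
print(f"kept {len(complete)} of {len(df)} projects")
```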

Cite this article

Keung, J., Kocaguneli, E. & Menzies, T. Finding conclusion stability for selecting the best effort predictor in software effort estimation. Autom Softw Eng 20, 543–567 (2013). https://doi.org/10.1007/s10515-012-0108-5
