Assessing Variation in Development Effort Consistency Using a Data Source with Missing Data

Moses, John; Farrow, Malcolm

doi:10.1007/s11219-004-5261-z

Published: March 2005

Volume 13, pages 71–89, (2005)
Software Quality Journal

Abstract

In this study the authors analyse the International Software Benchmarking Standards Group data repository, Release 8.0. The data repository comprises project data from several different companies. However, the repository contains missing data, which must be handled appropriately; otherwise, inferences drawn from it may be biased and misleading. The authors re-examine a statistical model that explained about 62% of the variability in actual software development effort (Summary Work Effort) and was conditioned on a sample of 339 observations from the repository. This model included the covariates Adjusted Function Points and Maximum Team Size and depended on Language Type (which includes the categories 2nd, 3rd and 4th Generation Languages and Application Program Generators) and Development Type (enhancement, new development and re-development). The authors now use Bayesian inference and the Bayesian statistical simulation program, BUGS, to impute missing data, avoiding deletion of observations with missing Maximum Team Size and increasing the sample size to 616. Provided that imputation does not introduce distributional biases, the accuracy of inferences made from models that fit the data will increase. As a consequence of imputation, models that fit the data and explain about 59% of the variability in actual effort are identified. These models enable new inferences to be made about Language Type and Development Type. The sensitivity of the inferences to alternative distributions for imputing missing data is also considered. Furthermore, the authors consider the impact of these distributions on the explained variability of actual effort and show how valid effort estimates can be derived to improve estimate consistency.

Author information

Authors and Affiliations

School of Computing and Technology, University of Sunderland, UK, SR6 0DD
John Moses & Malcolm Farrow

Corresponding author

Correspondence to John Moses.

About this article

Cite this article

Moses, J., Farrow, M. Assessing Variation in Development Effort Consistency Using a Data Source with Missing Data. Software Qual J 13, 71–89 (2005). https://doi.org/10.1007/s11219-004-5261-z

Issue Date: March 2005
DOI: https://doi.org/10.1007/s11219-004-5261-z
