Abstract
Lines of code metrics are routinely used as measures of software system complexity, programmer productivity, and defect density, and are used to predict both effort and cost. The guidelines for using a direct metric, such as lines of code, as a proxy for a quality factor such as complexity or defect density, or in derived metrics such as cost and effort, are clear. Among other criteria, the direct metric must be linearly related to, and accurately predict, the quality factor, and both properties must be validated through statistical analysis following a rigorous validation methodology. In this paper, we use the ISBSG-10 data set to conduct such an analysis and determine the validity and utility of lines of code as a measure. We find that lines of code fails to meet the specified validity tests and, therefore, has limited utility in derived measures.
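As an illustration only, the two criteria named above (a linear relationship and accurate prediction) could be checked roughly as follows. This is a minimal sketch and not the validation procedure used in the paper: the function names, the use of the mean magnitude of relative error (MMRE) as the accuracy measure, and the thresholds `min_r2` and `max_mmre` are all assumptions made for the example.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mmre(actual, predicted):
    """Mean magnitude of relative error; actual values must be non-zero."""
    return sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


def check_direct_metric(metric, factor, min_r2=0.8, max_mmre=0.25):
    """Check linearity and predictive accuracy of a direct metric.

    min_r2 and max_mmre are hypothetical thresholds chosen for this
    example; they are not taken from the paper.
    """
    r = pearson_r(metric, factor)
    slope, intercept = fit_line(metric, factor)
    predicted = [slope * m + intercept for m in metric]
    error = mmre(factor, predicted)
    return {
        "r_squared": r * r,
        "mmre": error,
        "passes": r * r >= min_r2 and error <= max_mmre,
    }
```

Here `metric` would hold a direct measure such as lines of code per project and `factor` the quality factor or derived quantity (for example, effort) for the same projects. A real validation would also evaluate predictions on held-out projects rather than on the data used to fit the line.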
Additional information
Data acquisition for this project was supported by a Research and Development Grant from the School of Graduate Professional Studies, Penn State University.
Cite this article
Barb, A.S., Neill, C.J., Sangwan, R.S. et al. A statistical study of the relevance of lines of code measures in software projects. Innovations Syst Softw Eng 10, 243–260 (2014). https://doi.org/10.1007/s11334-014-0231-5
DOI: https://doi.org/10.1007/s11334-014-0231-5