Abstract
Recent studies on open-source software (OSS) products report that smaller modules are proportionally more defect prone than larger ones. This phenomenon, referred to as the Theory of Relative Defect Proneness (RDP), challenges the traditional QA approaches that give a higher priority to larger modules, and it attracts growing interest from closed-source software (CSS) practitioners. In this paper, we report the findings of a study in which we tested the theory of RDP using ten CSS products. The results clearly confirm the theory of RDP. We also demonstrate the useful practical implications of this theory in terms of defect-detection effectiveness. This study therefore not only makes research contributions by rigorously testing a scientific theory on a different category of software products, but also provides practitioners with useful insights and evidence for revising their existing QA practices.





Notes
It is worth noting that the conclusions drawn in Koru et al. (2007b, 2008, 2009) were not obtained by correlating or plotting size against defect density (defect count divided by size). Such an approach is problematic because it results in artificial ratio correlations, as demonstrated in Rosenberg (1997) and El Emam et al. (2002). Instead, Koru et al. simply examined the functional form of the relationship.
If one wants to show that the Theory of RDP does not hold, the null hypothesis should be stated as “smaller modules are proportionally more defect prone”, and strong statistical evidence should be obtained to reject this null hypothesis safely.
Note that, even though they might look similar, the comparison plots presented in Fig. 4 are different from the concentration curves shown in Fig. 2. In concentration curves, the unique module sizes are identified along the x-axis; whereas, for the comparison plots in Fig. 4, the percentiles of total size are identified along the x-axis where modules are simply sorted from left to right in the increasing order of their size.
Some names are adopted from Jonathan Swift’s 1726 novel, Gulliver’s Travels.
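The caution in the first note about artificial ratio correlations can be illustrated with a small simulation. The sketch below is not based on the study's data: the module sizes and defect counts are hypothetical values drawn independently from uniform distributions chosen only for illustration, and the function names are our own. Even though size and defect count are unrelated by construction, size correlates negatively with defect density, simply because size appears in the denominator of the ratio.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n_modules=5000, seed=0):
    """Return (corr(size, defects), corr(size, defects / size)).

    Assumption for illustration only: sizes and defect counts are drawn
    independently, so any correlation with density is an artifact.
    """
    rng = random.Random(seed)
    sizes = [rng.randint(10, 1000) for _ in range(n_modules)]
    defects = [rng.randint(0, 10) for _ in range(n_modules)]
    density = [d / s for d, s in zip(defects, sizes)]
    return pearson(sizes, defects), pearson(sizes, density)


r_count, r_density = simulate()
print(f"corr(size, defect count)   = {r_count:.3f}")
print(f"corr(size, defect density) = {r_density:.3f}")
```

Under these assumptions the first correlation stays close to zero while the second is clearly negative, which is why a plot of size against defect density cannot by itself support or refute a claim about relative defect proneness.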
References
Akiyama F (1971) An example of software system debuggings. In: Information processing 71, proceedings of IFIP congress 71, vol 1, pp 353–359
Andersson C, Runeson P (2007) A replicated quantitative analysis of fault distributions in complex software systems. IEEE Trans Softw Eng 33(5):273–286
Basili VR, Perricone BT (1984) Software errors and complexity: an empirical investigation. Commun ACM 27(1):42–52
Briand LC, Wüst J, Lounis H (2001) Replicated case studies for investigating quality factors in object oriented designs. J Syst Softw 56(1):11–58
Briand LC, Melo WL, Wüst J (2002) Assessing the applicability of fault-proneness models across object-oriented software projects. IEEE Trans Softw Eng 28(7):706–720
Chayes F (1971) Ratio correlation: a manual for students of petrology and geochemistry. University of Chicago Press, Chicago, IL
Cox DR (1972) Regression models and life tables. J R Stat Soc 34:187–220
Crouchley R, Pickles A (1993) A specification test for univariate and multivariate proportional hazards models. Biometrics 49:1067–1076
El Emam K (2005) The ROI from software quality. Auerbach Publications, Taylor and Francis Group, LLC, Boca Raton, FL
El Emam K, Koru G (2008) A replicated survey of IT software project failure rates. IEEE Softw 25(5):84–90. ISSN 0740-7459. doi:10.1109/MS.2008.107
El Emam K, Benlarbi S, Goel N, Rai SN (2001) The confounding effect of class size on the validity of object-oriented metrics. IEEE Trans Softw Eng 27(7):630–650
El Emam K, Benlarbi S, Goel N, Melo W, Lounis H, Rai SN (2002) The optimal class size for object-oriented software. IEEE Trans Softw Eng 28(5):494–509
Fenton N, Pfleeger SL (1996) Software metrics: a rigorous and practical approach, 2nd edn. PWS Publishing
Fenton NE, Neil M (1999) A critique of software defect prediction models. IEEE Trans Softw Eng 25(5):675–689
Fenton NE, Ohlsson N (2000) Quantitative analysis of faults and failures in a complex software system. IEEE Trans Softw Eng 26(8):797–814
Funami Y, Halstead MH (1976) A software physics analysis of Akiyama’s debugging data. In: Proceedings of MRI XXIV international symposium on computer software engineering, pp 133–138
Gaffney JE (1984) Estimating the number of faults in code. IEEE Trans Softw Eng 10(4):459–465
Halstead MH (1977) Elements of software science. Elsevier, Amsterdam, The Netherlands
Hamer PG, Frewin GD (1982) M.H. Halstead’s software science - a critical examination. In: ICSE ’82: proceedings of the 6th international conference on software engineering, pp 197–206
Harrell FE (2001) Regression modeling strategies: with applications to linear models, logistic regression, and survival analysis. Springer, Berlin Heidelberg New York
Harrell FE (2005) Design: design package. R package version 2.0-12
Hatton L (1997) Reexamining the fault density-component size connection. IEEE Softw 14(2):89–97
Hatton L (1998) Does OO sync with how we think? IEEE Softw 15(3):46–54
Huber PJ (1967) The behavior of maximum likelihood estimates under nonstandard conditions. In: Proceedings of the fifth Berkeley symposium in mathematical statistics, vol 1, pp 221–233. University of California Press, Berkeley, CA
Kakwani N, Wagstaff A, Van Doorslaer E (1997) Socioeconomic inequalities in health: Measurement, computation, and statistical inference. J Econom 77:87–103
Khoshgoftaar TM, Szabo RM (1996) Using neural networks to predict software faults during testing. IEEE Trans Reliab 45(3):456–462
Khoshgoftaar TM, Allen EB, Hudepohl J, Aud SJ (1997) Applications of neural networks to software quality modeling of a very large telecommunications system. IEEE Trans Neural Netw 8(4):902–909
Koru G, El Emam K, Neisa A, Umarji M (2007a) A survey of quality assurance practices in biomedical open-source software projects. J Med Internet Res 9(2). doi:10.2196/jmir.9.2.e8. http://www.jmir.org/2007/2/e8
Koru G, Zhang D, Liu H (2007b) Modeling the effect of size on defect proneness for open-source software. In: PROMISE ’07: proceedings of the third international workshop on predictor models in software engineering. IEEE Computer Society, Washington, DC, USA, pp 115–124. ISBN 0-7695-2954-2. doi:10.1109/PROMISE.2007.9
Koru G, Zhang D, El Emam K, Liu H (2009) An investigation into the functional form of the size-defect relationship for software modules. IEEE Trans Softw Eng 35(2):293–304. ISSN 0098-5589. doi:10.1109/TSE.2008.90
Koru G, El Emam K, Zhang D, Liu H, Mathew D (2008) Theory of relative defect proneness. Empir Softw Eng 13(5):473–498. ISSN 1382-3256. doi:10.1007/s10664-008-9080-x
Lindsay RM, Ehrenberg ASC (1993) The design of replicated studies. Am Stat 47(3):217–228
Lipow M (1982) Number of faults per line of code. IEEE Trans Softw Eng 8(4):437–439
Mockus A, Fielding RT, Herbsleb J (2000) A case study of open source software development: the Apache server. In: Proceedings of the 22nd international conference on software engineering, ICSE 2000, pp 263–272
Mockus A, Fielding RT, Herbsleb J (2002) Two case studies of open source software development: Apache and Mozilla. ACM Trans Softw Eng Methodol 11(3):309–346
NASA IV&V Facility (2009, archival time) Metrics data program. Archived at http://www.webcitation.org/5fzQE6wVI
Ostrand TJ, Weyuker EJ, Bell RM (2005) Predicting the location and number of faults in large software systems. IEEE Trans Softw Eng 31(4):340–355
Ottenstein LM (1979) Quantitative estimates of debugging requirements. IEEE Trans Softw Eng 5(5):504–514
Popper K (1959) The logic of scientific discovery. Routledge classics, 2nd edn. Reprint, 2007
Porter A, Votta L (1998) Comparing detection methods for software requirements inspections: a replication using professional subjects. Empir Softw Eng 3(4):355–379
Porter AA, Selby RW (1990) Empirically guided software development using metric-based classification trees. IEEE Softw 7(2):46–54
Promise (2007) Promise data repository
Raymond ES (1999) The cathedral and the bazaar: musings on Linux and open source by an accidental revolutionary. O’Reilly and Associates, Sebastopol, CA, 95472, USA
Rosenberg J (1997) Some misconceptions about lines of code. In: METRICS ’97: proceedings of the 4th international symposium on software metrics. IEEE Computer Society, Washington, DC, USA, pp 137–142
Selby RW, Basili VR (1991) Analyzing error-prone system structure. IEEE Trans Softw Eng 17(2):141–152
Shen VY, Yu TJ, Thebaut SM, Paulsen L (1985) Identifying error-prone software—an empirical study. IEEE Trans Softw Eng 11(4):317–324
Shull F, Carver J, Travassos GH, Maldonado JC, Conradi R, Basili VR (2003) Replicated studies: building a body of knowledge about software reading techniques. In: Lecture notes on empirical software engineering. World Scientific Publishing Co, Inc, pp 39–84
Thayer R, Lipow M, Nelson E (1978) Software reliability. North-Holland
Therneau TM, Grambsch PM (2000) Modeling survival data: extending the Cox model. Springer, Berlin Heidelberg New York
Wagstaff A, Paci P, van Doorslaer E (1991) On the measurement of inequalities in health. Soc Sci Med 33(5):545–557
Zhao L, Elbaum S (2003) Quality assurance under the open source development model. J Syst Softw 66(1):65–75
Acknowledgements
We would like to thank the creators, contributors, and maintainers of the PROMISE and NASA repositories, especially Tim Menzies and Justin Di Stefano, for the clarifications they made about some of the data sets.
Additional information
Editor: Nachiappan Nagappan
Cite this article
Koru, G., Liu, H., Zhang, D. et al. Testing the theory of relative defect proneness for closed-source software. Empir Software Eng 15, 577–598 (2010). https://doi.org/10.1007/s10664-010-9132-x
DOI: https://doi.org/10.1007/s10664-010-9132-x