Abstract
Estimating thresholds for software metrics is a key step towards assigning a quality index. In defect prediction, two approaches are widely used: those based on statistics and those that use rigorous mathematical models. Although both have yielded significant insights, there is still no general consensus on their results. Against this background, we examine whether any relationship exists between the two approaches. We carry out an empirical investigation of the relationship between threshold values estimated at various risk levels using Bender’s approach and measures of central tendency, using the Apache Click web application. The effect of these different threshold estimates on the performance of the resulting defect prediction models is also studied and validated using different releases of the dataset. We find that the threshold obtained from representational models such as Bender’s is closely related to the median value of the dataset. This close association between the model and the statistical parameters stems mainly from the underlying characteristics of the dataset itself. Descriptive statistical analysis shows that all metrics in the Apache Click dataset are positively skewed, and hence the median is the most relevant measure of central tendency for threshold estimation. Additionally, we find that as the risk level increases, the threshold value shifts from the median towards the mean of the underlying metric data. Our proposition that the defect prediction model performs best when threshold estimates are closer to the median is also verified through an inter-version project comparison.
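As an illustration of the comparison described in the abstract, the following minimal sketch computes Bender's Value of an Acceptable Risk Level (VARL) from a univariate logistic regression, VARL = (ln(p0 / (1 − p0)) − α) / β, and sets it beside the median and mean of a metric. The coefficients `alpha` and `beta`, the risk levels, and the sample `metric_values` are hypothetical placeholders, not values from the Apache Click study.

```python
import math
import statistics


def varl(alpha: float, beta: float, p0: float) -> float:
    """Bender's Value of an Acceptable Risk Level for a univariate
    logistic model P(defect | x) = 1 / (1 + exp(-(alpha + beta * x))).

    Returns the metric value x at which the predicted defect
    probability equals the chosen risk level p0.
    """
    if not 0.0 < p0 < 1.0:
        raise ValueError("p0 must lie strictly between 0 and 1")
    if beta == 0.0:
        raise ValueError("beta must be non-zero")
    return (math.log(p0 / (1.0 - p0)) - alpha) / beta


# Hypothetical, positively skewed metric values (illustration only).
metric_values = [1, 2, 2, 3, 3, 4, 5, 6, 9, 15, 30]

# Hypothetical logistic regression coefficients (illustration only).
alpha, beta = -2.0, 0.1

median = statistics.median(metric_values)
mean = statistics.mean(metric_values)

# Threshold estimates at several assumed risk levels.
thresholds = {p0: varl(alpha, beta, p0) for p0 in (0.2, 0.3, 0.4, 0.5)}

if __name__ == "__main__":
    print(f"median = {median}, mean = {mean:.2f}")
    for p0, t in thresholds.items():
        print(f"p0 = {p0:.1f}: VARL = {t:.2f}")
```

With a positive `beta`, the VARL grows as the risk level `p0` increases, so the threshold can be compared against the median and mean at each level. Whether it moves from the median towards the mean depends on the fitted coefficients and the data.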
References
Alves TL, Ypma C, Visser J (2010) Deriving metric thresholds from benchmark data. In: IEEE international conference on software maintenance, pp 1–10
Arar OF, Ayan K (2016) Deriving thresholds of software metrics to predict faults on open source software: replicated case studies. Expert Syst Appl 61:106–121
Arora I, Saha AA (2014) Literature review on software defect prediction. In: 2nd International conference on emerging research in computing, information, communication and applications, Bangalore, pp 478–487
Arora I, Tetarwal V, Saha A (2015) Open issues in software defect prediction. In: International conference on information and communication technologies, procedia computer science, vol 46, pp 906–912
Bansiya J, Davis A (2002) A hierarchical model for object-oriented design quality assessment. IEEE Trans Softw Eng 28(1):4–17
Basili V, Briand L, Melo W (1996) A validation of object-oriented design metrics as quality indicators. IEEE Trans Softw Eng 22(10):751–761
Bender R (1999) Quantitative risk assessment in epidemiological studies investigating threshold effects. Biom J 41(3):305–319
Benlarbi S, Emam KE, Goel N, Rai S (2000) Thresholds for object-oriented measures. In: Proceedings of the international symposium software reliability engineering. ISSRE, pp 24–37
Briand L, Wust J, Daly J, Porter D (2000) Exploring the relationship between design measures and software quality in object oriented systems. J Syst Softw 51(3):245–273
Chidamber SR, Kemerer CF (1994) A metrics suite for OO design. IEEE Trans Softw Eng 20(6):476–493
Coleman D, Lowther B, Oman P (1995) The application of software maintainability models in industrial software systems. J Syst Softw 29(1):3–16
Emam KE, Melo W, Machado JC (2001) The prediction of faulty classes using object-oriented design metrics. J Syst Softw 56:63–75
Emam KE, Benlarbi S, Goel N, Melo W, Lounis H, Rai SN (2002) The optimal class size for object-oriented software. IEEE Trans Softw Eng 28(5):494–509
Erni K, Lewerentz C (1996) Applying design-metrics to object-oriented frameworks. In: METRICS 96: proceedings of the 3rd international symposium on software metrics. IEEE Computer Society, Washington, p 64
Fawcett T (2004) ROC graphs: notes and practical considerations for researchers. In: Machine learning, pp 31–35
Fawcett T (2006) An introduction to ROC analysis. Pattern Recognit Lett 27:861–874
Ferreira KAM, Bigonha MAS, Bigonha RS, Mendes LFO, Almeida HC (2012) Identifying thresholds for object-oriented software metrics. J Syst Softw 85(2):244–257
French VA (1999) Establishing software metric thresholds. In: International workshop on software measurement
Genkin A, Lewis DD, Madigan D (2004) Large-scale bayesian logistic regression for text categorization. Technical report, DIMACS
Glasberg D, Emam KE, Melo W, Madhavji N (2000) Validating object-oriented design metrics on a commercial java application
Harrison R, Counsell SJ, Nithi RV (1998) An investigation into the applicability and validity of object-oriented design metrics. Empir Softw Eng 3(3):255–273
Henderson-Sellers B (1996) Object-oriented metrics: measures of complexity. Prentice-Hall, Upper Saddle River
Hitz M, Montazeri B (1996) Chidamber and Kemerer’s metrics suite: a measurement theory perspective. IEEE Trans Softw Eng 22(4):267–271
Hosmer D, Lemeshow S (2000) Applied logistic regression, 2nd edn. Wiley-Interscience, New York
Humphrey WS (1997) Introduction to the personal software process. Addison-Wesley Longman Publishing Co. Inc., Boston
Hussain S, Keung J, Khan AA, Bennin KE (2016) Detection of fault-prone classes using logistic regression based object-oriented metrics thresholds. In: Proceedings of IEEE international conference on software quality, reliability and security companion, QRS-C, pp 93–100
Kitchenham BA, Mendes E, Travassos GH (2007) Cross versus within-company cost estimation studies: a systematic review. IEEE Trans Softw Eng 33(5):316–329
Li W, Henry S (1993) Object oriented metrics that predict maintainability. J Syst Softw 23:111–122
Lorenz M, Kidd J (1994) Object-oriented software metrics: a practical guide. Prentice-Hall Inc., Upper Saddle River
Madeyski L, Jureczko M (2014) Which process metrics can significantly improve defect prediction models? An empirical study. Softw Qual J 23:1–30
Malhotra R, Bansal AJ (2015) Fault prediction considering threshold effects of object-oriented metrics. Expert Syst 32(2):203–219
Malhotra R, Pritam N, Nagpal K, Upmanyu P (2014) Defect collection and reporting system for git based open source software. In: Proceedings of international conference on ICDMIC, pp 1–7
McCabe TJ (1976) A complexity measure. IEEE Trans Softw Eng 2(4):308–320
Misra SC, Bhavsar VC (2003) Relationships between selected software measures and latent bug-density: guidelines for improving quality. Springer, New York, p 2003
Nejmeh BA (1988) NPATH: a measure of execution path complexity and its applications. Commun ACM 31(2):188–200
Ronchieri E, Canaparo M (2016) A preliminary mapping study of software metrics thresholds. In: Maciaszek L, Cardoso J, Cabello E, van Sinderen M, Maciaszek L, Ludwig A, Cardoso J (eds) Proceedings of the 11th international joint conference on software technologies, vol 1. SciTePress, pp 232–240
Sanchez-Gonzalez L, Garcia F, Ruiz F, Mendling J (2012) A study of the effectiveness of two threshold definition techniques. In: 16th International conference on evaluation assessment in software engineering, pp 197–205
Sandhu PS, Brar AS, Goel R, Kaur J, Anand SA (2010) Model for early prediction of faults in software systems. In: 2nd International conference on computer and automation engineering, Singapore, pp 281–285
Scotto M, Sillitti A, Succi G, Vernazza T (2004) Dealing with software metrics collection and analysis: a relational approach. Stud Inf Univ 3(3):343–366
Shah SMA, Morisio M, Marco T (2012) An overview of software defect density: a scoping study. In: 19th Asia-Pacific software engineering conference, pp 406–415
Shatnawi R (2010) A quantitative investigation of the acceptable risk levels of OO metrics in open-source systems. IEEE Trans Softw Eng 36(2):216–225
Shatnawi R (2015) Deriving metrics thresholds using log transformation. J Softw Evol Process 27(2):95–113
Shatnawi R, Li W, Swain J, Newman T (2010) Finding software metrics threshold values using ROC curves. J Softw Maint Evolut 22(1):1–16
Tang MH, Kao MH, Chen MH (1999) An empirical study on object-oriented metrics. In: Proceedings of international symposium on software metrics, METRICS, pp 242–250
Zimmermann T, Nagappan N (2008) Predicting defects using network analysis on dependency graphs. In: Proceedings of the international conference on software engineering, pp 531–540
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Malhotra, R., Sharma, A. Estimating the threshold of software metrics for web applications. Int J Syst Assur Eng Manag 10, 110–125 (2019). https://doi.org/10.1007/s13198-019-00773-1
DOI: https://doi.org/10.1007/s13198-019-00773-1