Deriving change-prone thresholds from software evolution using ROC curves

The Journal of Supercomputing

Abstract

Measuring software evolution is necessary to control costs and to develop cost-effective software. Early detection of potential changes gives developers time to plan for them. Simple techniques for detecting the change-proneness of classes, such as thresholds, are needed, particularly in incremental software development. In this study, we propose deriving thresholds for detecting the change-proneness of classes using ROC analysis. The analysis is conducted on the evolution of five systems using the six Chidamber and Kemerer object-oriented metrics. Thresholds are derived over three evolution intervals: 6 months, 12 months, and 3 years. Thresholds are reported for the four metrics found to predict change-proneness, with similar thresholds at the 6- and 12-month intervals. For the same metrics, fault-proneness thresholds are also identified and compared with the change-proneness thresholds. The derived change-proneness thresholds are smaller and flag more classes for further investigation.

Data availability

No datasets were generated or analysed during the current study.


Author information

Contributions

I am the sole author of this research paper.

Corresponding author

Correspondence to Raed Shatnawi.

Ethics declarations

Conflict of interest

The author declares no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Shatnawi, R. Deriving change-prone thresholds from software evolution using ROC curves. J Supercomput 80, 23565–23591 (2024). https://doi.org/10.1007/s11227-024-06366-5
