Bug prediction modeling using complexity of code changes

  • Original Article
International Journal of System Assurance Engineering and Management

Abstract

Researchers have proposed and implemented many bug prediction approaches, expressed as different mathematical models, to measure the reliability growth of software and to predict the latent bugs lying dormant in it. Over the last four decades, software reliability growth models (SRGMs) have been used successfully to measure the reliability growth of closed source software. These SRGMs were based on either calendar time or testing effort. In the late 1990s, advances in communication and internet technologies gave open source software development an edge, and it has proven very successful in different fields. Recently, researchers have measured the latent bugs in open source software using SRGMs that were developed for closed source software and concluded that the existing SRGMs can predict latent bugs well, although more investigation is still needed. In open source software, the source code changes frequently to accommodate new features, feature enhancements and bug repairs (the complexity of code changes). In this paper, we develop two bug prediction models based on the complexity of code changes (entropy), namely (i) time vs entropy and (ii) entropy vs bugs. We compare the proposed models with the existing time vs bugs SRGM. The empirical work was carried out using three subsystems of the Mozilla project. The statistical significance of the different approaches was tested using the non-parametric Kolmogorov–Smirnov (K–S) test. The bug prediction approaches were compared on the basis of several performance measures, namely R-square (R²), adjusted R-square (adj. R²), bias, variation and root mean square prediction error. We found that the approach based on the complexity of code changes, i.e. time vs entropy, performs better than the time vs bugs and entropy vs bugs approaches on the basis of the different comparison criteria and the statistical test.
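The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the formulas, so the sketch assumes Shannon entropy over per-file change counts within a period (the usual formulation in the complexity-of-code-changes literature) and the textbook definitions of R², adjusted R², bias, variation and root mean square prediction error commonly used when comparing SRGMs. All function and parameter names are hypothetical.

```python
import math


def change_entropy(changes_per_file):
    """Shannon entropy (in bits) of how changes are spread across files
    in one period. Assumed formulation, not taken from the paper."""
    total = sum(changes_per_file)
    if total == 0:
        return 0.0
    probs = [c / total for c in changes_per_file if c > 0]
    return -sum(p * math.log2(p) for p in probs)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(actual, predicted, n_params):
    """R-square penalised for the number of model parameters."""
    n = len(actual)
    r2 = r_squared(actual, predicted)
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_params - 1)


def bias(actual, predicted):
    """Mean prediction error (predicted minus actual)."""
    return sum(p - a for a, p in zip(actual, predicted)) / len(actual)


def variation(actual, predicted):
    """Sample standard deviation of the prediction errors."""
    b = bias(actual, predicted)
    n = len(actual)
    squared = sum((p - a - b) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / (n - 1))


def rmspe(actual, predicted):
    """Root mean square prediction error: sqrt(bias^2 + variation^2)."""
    return math.sqrt(bias(actual, predicted) ** 2
                     + variation(actual, predicted) ** 2)


if __name__ == "__main__":
    # Made-up illustrative data: four files changed equally often in a period.
    print(change_entropy([1, 1, 1, 1]))
    # Made-up cumulative bug counts and model predictions.
    observed = [10, 18, 25, 30, 33]
    fitted = [11, 17, 24, 31, 34]
    print(r_squared(observed, fitted), rmspe(observed, fitted))
```

Under these assumptions, entropy is highest when changes are scattered evenly across many files and zero when all changes fall in a single file, which is why it serves as a proxy for the complexity of code changes.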

Author information

Corresponding author

Correspondence to V. B. Singh.

About this article

Cite this article

Singh, V.B., Chaturvedi, K.K., Khatri, S.K. et al. Bug prediction modeling using complexity of code changes. Int J Syst Assur Eng Manag 6, 44–60 (2015). https://doi.org/10.1007/s13198-014-0242-5

  • DOI: https://doi.org/10.1007/s13198-014-0242-5
