Prediction of defect severity by mining software project reports

  • Original Article
International Journal of System Assurance Engineering and Management

Abstract

With ever-increasing demands from software organizations, the rate at which defects are introduced into software cannot be ignored and has become a serious concern. Defects that creep into software vary in severity from mild to catastrophic, and the severity associated with a defect is its most critical aspect. In this paper, we develop models that assign an appropriate severity level (high, medium, low or very low) to the defects described in defect reports. We consider the defect reports from the public-domain PITS dataset (PITS A, PITS C, PITS D and PITS E), which are widely used by NASA’s engineers. Relevant data are extracted from the defect reports using text mining techniques, and prediction models are then built using one statistical method, Multi-nominal Multivariate Logistic Regression (MMLR), and two machine learning methods, Multi-layer Perceptron (MLP) and Decision Tree (DT). The performance of the models was evaluated using receiver operating characteristic (ROC) analysis, and the DT model was observed to perform best, ahead of the MMLR and MLP models.
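The two ends of the pipeline described in the abstract, turning report text into numeric features and scoring a model with ROC analysis, can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the abstract does not say which text mining steps or term weighting were used, so the TF-IDF weighting, the tiny stop-word list, the sample reports and the classifier scores below are all assumptions made for the example.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOP_WORDS = {"the", "a", "an", "in", "of", "to", "is", "and", "on", "when"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def tfidf(docs):
    """Return one {term: weight} dict per document, with weight = tf * log(N / df)."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    weights = []
    for toks in tokenized:
        tf = Counter(toks)
        weights.append(
            {t: (c / len(toks)) * math.log(n_docs / df[t]) for t, c in tf.items()}
        )
    return weights


def auc(scores, labels):
    """Area under the ROC curve for one class versus the rest.

    Computed as the fraction of (positive, negative) pairs in which the
    positive example receives the higher score; ties count as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Made-up defect reports and severity labels, for illustration only.
reports = [
    "Crash in the flight software when telemetry buffer overflows",
    "Typo in the telemetry display label",
    "Flight software crash on startup",
]
severities = ["high", "low", "high"]

features = tfidf(reports)

# Hypothetical scores a classifier might assign to the "high" class.
high_scores = [0.9, 0.2, 0.7]
is_high = [s == "high" for s in severities]
high_auc = auc(high_scores, is_high)
```

With four severity levels, `auc` would be called once per level (that level against all others), giving one ROC summary per class. Terms that occur in every report receive a weight of zero under this weighting, which is why common words contribute little to the feature vectors.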

Author information

Corresponding author

Correspondence to Ruchika Malhotra.

Cite this article

Jindal, R., Malhotra, R. & Jain, A. Prediction of defect severity by mining software project reports. Int J Syst Assur Eng Manag 8, 334–351 (2017). https://doi.org/10.1007/s13198-016-0438-y

  • DOI: https://doi.org/10.1007/s13198-016-0438-y
