Prediction of defect severity by mining software project reports

Jindal, Rajni; Malhotra, Ruchika; Jain, Abha

doi:10.1007/s13198-016-0438-y


Original Article
Published: 10 March 2016

Volume 8, pages 334–351, (2017)

International Journal of System Assurance Engineering and Management

Rajni Jindal¹,
Ruchika Malhotra¹ &
Abha Jain¹

638 Accesses
16 Citations

Abstract

As demands on software organizations grow, the rate at which defects are introduced into software can no longer be ignored. Defects that creep into software vary in severity from mild to catastrophic, and severity is the most critical attribute of a defect. In this paper, we build models that assign an appropriate severity level (high, medium, low, or very low) to the defects described in defect reports. We use defect reports from the public-domain PITS datasets (PITS A, PITS C, PITS D, and PITS E), which are widely used by NASA’s engineers. Relevant data are extracted from the defect reports using text mining techniques, and models are then built using one statistical method, Multi-nominal Multivariate Logistic Regression (MMLR), and two machine learning methods, Multi-layer Perceptron (MLP) and Decision Tree (DT). Model performance was evaluated using receiver operating characteristic (ROC) analysis; the DT model performed best, ahead of the MMLR and MLP models.
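The ROC-based evaluation mentioned in the abstract can be illustrated with a small sketch. The abstract does not say how the ROC analysis was set up for four severity levels, so the one-vs-rest scheme below is an assumption, as are the function names `auc` and `one_vs_rest_auc` and the toy labels and scores, which are made up for illustration and are not taken from the PITS data.

```python
from typing import Dict, List, Sequence


def auc(scores: Sequence[float], is_positive: Sequence[bool]) -> float:
    """Area under the ROC curve via the rank (Mann-Whitney) formulation.

    Equals the probability that a randomly chosen positive report gets a
    higher score than a randomly chosen negative one; ties count as 0.5.
    """
    pos = [s for s, p in zip(scores, is_positive) if p]
    neg = [s for s, p in zip(scores, is_positive) if not p]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def one_vs_rest_auc(
    labels: Sequence[str], class_scores: Dict[str, List[float]]
) -> Dict[str, float]:
    """AUC per severity level: that level is positive, all others negative."""
    return {
        level: auc(scores, [y == level for y in labels])
        for level, scores in class_scores.items()
    }


# Hypothetical toy data: true severity of six reports and, for each
# severity level, the score a model assigned to every report.
labels = ["high", "high", "medium", "low", "low", "medium"]
class_scores = {
    "high":   [0.90, 0.60, 0.70, 0.10, 0.20, 0.30],
    "medium": [0.05, 0.30, 0.20, 0.30, 0.10, 0.60],
    "low":    [0.05, 0.10, 0.10, 0.60, 0.70, 0.10],
}

per_level_auc = one_vs_rest_auc(labels, class_scores)
for level, value in per_level_auc.items():
    print(f"{level}: AUC = {value:.3f}")
```

An AUC of 1.0 means the model ranks every report of that severity above every other report, while 0.5 is no better than chance. Comparing such per-level values across classifiers is one way a claim like "DT performed best" could be checked.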

Author information

Authors and Affiliations

Department of Computer Science Engineering, Delhi Technological University, New Delhi, India
Rajni Jindal, Ruchika Malhotra & Abha Jain

Corresponding author

Correspondence to Ruchika Malhotra.

About this article

Cite this article

Jindal, R., Malhotra, R. & Jain, A. Prediction of defect severity by mining software project reports. Int J Syst Assur Eng Manag 8, 334–351 (2017). https://doi.org/10.1007/s13198-016-0438-y

Received: 20 August 2015
Revised: 01 March 2016
Published: 10 March 2016
Issue Date: June 2017
DOI: https://doi.org/10.1007/s13198-016-0438-y
