Effect of Feature Selection in Software Fault Detection

Tasnim Cynthia, Shamse; Rasul, Md. Golam; Ripon, Shamim

doi:10.1007/978-3-030-33709-4_5


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11909)

Included in the following conference series:

International Conference on Multi-disciplinary Trends in Artificial Intelligence


Abstract

The quality of software is strongly affected by the faults it contains. Detecting faults at the right stage of software development is a challenging task and plays a vital role in software quality. Machine learning is nowadays a commonly used technique for fault detection and prediction. However, the effectiveness of a fault detection mechanism is affected by the number of attributes in the publicly available datasets. Feature selection, the process of selecting the subset of features that most influence the classification, is itself a challenging task. This paper investigates the effect of various feature selection techniques on software fault classification, using several of NASA's publicly available benchmark datasets. Various metrics are used to analyze the performance of the feature selection techniques. The experiments show that the adopted feature selection techniques can select the most important and relevant features without sacrificing fault detection performance.
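As a rough illustration of what filter-style feature selection involves, the sketch below scores each feature against the fault label with a Pearson chi-square statistic and keeps the highest-scoring ones. It is a minimal example under stated assumptions, not the procedure used in the paper: the abstract does not name the selection techniques, and the feature names, the toy data, and the helper functions `chi_square_score` and `select_top_k` are all hypothetical.

```python
from collections import Counter


def chi_square_score(feature, labels):
    """Pearson chi-square statistic for one categorical feature vs. class labels."""
    n = len(labels)
    observed = Counter(zip(feature, labels))
    feature_totals = Counter(feature)
    label_totals = Counter(labels)
    score = 0.0
    for f_val, f_count in feature_totals.items():
        for l_val, l_count in label_totals.items():
            # Expected cell count if feature and label were independent.
            expected = f_count * l_count / n
            diff = observed.get((f_val, l_val), 0) - expected
            score += diff * diff / expected
    return score


def select_top_k(rows, labels, names, k):
    """Rank features by chi-square score and keep the k highest."""
    columns = list(zip(*rows))
    scores = {name: chi_square_score(col, labels)
              for name, col in zip(names, columns)}
    ranked = sorted(names, key=lambda name: scores[name], reverse=True)
    return ranked[:k], scores


# Toy data (hypothetical, NOT taken from the NASA datasets): each row is one
# module, each column a binarised metric; label 1 marks a faulty module.
NAMES = ["high_loc", "high_complexity", "has_comments"]
ROWS = [
    (1, 1, 0),
    (1, 1, 1),
    (1, 0, 0),
    (0, 1, 1),
    (0, 0, 0),
    (0, 0, 1),
    (1, 1, 0),
    (0, 0, 1),
]
LABELS = [1, 1, 0, 1, 0, 0, 1, 0]

SELECTED, SCORES = select_top_k(ROWS, LABELS, NAMES, k=2)
print(SELECTED)
```

In this toy data, `has_comments` is statistically independent of the label and is the feature dropped. A classifier trained on the two retained columns would then be compared with one trained on all three to check whether detection performance holds.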



Author information

Authors and Affiliations

Department of Computer Science and Engineering, East West University, Dhaka, Bangladesh
Shamse Tasnim Cynthia, Md. Golam Rasul & Shamim Ripon


Corresponding author

Correspondence to Shamim Ripon.

Editor information

Editors and Affiliations

Mahasarakham University, Maha Sarakham, Thailand
Rapeeporn Chamchong
Murdoch University, Murdoch, WA, Australia
Kok Wai Wong


About this paper

Cite this paper

Tasnim Cynthia, S., Rasul, M.G., Ripon, S. (2019). Effect of Feature Selection in Software Fault Detection. In: Chamchong, R., Wong, K. (eds) Multi-disciplinary Trends in Artificial Intelligence. MIWAI 2019. Lecture Notes in Computer Science, vol 11909. Springer, Cham. https://doi.org/10.1007/978-3-030-33709-4_5


DOI: https://doi.org/10.1007/978-3-030-33709-4_5
Published: 21 October 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-33708-7
Online ISBN: 978-3-030-33709-4
eBook Packages: Computer Science, Computer Science (R0)
