Hybrid SMOTE-Ensemble Approach for Software Defect Prediction

  • Conference paper
Software Engineering Trends and Techniques in Intelligent Systems (CSOC 2017)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 575)

Abstract

Software defect prediction is the process of identifying new defects/bugs in software modules. A software defect is an error in a computer program caused by incorrect code or incorrect programming logic. Undiscovered defects lead to poor-quality software products. In recent years, software defect prediction has received a considerable amount of attention from researchers. Most previous defect detection algorithms are marred by low defect detection ratios. Furthermore, software defect prediction is a very challenging problem because of the highly imbalanced class distribution, in which bug-free modules far outnumber defective ones. In this paper, the software defect prediction problem is formulated as a classification task, and the impact of several ensemble methods on classification effectiveness is examined. In addition, the best ensemble classifier is selected and trained again on datasets over-sampled with the Synthetic Minority Over-sampling Technique (SMOTE) algorithm to tackle the imbalanced distribution problem. The proposed hybrid method is evaluated using four software defect datasets. Experimental results demonstrate that the proposed method can effectively enhance the defect prediction accuracy.
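
The over-sampling step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It is a simplified, standard-library-only variant of the SMOTE idea, in which each synthetic point is placed at a random position on the segment between a minority sample and one of its k nearest minority neighbours. The function names `smote` and `oversample`, the toy feature values, and the choice to balance the classes exactly are all assumptions made for illustration, and the sketch assumes a binary labelling.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority neighbours."""
    rng = random.Random(seed)
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base point (the point itself is excluded)
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def oversample(X, y, minority_label=1, k=5, seed=0):
    """Return copies of X and y with the minority class topped up to the majority size."""
    minority = [x for x, label in zip(X, y) if label == minority_label]
    n_needed = (len(X) - len(minority)) - len(minority)
    new_points = smote(minority, max(n_needed, 0), k=k, seed=seed)
    return X + new_points, y + [minority_label] * len(new_points)


# Toy data (hypothetical): 8 defect-free modules (label 0), 3 defective (label 1).
# The two features stand in for code metrics such as size and complexity.
X = [[10, 1], [12, 2], [11, 1], [9, 2], [13, 1], [10, 2], [12, 1], [11, 2],
     [40, 9], [42, 8], [45, 10]]
y = [0] * 8 + [1] * 3
X_bal, y_bal = oversample(X, y, k=2)
```

In a pipeline like the one the abstract describes, the balanced data would then be passed to the selected ensemble classifier for retraining. Over-sampling is usually applied to the training split only, so that evaluation still reflects the original class distribution.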

Notes

  1. http://openscience.us/repo/.

Author information

Corresponding author

Correspondence to Ibrahim Aljarah.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Alsawalqah, H., Faris, H., Aljarah, I., Alnemer, L., Alhindawi, N. (2017). Hybrid SMOTE-Ensemble Approach for Software Defect Prediction. In: Silhavy, R., Silhavy, P., Prokopova, Z., Senkerik, R., Kominkova Oplatkova, Z. (eds) Software Engineering Trends and Techniques in Intelligent Systems. CSOC 2017. Advances in Intelligent Systems and Computing, vol 575. Springer, Cham. https://doi.org/10.1007/978-3-319-57141-6_39

  • DOI: https://doi.org/10.1007/978-3-319-57141-6_39

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-57140-9

  • Online ISBN: 978-3-319-57141-6

  • eBook Packages: Engineering, Engineering (R0)
