Understanding machine learning software defect predictions

Automated Software Engineering (2020)

Abstract

Software defects are well known in software development and can cause several problems for users and developers alike. As a result, researchers have employed distinct techniques to mitigate the impact of these defects on the source code. One of the most notable techniques focuses on defect prediction with machine learning methods, which can support developers in handling defects before they reach the production environment. These studies provide alternative approaches to predict the likelihood of defects. However, most of these works concentrate on predicting defects from a vast set of software features. Another key issue in the current literature is the lack of a satisfactory explanation of the reasons that drive the software to a defective state. To address these issues, we use a tree boosting algorithm (XGBoost) that receives as input a training set comprising records of easy-to-compute characteristics of each module and outputs whether the corresponding module is defect-prone. To exploit the link between predictive power and model explainability, we propose a simple model sampling approach that finds accurate models with the minimum set of features. Our principal idea is that features that do not contribute to increasing predictive power should not be included in the model. Interestingly, the reduced set of features helps to increase model explainability, which is important for telling developers which features make each code module more defect-prone. We evaluate our models on diverse projects from the Jureczko datasets, and we show that (i) the features that contribute most to the best models vary depending on the project and (ii) it is possible to find effective models that use few features, leading to better understandability. We believe our results are useful to developers because we identify the specific software features that influence the defectiveness of the selected projects.
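
As a rough sketch of the pipeline the abstract describes, the Python snippet below trains an XGBoost classifier on per-module code metrics and then randomly samples feature subsets of increasing size, keeping the smallest subset whose cross-validated score stays close to the all-features baseline. This is an illustration, not the authors' implementation: the file name jureczko_project.csv, the binary "defective" label column, the sampling budget of 20 subsets per size, and the 0.01 tolerance are all assumptions made for the example.

import random

import pandas as pd
import xgboost as xgb
from sklearn.model_selection import cross_val_score

# Hypothetical CSV: one row per module, Jureczko-style CK metrics plus a
# binary "defective" label (0 = clean, 1 = defect-prone).
df = pd.read_csv("jureczko_project.csv")
feature_cols = [c for c in df.columns if c != "defective"]
X, y = df[feature_cols], df["defective"]

def cv_f1(cols):
    # Mean 5-fold cross-validated F1 of an XGBoost model restricted to `cols`.
    model = xgb.XGBClassifier(n_estimators=100, max_depth=3)
    return cross_val_score(model, X[list(cols)], y, cv=5, scoring="f1").mean()

baseline = cv_f1(feature_cols)

# Sample random subsets from smallest to largest and stop at the first size
# whose sampled model scores within a small tolerance of the baseline.
best_cols, best_score = None, None
for k in range(1, len(feature_cols) + 1):
    for _ in range(20):  # illustrative sampling budget per subset size
        cols = random.sample(feature_cols, k)
        score = cv_f1(cols)
        if score >= baseline - 0.01:
            best_cols, best_score = cols, score
            break
    if best_cols is not None:
        break

print(f"all {len(feature_cols)} features: F1 = {baseline:.3f}")
if best_cols is not None:
    print(f"smallest comparable subset {best_cols}: F1 = {best_score:.3f}")

On the reduced model, per-feature attributions such as SHAP values (see the Lundberg and Lee references below) can then tell developers which metrics push a given module toward the defect-prone class.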

Notes

  1. http://promise.site.uottawa.ca/SERepository/.

References

  • Agrawal, A., Menzies, T.: Is better data better than better data miners? On the benefits of tuning SMOTE for defect prediction. In: International Conference on Software Engineering (ICSE), pp. 1050–1061 (2018)

  • Chen, T., Guestrin, C.: XGBoost: a scalable tree boosting system. In: International Conference on Knowledge Discovery and Data Mining (SIGKDD), pp. 785–794 (2016)

  • Chidamber, S.R., Kemerer, C.F.: A metrics suite for object oriented design. IEEE Trans. Softw. Eng. 20(6), 476–493 (1994)

  • Couto, C., Silva, C., Valente, M.T., Bigonha, R., Anquetil, N.: Uncovering causal relationships between software metrics and bugs. In: 16th European Conference on Software Maintenance and Reengineering (2012)

  • D’Ambros, M., Lanza, M., Robbes, R.: An extensive comparison of bug prediction approaches. In: 7th IEEE Working Conference on Mining Software Repositories (MSR) (2010)

  • Elish, K.O., Elish, M.O.: Predicting defect-prone software modules using support vector machines. J. Syst. Softw. 81(5), 649–660 (2008)

  • Ferenc, R., Tóth, Z., Ladányi, G., Siket, I., Gyimóthy, T.: A public unified bug dataset for Java. In: Proceedings of the 14th International Conference on Predictive Models and Data Analytics in Software Engineering, PROMISE, pp. 12–21 (2018)

  • Fukushima, T., Kamei, Y., McIntosh, S., Yamashita, K., Ubayashi, N.: An empirical study of just-in-time defect prediction using cross-project models. In: Working Conference on Mining Software Repositories (MSR), pp. 172–181 (2014)

  • Ghotra, B., McIntosh, S., Hassan, A.E.: Revisiting the impact of classification techniques on the performance of defect prediction models. In: IEEE/ACM 37th IEEE International Conference on Software Engineering (ICSE), vol 1, pp. 789–800 (2015)

  • Gray, D., Bowes, D., Davey, N., Sun, Y., Christianson, B.: Using the support vector machine as a classification method for software defect prediction with static code metrics. In: International Conference on Engineering Applications of Neural Networks (EANN), pp. 223–234 (2009)

  • Gray, D., Bowes, D., Davey, N., Sun, Y., Christianson, B.: The misuse of the NASA metrics data program data sets for automated software defect prediction. In: 15th Annual Conference on Evaluation and Assessment in Software Engineering (EASE 2011), pp. 96–103 (2011)

  • Halstead, M.H.: Elements of Software Science (Operating and Programming Systems Series). Elsevier Science Inc, Amsterdam (1977)

  • Herbold, S.: Crosspare: a tool for benchmarking cross-project defect predictions. In: 30th IEEE/ACM International Conference on Automated Software Engineering Workshop (ASEW) (2015)

  • Jiang, T., Tan, L., Kim, S.: Personalized defect prediction. In: IEEE/ACM International Conference on Automated Software Engineering (ASE), pp. 279–289 (2013)

  • Jiarpakdee, J., Tantithamthavorn, C., Dam, H.K., Grundy, J.: An empirical study of model-agnostic techniques for defect prediction models. IEEE Trans. Softw. Eng. (2020)

  • Jing, X.Y., Ying, S., Zhang, Z.W., Wu, S.S., Liu, J.: Dictionary learning based software defect prediction. In: International Conference on Software Engineering (ICSE), pp. 414–423 (2014)

  • Jureczko, M., Madeyski, L.: Towards identifying software project clusters with regard to defect prediction. In: Proceedings of the 6th International Conference on Predictive Models in Software Engineering, pp. 9:1–9:10 (2010)

  • Jureczko, M., Spinellis, D.D.: Using object-oriented design metrics to predict software defects. In: Models and Methods of System Dependability, Oficyna Wydawnicza Politechniki Wrocławskiej, pp. 69–81 (2010)

  • Knab, P., Pinzger, M., Bernstein, A.: Predicting defect densities in source code files with decision tree learners. In: Proceedings of the International Workshop on Mining Software Repositories (MSR), pp. 119–125 (2006)

  • Kuhn, M.: Caret: Classification and regression training. http://topepo.github.io/caret/index.html (2015)

  • Lewis, C., Lin, Z., Sadowski, C., Zhu, X., Ou, R., Whitehead Jr., E.J.: Does bug prediction support human developers? Findings from a Google case study. In: International Conference on Software Engineering (ICSE), pp. 372–381 (2013)

  • Lundberg, S.M., Lee, S.I.: Consistent feature attribution for tree ensembles. arXiv:1706.06060 (2017a)

  • Lundberg, S.M., Lee, S.I.: A unified approach to interpreting model predictions. In: Annual Conference on Neural Information Processing Systems (NIPS) (2017b)

  • Lundberg, S.M., Erion, G.G., Lee, S.I.: Consistent individualized feature attribution for tree ensembles. arXiv:1802.03888 (2018)

  • McCabe, T.J.: A complexity measure. IEEE Trans. Softw. Eng. 2(4), 308–320 (1976)

  • McCabe, T.J., Butler, C.W.: Design complexity measurement and testing. Commun. ACM 32(12), 1415–1425 (1989)

  • Menzies, T., Greenwald, J., Frank, A.: Data mining static code attributes to learn defect predictors. IEEE Trans. Softw. Eng. 33(1), 2–13 (2007)

  • Menzies, T., Milton, Z., Turhan, B., Cukic, B., Jiang, Y., Bener, A.: Defect prediction from static code features: current results, limitations, new approaches. Automated Softw. Eng. 17(4), 375–407 (2010)

  • Mori, T., Uchihira, N.: Balancing the trade-off between accuracy and interpretability in software defect prediction. Empir. Softw. Eng. 24, 779–825 (2018)

  • Nagappan, N., Ball, T.: Use of relative code churn measures to predict system defect density. In: Proceedings of the 27th International Conference on Software Engineering (ICSE), pp. 284–292 (2005)

  • Nagappan, N., Ball, T., Zeller, A.: Mining metrics to predict component failures. In: Proceedings of the 28th International Conference on Software Engineering, pp. 452–461 (2006)

  • Petrić, J., Bowes, D., Hall, T., Christianson, B., Baddoo, N.: The jinx on the NASA software defect data sets. In: Proceedings of the 20th International Conference on Evaluation and Assessment in Software Engineering, EASE (2016)

  • Sayyad Shirabad, J., Menzies, T.: The PROMISE Repository of Software Engineering Databases. School of Information Technology and Engineering, University of Ottawa, Canada (2005)

  • Shapley, L.S.: A value for n-person games. In: Kuhn, H.W., Tucker, A.W. (eds.) Annals of Mathematical Studies, pp. 307–317. Princeton University Press, Princeton (1953)

  • Shuai, B., Li, H., Li, M., Zhang, Q., Tang, C.: Software defect prediction using dynamic support vector machine. In: Ninth International Conference on Computational Intelligence and Security, pp. 260–263 (2013)

  • Sokolova, M., Japkowicz, N., Szpakowicz, S.: Beyond accuracy, F-score and ROC: a family of discriminant measures for performance evaluation. In: AI 2006: Advances in Artificial Intelligence, pp. 1015–1021 (2006)

  • Stites, R.L., Ward, B., Walters, R.V.: Defect prediction with neural networks. In: Proceedings of the Conference on Analysis of Neural Network Applications, ANNA, pp. 199–206 (1991)

  • Sun, Z., Li, J., Sun, H.: An empirical study of public data quality problems in cross project defect prediction. Computing Research Repository (CoRR) (2018)

  • Gyimóthy, T., Ferenc, R., Siket, I.: Empirical validation of object-oriented metrics on open source software for fault prediction. IEEE Trans. Softw. Eng. 31(10), 897–910 (2005)

  • Tantithamthavorn, C., Hassan, A.E.: An experience report on defect modelling in practice: pitfalls and challenges. In: International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP), pp. 286–295 (2018)

  • Tantithamthavorn, C., McIntosh, S., Hassan, A.E., Ihara, A., Matsumoto, K.: The impact of mislabelling on the performance and interpretation of defect prediction models. In: International Conference on Software Engineering (ICSE), pp. 812–823 (2015)

  • Tantithamthavorn, C., McIntosh, S., Hassan, A.E., Matsumoto, K.: An empirical comparison of model validation techniques for defect prediction models. IEEE Trans. Softw. Eng. 43(1), 1–18 (2017)

  • Tantithamthavorn, C., McIntosh, S., Hassan, A.E., Matsumoto, K.: The impact of automated parameter optimization for defect prediction models (2018)

  • Thwin, M.M.T., Quah, T.S.: Application of neural networks for software quality prediction using object-oriented metrics. J. Syst. Softw. 76(2), 147–156 (2005)

  • Turhan, B., Bener, A.: Analysis of Naive Bayes’ assumptions on software fault data: an empirical study. Data Knowl. Eng. 68(2), 278–290 (2009)

  • Turhan, B., Menzies, T., Bener, A.B., Di Stefano, J.: On the relative value of cross-company and within-company data for defect prediction. Empir. Softw. Eng. 14(5), 540–578 (2009)

  • Wang, D., Yang, Q., Abdul, A., Lim, B.Y.: Designing theory-driven user-centric explainable AI. In: Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, CHI ’19, pp. 1–15 (2019)

  • Wang, S., Liu, T., Tan, L.: Automatically learning semantic features for defect prediction. In: International Conference on Software Engineering (ICSE), pp. 297–308 (2016)

  • Wang, T., Li, W.H.: Naive Bayes software defect prediction model. In: International Conference on Computational Intelligence and Software Engineering (CiSE), pp. 1–4 (2010)

  • Wohlin, C., Runeson, P., Höst, M., Ohlsson, M.C., Regnell, B., Wesslén, A.: Experimentation in Software Engineering. Springer, Berlin (2012)

  • Xu, Z., Liu, J., Luo, X., Zhang, T.: Cross-version defect prediction via hybrid active learning with kernel principal component analysis. In: International Conference on Software Analysis, Evolution and Reengineering (SANER), pp. 209–220 (2018)

  • Xuan, X., Lo, D., Xia, X., Tian, Y.: Evaluating defect prediction approaches using a massive set of metrics: An empirical study. In: Proceedings of the 30th Annual ACM Symposium on Applied Computing, SAC (2015)

  • Yang, Y., Zhou, Y., Liu, J., Zhao, Y., Lu, H., Xu, L., Xu, B., Leung, H.: Effort-aware just-in-time defect prediction: simple unsupervised models could be better than supervised models. In: Proceedings of the 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering, FSE, pp. 157–168 (2016)

  • Yatish, S., Jiarpakdee, J., Thongtanunam, P., Tantithamthavorn, C.: Mining software defects: should we consider affected releases? In: IEEE/ACM 41st International Conference on Software Engineering (ICSE), pp. 654–665 (2019)

  • Zhang, F., Hassan, A.E., McIntosh, S., Zou, Y.: The use of summation to aggregate software metrics hinders the performance of defect prediction models. IEEE Trans. Softw. Eng. 43(5), 476–491 (2017)

Acknowledgements

We thank the support given by the project Models, Algorithms and Systems for the Web (grant FAPEMIG/PRONEX/MASWeb APQ-01400-14) and the authors’ individual grants and scholarships from CNPq and Fapemig.

Author information

Corresponding author

Correspondence to Geanderson Esteves.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Esteves, G., Figueiredo, E., Veloso, A. et al. Understanding machine learning software defect predictions. Autom Softw Eng 27, 369–392 (2020). https://doi.org/10.1007/s10515-020-00277-4
