An Empirical Study of Model-Agnostic Interpretation Technique for Just-in-Time Software Defect Prediction

  • Conference paper

Abstract

Just-in-time software defect prediction (JIT-SDP) is an effective software quality assurance method whose objective is to identify defective code changes using machine learning. However, existing research focuses only on the predictive power of JIT-SDP models and ignores their interpretability. The need for interpretability of JIT-SDP models arises for two reasons: (1) developers want to understand the decision-making process of a JIT-SDP model and obtain guidance and insights from it; (2) the predictions of a JIT-SDP model affect developers' interests, and under privacy protection laws such prediction models must provide explanations. To this end, we introduce three classifier-agnostic (CA) techniques, LIME, BreakDown, and SHAP, for JIT-SDP models and conduct a large-scale empirical study on six open source projects. The empirical results show that: (1) different instances have different explanations; on average, the feature-ranking difference between two random instances is 3; (2) for a given system, the feature lists and the top-1 feature generated by different CA techniques show strong agreement, whereas the agreement on the top-3 features in the feature-ranking lists is small. In the actual software development process, we suggest using CA techniques to help developers understand the prediction results of the model.
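The core idea shared by the CA techniques in the abstract (LIME, BreakDown, SHAP) is to treat the classifier as a black box and attribute an individual prediction to features by perturbing the input. The sketch below is a minimal, self-contained illustration of that idea in the spirit of BreakDown's one-feature-at-a-time contributions; the model, weights, feature names (`la`, `nf`, `entropy`), and baseline values are hypothetical stand-ins, not the paper's experimental setup.

```python
def predict_defect_risk(change):
    """Hypothetical black-box JIT-SDP model: risk that a code change is defective.
    Toy linear weights over common change-level metrics:
    la = lines added, nf = number of files touched, entropy = change entropy."""
    return min(1.0, 0.004 * change["la"] + 0.05 * change["nf"] + 0.3 * change["entropy"])

def explain_instance(model, instance, baseline):
    """Model-agnostic, one-feature-at-a-time attribution: how much does replacing
    each feature with its baseline value move this prediction? Returns features
    ranked by absolute contribution (exact only for additive models)."""
    full = model(instance)
    contributions = {}
    for feat in instance:
        perturbed = dict(instance, **{feat: baseline[feat]})  # reset one feature
        contributions[feat] = full - model(perturbed)
    return sorted(contributions.items(), key=lambda kv: -abs(kv[1]))

change = {"la": 120, "nf": 3, "entropy": 0.9}    # a risky-looking change
baseline = {"la": 10, "nf": 1, "entropy": 0.1}   # a "typical" change

for feat, contrib in explain_instance(predict_defect_risk, change, baseline):
    print(f"{feat}: {contrib:+.3f}")
```

Because the toy model is additive, the contributions here are exact; LIME instead fits a local surrogate over many random perturbations, and SHAP averages contributions over feature orderings, but all three answer the same per-instance question this sketch poses.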



Acknowledgment

This work was supported by the National Natural Science Foundation of China (No. 61772200) and the Shanghai Natural Science Foundation (No. 21ZR1416300).

Correspondence to Huiqun Yu or Guisheng Fan.


Copyright information

© 2021 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering

About this paper


Cite this paper

Yang, X., Yu, H., Fan, G., Huang, Z., Yang, K., Zhou, Z. (2021). An Empirical Study of Model-Agnostic Interpretation Technique for Just-in-Time Software Defect Prediction. In: Gao, H., Wang, X. (eds) Collaborative Computing: Networking, Applications and Worksharing. CollaborateCom 2021. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 406. Springer, Cham. https://doi.org/10.1007/978-3-030-92635-9_25

  • DOI: https://doi.org/10.1007/978-3-030-92635-9_25

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-92634-2

  • Online ISBN: 978-3-030-92635-9

  • eBook Packages: Computer Science (R0)
