Abstract
Just-in-time software defect prediction (JIT-SDP) is an effective software quality assurance method that uses machine learning to identify defect-inducing code changes. However, existing research focuses only on the predictive power of JIT-SDP models and ignores their interpretability. The need for interpretable JIT-SDP models stems from two sources: (1) developers want to understand the decision-making process of a JIT-SDP model and derive guidance and insights from it; (2) the predictions of a JIT-SDP model affect developers' interests, and under privacy protection laws, prediction models must provide explanations. To this end, we applied three classifier-agnostic (CA) techniques, LIME, BreakDown, and SHAP, to JIT-SDP models and conducted a large-scale empirical study on six open-source projects. The empirical results show that: (1) different instances receive different explanations; on average, the feature rankings of two random instances differ by 3; (2) for a given system, the feature lists and the top-1 feature produced by different CA techniques agree strongly, whereas agreement on the top-3 features in the ranking lists is weak. In practical software development, we suggest using CA techniques to help developers understand the predictions of the model.
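The core idea behind the classifier-agnostic techniques named above is per-instance attribution: for one code change, estimate how much each feature moved the predicted defect probability. As a rough illustration only, the sketch below implements a minimal BreakDown-style additive attribution against a stand-in logistic classifier. All names here (the feature list, weights, and baseline) are hypothetical and not taken from the paper's datasets or from the actual LIME/BreakDown/SHAP libraries.

```python
import math

# Hypothetical change-level metrics; real JIT-SDP studies use metrics such as
# churn and developer experience, but these specific names are illustrative.
FEATURES = ["lines_added", "lines_deleted", "num_files", "developer_experience"]

def predict_proba(x, weights, bias):
    """A logistic model standing in for any trained JIT-SDP classifier."""
    z = bias + sum(w * v for w, v in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))

def breakdown_explain(x, baseline, weights, bias):
    """BreakDown-style additive attribution: switch each feature from its
    baseline value (e.g. the dataset mean) to the instance's value, one at a
    time, and record how much the predicted probability moves. The
    contributions telescope, so they sum to p(x) - p(baseline)."""
    contributions = {}
    current = list(baseline)
    prev = predict_proba(current, weights, bias)
    for i, name in enumerate(FEATURES):
        current[i] = x[i]
        now = predict_proba(current, weights, bias)
        contributions[name] = now - prev
        prev = now
    return contributions

# Example: explain one "code change" instance against an all-zero baseline.
weights = [0.8, 0.3, 0.5, -0.6]   # assumed coefficients of the stand-in model
bias = -1.0
baseline = [0.0, 0.0, 0.0, 0.0]
instance = [2.0, 1.0, 3.0, 0.5]

contribs = breakdown_explain(instance, baseline, weights, bias)
ranking = sorted(contribs, key=lambda f: abs(contribs[f]), reverse=True)
```

Ranking features by the absolute value of their contribution yields the per-instance feature ranking lists whose agreement the study compares across techniques. Note that, unlike this sketch, real BreakDown implementations must also handle interaction effects, where the attribution depends on the order in which features are switched in.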
Acknowledgment
This work was supported by the National Natural Science Foundation of China (No. 61772200) and the Shanghai Natural Science Foundation (No. 21ZR1416300).
Copyright information
© 2021 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering
Cite this paper
Yang, X., Yu, H., Fan, G., Huang, Z., Yang, K., Zhou, Z. (2021). An Empirical Study of Model-Agnostic Interpretation Technique for Just-in-Time Software Defect Prediction. In: Gao, H., Wang, X. (eds) Collaborative Computing: Networking, Applications and Worksharing. CollaborateCom 2021. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 406. Springer, Cham. https://doi.org/10.1007/978-3-030-92635-9_25
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-92634-2
Online ISBN: 978-3-030-92635-9