Estimating Uncertainty in Labeled Changes by SZZ Tools on Just-In-Time Defect Prediction

Authors:

He Jiang

ACM Transactions on Software Engineering and Methodology, Volume 33, Issue 4

Article No.: 105, Pages 1 - 25

https://doi.org/10.1145/3637226

Published: 18 April 2024

Abstract

Just-In-Time (JIT) defect prediction aims to identify defect-prone software changes in a timely manner, thereby improving development efficiency and ensuring software quality. Identifying bug-introducing changes is a critical task in JIT defect prediction, and researchers have introduced the SZZ approach and its variants to label these changes. However, it has been shown that different SZZ algorithms introduce noise into the dataset to some extent, which may reduce the predictive performance of the model. To address this limitation, we propose the Confident Learning Imbalance (CLI) model. The model identifies and excludes samples whose labels may be corrupted by estimating the joint distribution of noisy labels and true labels, thereby mitigating the impact of noisy data on the performance of the prediction model. CLI consists of two components: the Confident Learning (CL) component, which identifies noisy data, and the Imbalanced Data Probabilistic Prediction (IDPP) component, which generates a predicted probability matrix for imbalanced data. The IDPP component generates precise predicted probabilities for each instance in the training set, while the CL component uses the generated predicted probability matrix and the noisy labels to clean up the noise and build a classification model. We evaluate our model through extensive experiments on a total of 126,526 changes from ten Apache open source projects, and the results show that it outperforms the baseline methods.
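To make the idea of estimating the joint distribution of noisy and true labels more concrete, the following is a minimal sketch of generic confident-learning thresholding using only NumPy. It is not the authors' CLI implementation: the function name `find_label_issues_sketch`, the toy probabilities, and the class encoding (0 = clean, 1 = buggy) are assumptions made for illustration. In the pipeline described above, the predicted probability matrix would presumably come from the IDPP component rather than being written by hand.

```python
import numpy as np

def find_label_issues_sketch(noisy_labels, pred_probs):
    """Count a confident joint and flag samples whose given label looks wrong.

    Hypothetical helper for illustration; not taken from the CLI paper.
    """
    noisy_labels = np.asarray(noisy_labels)
    pred_probs = np.asarray(pred_probs, dtype=float)
    n_classes = pred_probs.shape[1]

    # Per-class threshold: mean predicted probability of class k
    # over the samples whose given (noisy) label is k.
    thresholds = np.array([
        pred_probs[noisy_labels == k, k].mean() for k in range(n_classes)
    ])

    # confident_joint[i, j] counts samples labeled i that are
    # confidently predicted to belong to class j.
    confident_joint = np.zeros((n_classes, n_classes), dtype=int)
    suspect = np.zeros(len(noisy_labels), dtype=bool)
    for idx, (given, probs) in enumerate(zip(noisy_labels, pred_probs)):
        above = np.flatnonzero(probs >= thresholds)
        if above.size == 0:
            continue  # not confidently in any class, so leave it unflagged
        likely = above[np.argmax(probs[above])]
        confident_joint[given, likely] += 1
        suspect[idx] = likely != given
    return confident_joint, suspect

# Toy data (assumed): columns are P(clean), P(buggy).
labels = [0, 0, 0, 0, 1, 1]
probs = [
    [0.90, 0.10],
    [0.80, 0.20],
    [0.85, 0.15],
    [0.05, 0.95],  # labeled clean, but predicted buggy with high confidence
    [0.10, 0.90],
    [0.30, 0.70],
]
joint, suspect = find_label_issues_sketch(labels, probs)
print(joint)    # off-diagonal cells estimate label noise
print(suspect)  # True marks samples to exclude before training
```

Under these assumptions, the flagged samples would be dropped from the training set before the final classifier is fitted. A full confident-learning procedure also calibrates the joint counts and offers several pruning strategies, which this sketch omits.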

Cited By

Shahini X, Metzger A, Pohl K, Spinellis D, Constantinou E, Bacchelli A. (2024). An Empirical Study on Just-in-time Conformal Defect Prediction. Proceedings of the 21st International Conference on Mining Software Repositories, 88-99. DOI: 10.1145/3643991.3644928. Online publication date: 15-Apr-2024
https://dl.acm.org/doi/10.1145/3643991.3644928

Index Terms

Estimating Uncertainty in Labeled Changes by SZZ Tools on Just-In-Time Defect Prediction
1. Computing methodologies
  1. Machine learning

Published In

ACM Transactions on Software Engineering and Methodology Volume 33, Issue 4

May 2024

940 pages

EISSN:1557-7392

DOI:10.1145/3613665

Editor:
Mauro Pezzè
USI Università della Svizzera italiana and SIT Schaffhausen Institute of Technology, Switzerland

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 18 April 2024

Online AM: 11 December 2023

Accepted: 01 December 2023

Revised: 29 August 2023

Received: 14 February 2022

Published in TOSEM Volume 33, Issue 4

Qualifiers

Research-article

Funding Sources

National Natural Science Foundation of China
Dalian Excellent Young Project
Postgraduate Education Reform Project of Liaoning Province
Fundamental Research Funds for the Central Universities
Dalian Maritime University Teacher Development Project
China Higher Education Association 2023 Higher Education Scientific Research

Bibliometrics

Total Citations: 1
Total Downloads: 484
Downloads (Last 12 months): 367
Downloads (Last 6 weeks): 41

Reflects downloads up to 28 Feb 2025
