
Studying just-in-time defect prediction using cross-project models


Abstract

Unlike traditional defect prediction models that identify defect-prone modules, Just-In-Time (JIT) defect prediction models identify defect-inducing changes. As such, JIT defect models can provide earlier feedback for developers, while design decisions are still fresh in their minds. Unfortunately, similar to traditional defect models, JIT models require a large amount of training data, which is not available when projects are in initial development phases. To address this limitation in traditional defect prediction, prior work has proposed cross-project models, i.e., models learned from other projects with sufficient history. However, cross-project models have not yet been explored in the context of JIT prediction. Therefore, in this study, we empirically evaluate the performance of JIT models in a cross-project context. Through an empirical study on 11 open source projects, we find that while JIT models rarely perform well in a cross-project context, their performance tends to improve when using approaches that: (1) select models trained using other projects that are similar to the testing project, (2) combine the data of several other projects to produce a larger pool of training data, and (3) combine the models of several other projects to produce an ensemble model. Our findings empirically confirm that JIT models learned using other projects are a viable solution for projects with limited historical data. However, JIT models tend to perform best in a cross-project context when the data used to learn them are carefully selected.
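Below is a minimal sketch, not the authors' implementation, of the three cross-project strategies the abstract describes, using scikit-learn on synthetic change metrics. The project names, the mean-vector similarity measure, and the choice of logistic regression are all illustrative assumptions.

```python
# Illustrative sketch of three cross-project JIT strategies (assumed setup,
# synthetic data; not the paper's actual pipeline or classifier).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def synthetic_project(n=500, shift=0.0):
    """Generate synthetic change metrics and defect-inducing labels."""
    X = rng.normal(loc=shift, scale=1.0, size=(n, 4))
    y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n) > 1.0).astype(int)
    return X, y

# Projects with sufficient history, and a testing project with little history.
projects = {name: synthetic_project(shift=s)
            for name, s in [("A", 0.0), ("B", 0.2), ("C", 0.5)]}
X_test, y_test = synthetic_project(shift=0.1)

# (1) Select the model trained on the most similar other project
#     (here, similarity = negative distance between mean metric vectors).
def similarity(Xa, Xb):
    return -np.linalg.norm(Xa.mean(axis=0) - Xb.mean(axis=0))

best = max(projects, key=lambda p: similarity(projects[p][0], X_test))
m_similar = LogisticRegression().fit(*projects[best])

# (2) Merge the data of several other projects into one larger training pool.
X_all = np.vstack([X for X, _ in projects.values()])
y_all = np.concatenate([y for _, y in projects.values()])
m_merged = LogisticRegression().fit(X_all, y_all)

# (3) Ensemble: average the predicted defect probabilities of per-project models.
models = [LogisticRegression().fit(X, y) for X, y in projects.values()]
prob_ensemble = np.mean([m.predict_proba(X_test)[:, 1] for m in models], axis=0)

for name, prob in [("similar-project", m_similar.predict_proba(X_test)[:, 1]),
                   ("merged-data", m_merged.predict_proba(X_test)[:, 1]),
                   ("ensemble", prob_ensemble)]:
    acc = ((prob > 0.5).astype(int) == y_test).mean()
    print(f"{name}: accuracy = {acc:.2f}")
```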


Notes

  1. https://github.com/github/linguist

  2. http://posl.ait.kyushu-u.ac.jp/Disclosure/emse_jit.html

  3. https://github.com/github/linguist

  4. http://posl.ait.kyushu-u.ac.jp/Disclosure/emse_jit.html

  5. We choose the three approaches that show the best median value in each RQ, i.e., the domain-aware similarity technique, similarity merge (5) using domain-aware similarity, and weighted similarity voting using domain-aware similarity, as well as within-project JIT models as the ideal baseline. We check for differences among the median values of the four models using Tukey's HSD test. If we find no statistically significant difference between the within-project model and a cross-project JIT model, we conclude that there is no evidence of a difference in the performance of within- and cross-project models in those cases (i.e., the cross-project model performs well).
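As an illustration of the comparison described in this note, here is a minimal sketch of a Tukey's HSD test over four models using statsmodels. The model names mirror the note, but the per-project performance scores are synthetic placeholders, not the paper's results.

```python
# Sketch of a Tukey's HSD comparison among four JIT models (assumed setup;
# the scores below are synthetic placeholders, not reported results).
import numpy as np
from statsmodels.stats.multicomp import pairwise_tukeyhsd

rng = np.random.default_rng(1)

# One performance value per testing project (e.g., 11 projects) for each model.
scores = {
    "within-project":           rng.normal(0.75, 0.05, 11),
    "domain-aware-similarity":  rng.normal(0.72, 0.05, 11),
    "similarity-merge":         rng.normal(0.73, 0.05, 11),
    "weighted-similarity-vote": rng.normal(0.74, 0.05, 11),
}

values = np.concatenate(list(scores.values()))
groups = np.repeat(list(scores.keys()), 11)

# Tukey's HSD tests all pairwise differences; pairs with reject == False show
# no evidence of a performance difference between the two models.
print(pairwise_tukeyhsd(values, groups, alpha=0.05).summary())
```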


Acknowledgments

This research was partially supported by JSPS KAKENHI Grant Numbers 15H05306 and 24680003 and the Natural Sciences and Engineering Research Council of Canada (NSERC).

Author information


Correspondence to Yasutaka Kamei or Shane McIntosh.

Additional information

Communicated by: Sunghun Kim and Martin Pinzger


About this article

Cite this article

Kamei, Y., Fukushima, T., McIntosh, S. et al. Studying just-in-time defect prediction using cross-project models. Empir Software Eng 21, 2072–2106 (2016). https://doi.org/10.1007/s10664-015-9400-x

