Abstract
The number of research papers on defect prediction has increased sharply over the last decade or so. One of the main driving forces behind this growth has been publicly available datasets for defect prediction, such as the PROMISE repository, which allow numerous researchers to conduct experiments on defect prediction without having to collect data themselves. However, two potential problems have been largely ignored. First, there is a risk that the knowledge accumulated in the research community will, over time, overfit to the datasets that are repeatedly used across studies. Second, as the software development practices commonly employed in the field evolve, these changes may affect the relation between defect-proneness and software metrics, and such shifts are not reflected in the existing datasets. Both risks can be mitigated to a significant degree if new datasets can be prepared easily. As a step toward that goal, we introduce an open-source software metric tool, SMD (Software Metric tool for Defect prediction), which generates code metrics and process metrics for a given Java software project in a Git repository. In a case study comparing existing datasets with datasets re-generated from the same software projects using our tool, we found that the two are not identical, even though the metric values we obtained conform to the definitions of their corresponding metrics. We learned that there are subtle factors to consider when generating and using metrics for defect prediction.
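The abstract mentions that process metrics can be extracted from a project's Git history. As one illustration of the kind of metric involved, the sketch below counts how many commits touched each Java file (a revision-count process metric, in the spirit of Moser et al.). It parses the textual output of `git log --name-only` rather than invoking Git itself; the function name and input format are illustrative assumptions, not SMD's actual implementation.

```python
from collections import Counter

def count_revisions(git_log: str) -> Counter:
    """Count how many commits touched each .java file.

    `git_log` is assumed to be the output of a command such as
    `git log --name-only --pretty=format:COMMIT`, i.e. one marker
    line per commit followed by the paths changed in that commit.
    (Hypothetical helper for illustration; not SMD's actual code.)
    """
    revisions = Counter()
    for line in git_log.splitlines():
        path = line.strip()
        # Each occurrence of a .java path means one commit modified it.
        if path.endswith(".java"):
            revisions[path] += 1
    return revisions

# Example: two commits, the first touching two files.
sample_log = (
    "COMMIT\n"
    "src/Foo.java\n"
    "src/Bar.java\n"
    "COMMIT\n"
    "src/Foo.java\n"
)
print(count_revisions(sample_log))
```

A real tool must additionally follow file renames (e.g. `git log --follow`) and decide how to attribute changes across release boundaries, which is one source of the subtle dataset differences the paper discusses.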
B. Gabdrakhmanov and A. Tolkachev—These authors contributed equally to the work.
Notes
1. Note that the NASA datasets mentioned in [7] are available in the PROMISE repository.
References
Chidamber SR, Kemerer CF (1994) A metrics suite for object oriented design. IEEE Trans Softw Eng 20(6):476–493
D’Ambros M, Lanza M, Robbes R (2010) An extensive comparison of bug prediction approaches. In: Proceedings of MSR 2010, 7th IEEE working conference on mining software repositories. IEEE CS Press, pp 31–41
Ghotra B, McIntosh S, Hassan AE (2015) Revisiting the impact of classification techniques on the performance of defect prediction models. In: 37th IEEE/ACM international conference on software engineering, ICSE 2015, Florence, Italy, 16–24 May 2015, vol 1, pp 789–800
Jureczko M (2011) Significance of different software metrics in defect prediction. Softw Eng Int J 1(1):86–95
Jureczko M, Madeyski L (2010) Towards identifying software project clusters with regard to defect prediction. In: Proceedings of the 6th international conference on predictive models in software engineering, PROMISE, p 9
Madeyski L, Kawalerowicz M (2017) Continuous defect prediction: the idea and a related dataset. In: Proceedings of the 14th international conference on mining software repositories. IEEE Press, pp 515–518
Malhotra R (2015) A systematic review of machine learning techniques for software fault prediction. Appl Soft Comput 27:504–518
Osman H (2017) An extensive analysis of efficient bug prediction configurations. In: Proceedings of the 13th international conference on predictive models and data analytics in software engineering. ACM, pp 107–116
Parr T (2013) The definitive ANTLR 4 reference, 2nd edn. Pragmatic Bookshelf
Moser R, Pedrycz W, Succi G (2008) A comparative analysis of the efficiency of change metrics and static code attributes for defect prediction. In: Proceedings of ICSE, the ACM/IEEE international conference on software engineering, pp 181–190
Rahman F, Devanbu PT (2013) How, and why, process metrics are better. In: 35th International conference on software engineering (ICSE), pp 432–441
Shepperd MJ, Bowes D, Hall T (2014) Researcher bias: the use of machine learning in software defect prediction. IEEE Trans Softw Eng 40(6):603–616
Shepperd MJ, Song Q, Sun Z, Mair C (2013) Data quality: some comments on the NASA software defect datasets. IEEE Trans Softw Eng 39(9):1208–1215
Śliwerski J, Zimmermann T, Zeller A (2005) When do changes induce fixes? In: ACM SIGSOFT software engineering notes, vol 30. ACM, pp 1–5
Spinellis D (2005) Tool writing: a forgotten art? (software tools). IEEE Softw 22(4):9–11
Varela ASN, Pérez-González HG, Martínez-Perez FE, Soubervielle-Montalvo C (2017) Source code metrics: a systematic mapping study. J Syst Softw 128:164–197
Zimmermann T, Premraj R, Zeller A (2007) Predicting defects for eclipse. In: Proceedings of the third international workshop on predictor models in software engineering
Copyright information
© 2020 Springer Nature Switzerland AG
Cite this paper
Gabdrakhmanov, B., Tolkachev, A., Succi, G., Yi, J. (2020). An Open-Source Software Metric Tool for Defect Prediction, Its Case Study and Lessons We Learned. In: Ciancarini, P., Mazzara, M., Messina, A., Sillitti, A., Succi, G. (eds) Proceedings of 6th International Conference in Software Engineering for Defence Applications. SEDA 2018. Advances in Intelligent Systems and Computing, vol 925. Springer, Cham. https://doi.org/10.1007/978-3-030-14687-0_7
Print ISBN: 978-3-030-14686-3
Online ISBN: 978-3-030-14687-0