Abstract
Context
Bug reports contain information that researchers and practitioners can use to better understand the bug-fixing process and to estimate the effort necessary to fix bugs. In general, estimation models are built using the data (e.g., fixing time, severity, number of comments, number of attachments, and number of patches) present in the reports of fixed bugs (i.e., the report's final state). However, we claim that this approach is not reliable in a real setting. Effort estimation supports bug-fix scheduling and team allocation, tasks that happen closer to the opening of a bug report than to its closing. At that moment, the data available in the bug report is less informative than the data used to build the model, which may lead to unrealistic estimates.
Objective
We propose a new approach to estimate bug-fixing time, i.e., the time span between the moment a bug is first reported and the moment it is considered fixed. Unlike some previous studies, which ignore report updates, we build our estimation model not only from the final state of the bug report but from all of its previously available states. We use the concept of bug report evolution to create a dataset containing all investigated report states.
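The core idea of treating each prior report state as a data point can be sketched as follows. This is a minimal illustration, not the authors' implementation: the field names (`severity`, `num_comments`, `num_attachments`) are a hypothetical subset of JIRA fields, and the update log format is assumed.

```python
from dataclasses import dataclass, replace
from datetime import datetime

@dataclass(frozen=True)
class ReportState:
    # Hypothetical subset of JIRA bug-report fields, for illustration only.
    timestamp: datetime
    severity: str
    num_comments: int
    num_attachments: int

def expand_states(opened, fixed, initial, updates):
    """Replay field updates to produce every intermediate report state,
    each paired with the remaining time (in days) until the fix.
    `updates` is a list of (timestamp, field_name, new_value) tuples."""
    states = [(initial, (fixed - opened).total_seconds() / 86400)]
    current = initial
    for when, field, value in sorted(updates, key=lambda u: u[0]):
        current = replace(current, timestamp=when, **{field: value})
        remaining_days = (fixed - when).total_seconds() / 86400
        states.append((current, remaining_days))
    return states
```

Each snapshot then becomes an independent training example, so a model queried early in a report's life is evaluated against data that actually existed at that point.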
Method
First, we verify how often the bug reports and their fields are updated. Next, we evaluate our approach as a classification problem, using different machine learning methods, distinct output configurations, and class-balancing techniques. The experimental analysis is performed with data from the JIRA issue tracking system of ten open-source projects. By leveraging the best models (considering all possible configurations) for the different states of a bug report's evolution, we can assess whether the report's state leads to significant differences in the models' estimation ability.
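The classification framing described above requires two preprocessing steps: discretizing the continuous fixing time into class labels, and balancing the resulting classes. A minimal sketch of both is shown below; the cut-offs (7 and 30 days) and the use of random oversampling (a dependency-free stand-in for techniques such as SMOTE) are assumptions for illustration, not the paper's actual configuration.

```python
import random
from collections import Counter

def discretize(days, thresholds=(7, 30)):
    """Map a fixing time in days to a class label.
    Hypothetical cut-offs: 'fast' <= 7 days, 'medium' <= 30 days, else 'slow'."""
    if days <= thresholds[0]:
        return "fast"
    if days <= thresholds[1]:
        return "medium"
    return "slow"

def random_oversample(samples, labels, seed=42):
    """Naive class balancing: duplicate minority-class samples until every
    class matches the majority count (a simpler stand-in for SMOTE)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    out_samples, out_labels = [], []
    for label, group in by_class.items():
        resampled = group + [rng.choice(group) for _ in range(target - len(group))]
        out_samples.extend(resampled)
        out_labels.extend([label] * target)
    return out_samples, out_labels
```

Any off-the-shelf classifier can then be trained on the balanced snapshots, and one model can be selected per report state to compare estimation ability across states.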
Results
We gathered evidence that report fields are updated often, which characterizes the reports' evolution and impacts the building of bug-fixing estimation models. The models' evaluation shows promising results, with values ranging from 0.44 up to 0.85, precision values from 0.34 up to 0.74, and recall values from 0.62 up to 0.99, depending on the project.
Conclusions
Our experiments show that field updates have a meaningful impact on the models' performance. Furthermore, we present a new approach to deal with bug report evolution by considering each report version as an independent report. Finally, we make our dataset available to the community.
Notes
An issue could represent a story, a bug, a task, or another issue type in the project.
In Mozilla’s (Firefox) case, it is partly automated, see https://hacks.mozilla.org/2019/04/teaching-machines-to-triage-firefox-bugs/
Communicated by: Foutse Khomh, Gemma Catolino, Pasquale Salza
This article belongs to the Topical Collection: Machine Learning Techniques for Software Quality Evaluation (MaLTeSQuE)
Vieira, R.G., Mattos, C.L.C., Rocha, L.S. et al. The role of bug report evolution in reliable fixing estimation. Empir Software Eng 27, 164 (2022). https://doi.org/10.1007/s10664-022-10213-7