ABSTRACT
During software maintenance, developers often receive many bug reports, and project managers must allocate limited resources to resolve them. To help project managers, past studies have proposed techniques that predict the time that elapses between the submission of a bug report and its resolution. However, this time period might not be representative of the actual development effort, as developers might not work on the bug right away or continuously. In the open source setting, developers are volunteers and might not devote their full working hours to fixing a bug in a particular project. In the industrial setting, developers might be asked to perform various tasks aside from fixing a particular bug.
In this work, we estimate bug fixing effort in terms of code churn size, the number of lines of code that are added, deleted, or modified to fix the bug. Lines of code have traditionally been used to estimate effort. However, no past studies have proposed techniques to automatically predict code churn size. Using code churn size as a proxy for bug fixing effort, we propose a classification-based approach that predicts, given a bug report, whether the bug fixing effort would be high or low. We have evaluated our approach on 1,029 bug reports from hadoop-common and struts2. The result is promising: our approach achieves an Area Under the Receiver Operating Characteristic Curve (AUC) of 0.612 when predicting bug fixing effort in terms of lines of code churned, a 22.4% improvement over a baseline.
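To make the quantities in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that the fix is available as a unified diff, that a modified line is counted as one deletion plus one addition, and that "high" effort means churn above the median of the data set. The abstract does not state how the high/low split is made, so that threshold is purely illustrative. The function names `churn_size`, `label_effort`, and `auc` are hypothetical.

```python
import re
from statistics import median

# Hunk header of a unified diff, e.g. "@@ -1,3 +1,4 @@"; a missing count means 1.
HUNK = re.compile(r"^@@ -\d+(?:,(\d+))? \+\d+(?:,(\d+))? @@")


def churn_size(unified_diff: str) -> int:
    """Count added plus deleted lines inside the hunks of a unified diff.

    Assumption of this sketch: a modified line appears as one deletion
    and one addition, so it contributes 2 to the churn.
    """
    added = deleted = 0
    old_left = new_left = 0  # lines still expected in the current hunk
    for line in unified_diff.splitlines():
        if old_left > 0 or new_left > 0:
            if line.startswith("+"):
                added += 1
                new_left -= 1
            elif line.startswith("-"):
                deleted += 1
                old_left -= 1
            elif line.startswith("\\"):
                pass  # "\ No newline at end of file" marker
            else:
                old_left -= 1  # context line, present in both versions
                new_left -= 1
            continue
        match = HUNK.match(line)
        if match:
            old_left = int(match.group(1) or 1)
            new_left = int(match.group(2) or 1)
    return added + deleted


def label_effort(churns: list[int]) -> list[bool]:
    """Label each bug True (high effort) if its churn exceeds the median.

    The median threshold is an illustrative assumption, not taken from the paper.
    """
    threshold = median(churns)
    return [c > threshold for c in churns]


def auc(scores: list[float], labels: list[bool]) -> float:
    """AUC as the probability that a random high-effort bug is scored
    above a random low-effort bug, with ties counted as one half."""
    pos = [s for s, is_high in zip(scores, labels) if is_high]
    neg = [s for s, is_high in zip(scores, labels) if not is_high]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

In this sketch, a classifier trained on bug report features would supply the `scores` passed to `auc`, and `label_effort` applied to the measured churn sizes would supply the ground-truth `labels`. An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect ranking.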