research-article

Automatic prediction of bug fixing effort measured by code churn size

Author:
Ferdian Thung

Singapore Management University, Singapore


SoftwareMining 2016: Proceedings of the 5th International Workshop on Software Mining, September 2016, Pages 18–23
https://doi.org/10.1145/2975961.2975964

Published: 03 September 2016


ABSTRACT

During software maintenance, developers often receive many bug reports, and project managers must allocate limited resources to resolve them. To help project managers, past studies have proposed techniques that predict the time that elapses between the submission of a bug report and its resolution. However, this period might not represent the actual development effort, as developers might not start work on the bug right away or work on it continuously. In open source development, developers are volunteers and might not devote their full working hours to fixing a bug in a particular project. In industrial settings, developers might be assigned various tasks besides fixing a particular bug.

In this work, we estimate bug fixing effort in terms of code churn size: the number of lines of code that are added, deleted, or modified to fix the bug. Lines of code have traditionally been used to estimate effort, but no past study has proposed a technique to automatically predict code churn size. Using code churn size as a proxy for bug fixing effort, we propose a classification-based approach that predicts, given a bug report, whether the bug fixing effort will be high or low. We evaluated our approach on 1,029 bug reports from hadoop-common and struts2. The results are promising: our approach achieves an Area Under the Receiver Operating Characteristic Curve (AUC) of 0.612 when predicting bug fixing effort in terms of lines of code churned, a 22.4% improvement over a baseline.
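To make the quantities in the abstract concrete, the following is a minimal Python sketch, not the paper's implementation. It counts code churn from a git-style unified diff, turns churn values into high/low labels, and computes AUC for a set of predicted scores. Several choices here are assumptions that the abstract does not state: a modified line is counted as one deletion plus one addition, the high/low cut-off is the median churn, and the function names `churn_size`, `label_effort`, and `auc` are hypothetical.

```python
from statistics import median


def churn_size(git_diff: str) -> int:
    """Count lines added or deleted inside the hunks of a git-style unified diff.

    Assumption: a modified line appears as one deletion plus one addition,
    so it contributes 2 to the total. The paper may count it differently.
    """
    churn = 0
    in_hunk = False
    for line in git_diff.splitlines():
        if line.startswith("diff "):
            in_hunk = False  # new file section: "---"/"+++" headers follow
        elif line.startswith("@@"):
            in_hunk = True  # hunk header: changed lines follow
        elif in_hunk and line.startswith(("+", "-")):
            churn += 1
    return churn


def label_effort(churns):
    """Label each fix 1 (high effort) or 0 (low effort).

    Assumption: the cut-off is the median churn of the data set.
    """
    threshold = median(churns)
    return [1 if c > threshold else 0 for c in churns]


def auc(scores, labels):
    """AUC as the fraction of (high, low) pairs ranked correctly; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Illustrative diff, not taken from hadoop-common or struts2.
EXAMPLE_DIFF = """\
diff --git a/Foo.java b/Foo.java
--- a/Foo.java
+++ b/Foo.java
@@ -1,3 +1,4 @@
 class Foo {
-    int x = 0;
+    int x = 1;
+    int y = 2;
 }
"""

if __name__ == "__main__":
    print(churn_size(EXAMPLE_DIFF))           # one deletion, two additions
    labels = label_effort([3, 10, 50, 200])   # made-up churn values
    print(labels)
    print(auc([0.2, 0.6, 0.4, 0.9], labels))  # made-up classifier scores
```

An AUC of 0.5 corresponds to ranking high-effort and low-effort reports at random. If the reported 22.4% is a relative improvement, it is consistent with a baseline AUC of about 0.5, since 0.5 × 1.224 = 0.612; the abstract does not state the baseline value directly.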

Index Terms

- Computing methodologies
  - Machine learning
    - Machine learning approaches
      - Classification and regression trees
- Software and its engineering
  - Software creation and management
    - Software post-development issues
      - Maintaining software


Published in
SoftwareMining 2016: Proceedings of the 5th International Workshop on Software Mining
September 2016
42 pages
ISBN: 9781450345118
DOI: 10.1145/2975961
General Chairs: Ming Li, Xiaoyin Wang, Lucia
Copyright © 2016 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Bug Fixing Effort
Bug Report
Classification