Research article. DOI: 10.1145/2372251.2372285

Method-level bug prediction

Published: 19 September 2012

Abstract

Researchers have proposed a wide range of approaches to building effective bug prediction models that take multiple aspects of the software development process into account. Such models achieve good prediction performance and guide developers towards those parts of their system where a large share of bugs can be expected. However, most of these approaches predict bugs at file level, which often leaves developers with considerable effort to examine all methods of a file until a bug is located. This problem is reinforced by the fact that large files are typically predicted as the most bug-prone. In this paper, we present bug prediction models at the level of individual methods rather than files. This finer granularity reduces the manual inspection effort for developers. The models are based on change metrics and source code metrics that are typically used in bug prediction. Our experiments, performed on 21 Java open-source (sub-)systems, show that our prediction models reach a precision of 84% and a recall of 88%. Furthermore, the results indicate that change metrics significantly outperform source code metrics.
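The abstract evaluates method-level bug prediction as a binary classification task scored by precision and recall. The sketch below illustrates how those two measures are computed; it uses entirely synthetic data and a naive single-threshold classifier on a change count as a stand-in for the paper's actual models and metrics, so all names and thresholds here are illustrative assumptions, not the authors' pipeline.

```python
import random

random.seed(42)

# Synthetic method-level records: (num_changes, loc, is_buggy).
# Methods changed more often are made more likely to be buggy, loosely
# echoing the finding that change metrics are strong predictors.
dataset = []
for _ in range(1000):
    num_changes = random.randint(0, 30)
    loc = random.randint(5, 300)
    p_bug = min(0.9, 0.05 + 0.03 * num_changes)
    dataset.append((num_changes, loc, random.random() < p_bug))

def predict(num_changes, threshold=12):
    # Flag a method as bug-prone when it was changed often.
    return num_changes >= threshold

tp = fp = fn = 0
for num_changes, loc, is_buggy in dataset:
    pred = predict(num_changes)
    if pred and is_buggy:
        tp += 1       # correctly flagged buggy method
    elif pred and not is_buggy:
        fp += 1       # flagged, but not buggy
    elif not pred and is_buggy:
        fn += 1       # buggy, but missed

precision = tp / (tp + fp)  # share of flagged methods that are buggy
recall = tp / (tp + fn)     # share of buggy methods that were flagged
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

Precision rewards models that do not waste inspection effort on clean methods, while recall rewards models that miss few buggy ones; the paper reports 84% and 88% respectively for its models.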



Published In

ESEM '12: Proceedings of the ACM-IEEE international symposium on Empirical software engineering and measurement
September 2012
338 pages
ISBN:9781450310567
DOI:10.1145/2372251
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. code metrics
  2. fine-grained source code changes
  3. method-level bug prediction

Qualifiers

  • Research-article

Conference

ESEM '12

Acceptance Rates

Overall acceptance rate: 130 of 594 submissions, 22%


Cited By

  • (2025) Line-Level Defect Prediction by Capturing Code Contexts With Graph Convolutional Networks. IEEE Transactions on Software Engineering, 51(1):172-191, Jan 2025. DOI: 10.1109/TSE.2024.3503723
  • (2025) Impact of methodological choices on the analysis of code metrics and maintenance. Journal of Systems and Software, 220:112263, Feb 2025. DOI: 10.1016/j.jss.2024.112263
  • (2024) Carving Out Control Code: Automated Identification of Control Software in Autopilot Systems. ACM Transactions on Cyber-Physical Systems, 8(4):1-20, Nov 2024. DOI: 10.1145/3678259
  • (2024) An Empirical Study on Code Review Activity Prediction and Its Impact in Practice. Proceedings of the ACM on Software Engineering, 1(FSE):2238-2260, Jul 2024. DOI: 10.1145/3660806
  • (2024) On the Impact of Lower Recall and Precision in Defect Prediction for Guiding Search-based Software Testing. ACM Transactions on Software Engineering and Methodology, 33(6):1-27, Jun 2024. DOI: 10.1145/3655022
  • (2024) Machine Learning-based Models for Predicting Defective Packages. Proceedings of the 2024 8th International Conference on Machine Learning and Soft Computing, pages 25-31, Jan 2024. DOI: 10.1145/3647750.3647755
  • (2024) Method-level Bug Prediction: Problems and Promises. ACM Transactions on Software Engineering and Methodology, 33(4):1-31, Jan 2024. DOI: 10.1145/3640331
  • (2024) Versioned Analysis of Software Quality Indicators and Self-admitted Technical Debt in Ethereum Smart Contracts with Ethstractor. 2024 IEEE International Conference on Blockchain, pages 512-519, Aug 2024. DOI: 10.1109/Blockchain62396.2024.00075
  • (2024) The Effectiveness of Hidden Dependence Metrics in Bug Prediction. IEEE Access, 12:77214-77225, 2024. DOI: 10.1109/ACCESS.2024.3406929
  • (2024) Improving effort-aware just-in-time defect prediction with weighted code churn and multi-objective slime mold algorithm. Heliyon, e37360, Sep 2024. DOI: 10.1016/j.heliyon.2024.e37360
