research-article

FLUCCS: using code and change metrics to improve fault localization

Authors:

Shin Yoo

ISSTA 2017: Proceedings of the 26th ACM SIGSOFT International Symposium on Software Testing and Analysis

Pages 273 - 283

https://doi.org/10.1145/3092703.3092717

Published: 10 July 2017

Abstract

Fault localization aims to support the debugging activities of human developers by highlighting the program elements that are suspected to be responsible for the observed failure. Spectrum Based Fault Localization (SBFL), an existing localization technique that only relies on the coverage and pass/fail results of executed test cases, has been widely studied but also criticized for the lack of precision and limited effort reduction. To overcome restrictions of techniques based purely on coverage, we extend SBFL with code and change metrics that have been studied in the context of defect prediction, such as size, age and code churn. Using suspiciousness values from existing SBFL formulas and these source code metrics as features, we apply two learn-to-rank techniques, Genetic Programming (GP) and linear rank Support Vector Machines (SVMs). We evaluate our approach with a ten-fold cross validation of method level fault localization, using 210 real world faults from the Defects4J repository. GP with additional source code metrics ranks the faulty method at the top for 106 faults, and within the top five for 173 faults. This is a significant improvement over the state-of-the-art SBFL formulas, the best of which can rank 49 and 127 faults at the top and within the top five, respectively.
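To make the feature construction described in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It computes two well-known SBFL formulas (Ochiai and Tarantula) from per-method coverage counts, appends two normalised change metrics (age and churn), and ranks methods by a weighted sum. The method names, metric values, and the hand-picked `WEIGHTS` vector are all hypothetical; in the paper the ranking model is learned with GP or a linear rank SVM rather than fixed by hand.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Spectrum:
    """Per-method coverage counts (field names are illustrative)."""
    ef: int   # failing tests that execute the method
    ep: int   # passing tests that execute the method
    nf: int   # failing tests that do not execute the method
    np_: int  # passing tests that do not execute the method


def ochiai(s: Spectrum) -> float:
    """Ochiai suspiciousness: ef / sqrt((ef + nf) * (ef + ep))."""
    denom = math.sqrt((s.ef + s.nf) * (s.ef + s.ep))
    return s.ef / denom if denom else 0.0


def tarantula(s: Spectrum) -> float:
    """Tarantula suspiciousness: failing ratio over failing + passing ratio."""
    fail = s.ef / (s.ef + s.nf) if (s.ef + s.nf) else 0.0
    passed = s.ep / (s.ep + s.np_) if (s.ep + s.np_) else 0.0
    return fail / (fail + passed) if (fail + passed) else 0.0


def feature_vector(s: Spectrum, age: float, churn: float) -> list:
    """SBFL scores plus change metrics, assumed pre-normalised to [0, 1]."""
    return [ochiai(s), tarantula(s), age, churn]


def worst_case_rank(scores: dict, faulty: str) -> int:
    """Rank of the faulty method, counting every tie against it."""
    return sum(1 for v in scores.values() if v >= scores[faulty])


def acc_at_n(ranks: list, n: int) -> int:
    """Number of faults whose faulty method is ranked within the top n."""
    return sum(1 for r in ranks if r <= n)


# Hypothetical toy project: method -> (spectrum, age, churn).
methods = {
    "Parser.parse": (Spectrum(2, 1, 0, 7), 0.1, 0.9),   # recent, high churn
    "Parser.format": (Spectrum(2, 1, 0, 7), 0.9, 0.1),  # old, stable
    "Loader.load": (Spectrum(0, 5, 2, 3), 0.5, 0.5),
}

# Hand-picked stand-in for a learned ranking model (an assumption):
# favour high Ochiai and churn, penalise age.
WEIGHTS = [1.0, 0.0, -0.5, 0.5]

sbfl_only = {m: ochiai(s) for m, (s, _, _) in methods.items()}
combined = {
    m: sum(w * x for w, x in zip(WEIGHTS, feature_vector(s, age, churn)))
    for m, (s, age, churn) in methods.items()
}

faulty = "Parser.parse"  # assumed ground truth for the toy example
rank_sbfl = worst_case_rank(sbfl_only, faulty)
rank_combined = worst_case_rank(combined, faulty)
```

In this toy data the two `Parser` methods have identical coverage, so coverage alone cannot separate them, while the change metrics break the tie. That is the kind of situation the abstract points to when it mentions the limits of techniques based purely on coverage.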

Index Terms

FLUCCS: using code and change metrics to improve fault localization
1. Software and its engineering
  1. Software creation and management
    1. Search-based software engineering

Published In

ISSTA 2017: Proceedings of the 26th ACM SIGSOFT International Symposium on Software Testing and Analysis

July 2017

447 pages

ISBN:9781450350761

DOI:10.1145/3092703

General Chair: Tevfik Bultan, University of California at Santa Barbara, USA

Program Chair: Koushik Sen, University of California at Berkeley, USA

Copyright © 2017 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGSOFT: ACM Special Interest Group on Software Engineering

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

ISSTA '17: International Symposium on Software Testing and Analysis

Sponsor: SIGSOFT

July 10 - 14, 2017

Santa Barbara, CA, USA

Acceptance Rates

Overall Acceptance Rate 58 of 213 submissions, 27%

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 157
Total Downloads: 1,123
Downloads (Last 12 months): 122
Downloads (Last 6 weeks): 15

Reflects downloads up to 02 Mar 2025

Citations

Cited By

Xu H, Wang Z, Zou W (2025). A more accurate bug localization technique for bugs with multiple buggy code files. Information and Software Technology 181 (107675). Online publication date: May-2025.
https://doi.org/10.1016/j.infsof.2025.107675
Wu S, Yang B, Chang Z, Li Z, Chen X, Liu Y (2025). Boosting mutation-based fault localization by effectively generating Higher-Order Mutants. Information and Software Technology 180 (107660). Online publication date: Apr-2025.
https://doi.org/10.1016/j.infsof.2024.107660
Ferreira W, Leitao-Junior P, Machado de Freitas D, Silva-Junior D, Harrison R (2024). Uniqueness of suspiciousness scores: towards boosting evolutionary fault localization. Journal of Software Engineering Research and Development 12:1. Online publication date: 18-Oct-2024.
https://doi.org/10.5753/jserd.2024.3651
Xie H, Lei Y, Li M, Yan M, Zhang S, Filkov V, Ray B, Zhou M (2024). Combining Coverage and Expert Features with Semantic Representation for Coincidental Correctness Detection. Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering (1770-1782). Online publication date: 27-Oct-2024.
https://dl.acm.org/doi/10.1145/3691620.3695542
Zhang X, Song Y, Xie X, Xin Q, Xing C, Filkov V, Ray B, Zhou M (2024). Do not neglect what's on your hands: localizing software faults with exception trigger stream. Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering (982-994). Online publication date: 27-Oct-2024.
https://dl.acm.org/doi/10.1145/3691620.3695479
Liu Y, Zhu M, Dong J, Yu J, Hao D, Filkov V, Ray B, Zhou M (2024). Compiler Bug Isolation via Enhanced Test Program Mutation. Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering (819-830). Online publication date: 27-Oct-2024.
https://dl.acm.org/doi/10.1145/3691620.3695074
Rafi M, d'Amorim M (2024). Enhancing Code Representation for Improved Graph Neural Network-Based Fault Localization. Companion Proceedings of the 32nd ACM International Conference on the Foundations of Software Engineering (686-688). Online publication date: 10-Jul-2024.
https://dl.acm.org/doi/10.1145/3663529.3664459
Rafi M, Kim D, Chen A, Chen T, Wang S (2024). Towards Better Graph Neural Network-Based Fault Localization through Enhanced Code Representation. Proceedings of the ACM on Software Engineering 1:FSE (1937-1959). Online publication date: 12-Jul-2024.
https://dl.acm.org/doi/10.1145/3660793
Wang X, Yu H, Meng X, Cao H, Zhang H, Sun H, Liu X, Hu C (2024). MTL-TRANSFER: Leveraging Multi-task Learning and Transferred Knowledge for Improving Fault Localization and Program Repair. ACM Transactions on Software Engineering and Methodology 33:6 (1-31). Online publication date: 27-Jun-2024.
https://dl.acm.org/doi/10.1145/3654441
Xie M, Hu M, Kong Z, Zhang C, Feng Y, Wang H, Xue Y, Zhang H, Liu Y, Liu Y, Christakis M, Pradel M (2024). DeFort: Automatic Detection and Analysis of Price Manipulation Attacks in DeFi Applications. Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis (402-414). Online publication date: 11-Sep-2024.
https://dl.acm.org/doi/10.1145/3650212.3652137
