DOI: 10.1145/3092703.3092717
research-article

FLUCCS: using code and change metrics to improve fault localization

Published: 10 July 2017

Abstract

Fault localization aims to support the debugging activities of human developers by highlighting the program elements that are suspected to be responsible for the observed failure. Spectrum Based Fault Localization (SBFL), an existing localization technique that only relies on the coverage and pass/fail results of executed test cases, has been widely studied but also criticized for the lack of precision and limited effort reduction. To overcome restrictions of techniques based purely on coverage, we extend SBFL with code and change metrics that have been studied in the context of defect prediction, such as size, age and code churn. Using suspiciousness values from existing SBFL formulas and these source code metrics as features, we apply two learn-to-rank techniques, Genetic Programming (GP) and linear rank Support Vector Machines (SVMs). We evaluate our approach with a ten-fold cross validation of method level fault localization, using 210 real world faults from the Defects4J repository. GP with additional source code metrics ranks the faulty method at the top for 106 faults, and within the top five for 173 faults. This is a significant improvement over the state-of-the-art SBFL formulas, the best of which can rank 49 and 127 faults at the top and within the top five, respectively.
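
The feature construction and the top-n evaluation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes Ochiai as one example of an existing SBFL formula (the abstract does not name the formulas used), and the function names `ochiai`, `feature_vector`, and `acc_at_n`, as well as the dictionary keys for the metrics, are hypothetical. The learn-to-rank step itself (GP or rank SVM) is left out.

```python
import math


def ochiai(ef, ep, nf):
    """Ochiai suspiciousness for one program element (assumed example formula).

    ef: number of failing tests that execute the element
    ep: number of passing tests that execute the element
    nf: number of failing tests that do not execute the element
    """
    denom = math.sqrt((ef + nf) * (ef + ep))
    # An element that no test executes gets a score of 0 by convention here.
    return ef / denom if denom > 0 else 0.0


def feature_vector(spectrum, metrics):
    """Combine an SBFL score with code and change metrics for one method.

    spectrum: (ef, ep, nf) counts for the method
    metrics: dict with hypothetical keys "size", "age", and "churn"
    """
    ef, ep, nf = spectrum
    return [ochiai(ef, ep, nf), metrics["size"], metrics["age"], metrics["churn"]]


def acc_at_n(ranks, n):
    """Count the faults whose faulty method is ranked within the top n.

    ranks: 1-based rank of the faulty method, one entry per fault
    """
    return sum(1 for rank in ranks if rank <= n)
```

A learn-to-rank model would take one such feature vector per method and produce a single ranking score. Calling `acc_at_n` with n = 1 and n = 5 over the resulting ranks corresponds to the "at the top" and "within the top five" counts reported in the abstract.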



    Published In

    ISSTA 2017: Proceedings of the 26th ACM SIGSOFT International Symposium on Software Testing and Analysis
    July 2017
    447 pages
    ISBN: 9781450350761
    DOI: 10.1145/3092703
    • General Chair: Tevfik Bultan
    • Program Chair: Koushik Sen
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Fault Localization
    2. Genetic Programming
    3. SBSE

    Qualifiers

    • Research-article

    Conference

    ISSTA '17
    Acceptance Rates

    Overall Acceptance Rate 58 of 213 submissions, 27%

    Cited By

    • (2025) A more accurate bug localization technique for bugs with multiple buggy code files. Information and Software Technology 181 (107675). DOI: 10.1016/j.infsof.2025.107675. Online publication date: May-2025
    • (2025) Boosting mutation-based fault localization by effectively generating Higher-Order Mutants. Information and Software Technology 180 (107660). DOI: 10.1016/j.infsof.2024.107660. Online publication date: Apr-2025
    • (2024) Uniqueness of suspiciousness scores: towards boosting evolutionary fault localization. Journal of Software Engineering Research and Development 12:1. DOI: 10.5753/jserd.2024.3651. Online publication date: 18-Oct-2024
    • (2024) Combining Coverage and Expert Features with Semantic Representation for Coincidental Correctness Detection. Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering (1770-1782). DOI: 10.1145/3691620.3695542. Online publication date: 27-Oct-2024
    • (2024) Do not neglect what's on your hands: localizing software faults with exception trigger stream. Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering (982-994). DOI: 10.1145/3691620.3695479. Online publication date: 27-Oct-2024
    • (2024) Compiler Bug Isolation via Enhanced Test Program Mutation. Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering (819-830). DOI: 10.1145/3691620.3695074. Online publication date: 27-Oct-2024
    • (2024) Enhancing Code Representation for Improved Graph Neural Network-Based Fault Localization. Companion Proceedings of the 32nd ACM International Conference on the Foundations of Software Engineering (686-688). DOI: 10.1145/3663529.3664459. Online publication date: 10-Jul-2024
    • (2024) Towards Better Graph Neural Network-Based Fault Localization through Enhanced Code Representation. Proceedings of the ACM on Software Engineering 1:FSE (1937-1959). DOI: 10.1145/3660793. Online publication date: 12-Jul-2024
    • (2024) MTL-TRANSFER: Leveraging Multi-task Learning and Transferred Knowledge for Improving Fault Localization and Program Repair. ACM Transactions on Software Engineering and Methodology 33:6 (1-31). DOI: 10.1145/3654441. Online publication date: 27-Jun-2024
    • (2024) DeFort: Automatic Detection and Analysis of Price Manipulation Attacks in DeFi Applications. Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis (402-414). DOI: 10.1145/3650212.3652137. Online publication date: 11-Sep-2024
