SpongeBugs: Automatically generating fix suggestions in response to static code analysis warnings
Introduction
Static code analysis tools (SATs) are becoming increasingly popular as a way of detecting possible sources of defects earlier in the development process (Habib and Pradel, 2018). By working statically on the source or byte code of a project, these tools are applicable to large code bases (Johnson et al., 2013b, Liu et al., 2019), where they quickly search for patterns that may indicate problems – bugs, questionable design choices, or failures to follow stylistic conventions (Barik et al., 2016, Tómasdóttir et al., 2017) – and report them to users. There is evidence (Beller et al., 2016) that using these tools can help developers monitor and improve software code quality; indeed, static code analysis tools are used for both commercial and open-source software development (Marcilio et al., 2019a, Habib and Pradel, 2018, Liu et al., 2019). Some projects’ development rules even require that code has to clear the checks of a certain SAT before it can be released (Marcilio et al., 2019a, Beller et al., 2016, Ayewah et al., 2008).
At the same time, some features of SATs limit their wider applicability in practice. One key problem is that SATs are necessarily imprecise in checking for rule violations; in other words, they report warnings that may or may not correspond to an actual mistake. As a result, the first time a static analysis tool is run on a project, it is likely to report thousands of warnings (Habib and Pradel, 2018, Johnson et al., 2013b), which saturates the developers’ capability of sifting through them to select those that are more relevant and should be fixed (Marcilio et al., 2019a). Another related issue with using SATs in practice is that understanding the problem highlighted by a warning and coming up with a suitable fix is often nontrivial (Marcilio et al., 2019a, Johnson et al., 2013b).
Our research aims at improving the practical usability of SATs by automatically providing fix suggestions: modifications to the source code that make it compliant with the rules checked by the analysis tools. We developed an approach, called SpongeBugs and described in Section 3, whose current implementation works on Java code. SpongeBugs detects violations of 11 different rules checked by SonarQube and SpotBugs (successor to FindBugs (Habib and Pradel, 2018))—two well-known static code analysis tools, routinely used by very many software companies and consortia, including large ones such as the Apache Software Foundation and the Eclipse Foundation. The rules checked by SpongeBugs are among the most widely used in these two tools, and cover different kinds of code issues (ranging from performance, to correct behavior, style, and other aspects). For each violation it detects, SpongeBugs automatically suggests and presents a fix to the user.
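As a concrete illustration of the kind of rule in scope, SonarQube flags Java code that compares strings with `==` instead of `equals()`, since `==` compares object references rather than content. The sketch below is our own example of the before/after shape of such a fix, not actual SpongeBugs output; all class and method names are hypothetical:

```java
public class EqualsFixDemo {
    // Before the fix: '==' compares object identity, not content,
    // so two equal strings can still compare as different.
    static boolean sameRef(String a, String b) {
        return a == b;
    }

    // After the fix: equals() compares content, which is usually what is meant.
    static boolean sameContent(String a, String b) {
        return a != null && a.equals(b);
    }

    public static void main(String[] args) {
        String x = new String("warn");
        String y = new String("warn");
        System.out.println(sameRef(x, y));     // false: two distinct objects
        System.out.println(sameContent(x, y)); // true: equal content
    }
}
```

The fix is purely local and syntactic, which is what makes this class of violation amenable to automatic suggestion.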
By construction, the fixes SpongeBugs suggests remove the origin of a rule’s violation, but the maintainers still have to decide – based on their overall knowledge of the project – whether to accept and merge each suggestion. To assess whether developers are indeed willing to accept SpongeBugs’s suggestions, Section 5 presents the result of an empirical evaluation where we applied it to 12 open-source Java projects, and submitted 946 fix suggestions as pull requests to the projects. Project maintainers accepted 825 (87%) fix suggestions—97% of them without any modifications. This high acceptance rate suggests that SpongeBugs often generates patches of high quality, which developers find adequate and useful.
The empirical evaluation also indicates that SpongeBugs is applicable with good performance to large code bases (1.2 min to process 1,000 lines of code on average). SpongeBugs is also accurate, as it generates false positives (spurious rule violations) in less than 0.6% of all reported violations. We actually found several cases where SpongeBugs correctly detected cases of rule violations that were missed by SonarQube (Section 5.1.1).
To further demonstrate SpongeBugs’s versatility, Section 5 also discusses how SpongeBugs complements program repair tools (e.g., avatar (Liu et al., 2019)) and how it performs on software whose main contributors are non-professionals (i.e., students). With few exceptions – which we discuss throughout Section 5 to inform further progress in this line of work – SpongeBugs worked as intended: it provides sound, easy to apply suggestions to fix static rule violations.
The work reported in this paper is part of a large body of research (see Section 2) that deals with helping developers detect and fix bugs and code smells. SpongeBugs’ approach is characterized by the following features: (i) it targets static rules that correspond to frequent mistakes that are often fixable syntactically; (ii) it builds fix suggestions that, by construction, remove the source of a warning; (iii) it scales to large code bases because it is based on lightweight program transformation techniques. Despite the focus on conceptually simple rule violations, SpongeBugs can generate nontrivial patches, including some that modify multiple hunks of code at once. In summary, SpongeBugs’s focus privileges generating a large number of practically useful fixes over being as broadly applicable as possible. Based on our empirical evaluation, Section 6 discusses the main limitations of SpongeBugs’s approach, and Section 7 outlines directions for further progress in this line of work.
This journal article extends a previous conference publication (Marcilio et al., 2019b) by significantly expanding the empirical evaluation of SpongeBugs with: (1) an extended and updated evaluation of SpongeBugs’ applicability in Section 5.1, using a revised implementation with numerous bug fixes; (2) a detailed analysis of accuracy (false positives and false negatives, in Sections 5.1.2 and 5.1.3); (3) a smaller-scale evaluation involving student projects (Section 5.1.4); (4) an experimental assessment (Section 5.3.2) of how SpongeBugs’s three-stage process trades off a modicum of precision for markedly better performance; and (5) an experimental comparison with the Defects4J curated collection of real-world Java bugs (Section 5.4).
Section snippets
Background and related work
Static analysis techniques reason about program behavior statically, that is without running the program (Nielson et al., 1999). This is in contrast to dynamic analysis techniques, which are instead driven by specific program inputs (provided, for example, by unit tests). Thus, static analysis techniques are often more scalable (because they do not require complete executions) but also less precise (because they over-approximate program behavior to encompass all possible inputs) than dynamic
SpongeBugs: Approach and implementation
SpongeBugs provides fix suggestions for violations of selected rules that are checked by SonarQube and SpotBugs. Section 3.1 discusses how we selected the rules to check and suggest fixes for. SpongeBugs works by means of source-to-source transformations, implemented as we outline in Section 3.2. This approach has advantages but also limitations in its applicability, which we discuss in Section 3.3.
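To give a flavor of what a lightweight source-to-source transformation looks like, the sketch below rewrites `x.size() == 0` into `x.isEmpty()`, in the spirit of SonarQube's rule that `Collection.isEmpty()` should be used to test for emptiness. This is a deliberately naive, regex-based sketch of ours; SpongeBugs itself operates on a proper parse of the source, not on raw text:

```java
public class IsEmptyRewriteSketch {
    // Naive illustration: rewrite 'x.size() == 0' into 'x.isEmpty()'.
    // A real tool must parse the code to avoid false matches
    // (e.g., inside string literals or comments).
    static String rewrite(String source) {
        return source.replaceAll("(\\w+)\\.size\\(\\)\\s*==\\s*0", "$1.isEmpty()");
    }

    public static void main(String[] args) {
        System.out.println(rewrite("if (names.size() == 0) { return; }"));
        // if (names.isEmpty()) { return; }
    }
}
```

Because such rewrites only inspect and replace local patterns, they remain fast even on large code bases, at the cost of handling only violations with a recognizable syntactic shape.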
Empirical evaluation of SpongeBugs: Experimental design
The overall goal of this research is suggesting fixes to warnings generated by static code analysis tools. Section 4.1 presents the research questions we answer in this empirical study, which targets:
- Fifteen open-source projects selected using the criteria we present in Section 4.2;
- Five student projects developed as part of software engineering courses;
- Defects4J: a curated collection of faulty Java programs, widely used to evaluate automated program repair tools (see Section 2).
All data created
Empirical evaluation of SpongeBugs: Results and discussion
The results of our empirical evaluation of SpongeBugs answer the four research questions presented in Section 4.1. For uniformity, the experiments related to RQ1–3 target the 12 projects whose maintainers accepted the pull requests fixing static analysis warnings (top portion of Table 5).
Limitations and threats to validity
Some of SpongeBugs’s transformations may violate a project’s stylistic guidelines (Liu et al., 2018). Take, for example, project primefaces, which uses a rule about the order of variable declarations within a class, requiring that private constants be defined after public constants. SpongeBugs’s fixes for rule C1 (String literals should not be duplicated) may violate this
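A minimal sketch of the C1 fix shape (our own hypothetical names and literals, not actual SpongeBugs output) makes the tension concrete: the fix introduces a private constant among the class’s declarations, and where that declaration lands may conflict with a project’s ordering conventions:

```java
public class C1FixDemo {
    // Before the fix: the same literal is repeated at several sites.
    static String beforeHeader() { return "report/header"; }
    static String beforeTitle()  { return "report/header"; }

    // After the fix: the literal is extracted into a private constant.
    // A project convention requiring private constants to appear after
    // public ones can conflict with where this declaration is inserted.
    private static final String REPORT_HEADER = "report/header";
    static String afterHeader() { return REPORT_HEADER; }
    static String afterTitle()  { return REPORT_HEADER; }

    public static void main(String[] args) {
        // The transformation preserves behavior; only declarations change.
        System.out.println(beforeHeader().equals(afterHeader())); // true
    }
}
```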
Conclusions
In this work we introduced a new approach and a tool (SpongeBugs) that finds and repairs violations of rules checked by static code analysis tools such as SonarQube, FindBugs, and SpotBugs. We designed SpongeBugs to deal with rule violations that are frequently fixed in both private and open-source projects. We assessed SpongeBugs by running it on 12 popular open source projects, and submitted a large portion (total of 946) of the fixes it generated as pull requests in the projects. Overall,
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
We thank the maintainers for reviewing our patches; and the reviewers of SCAM and JSS for their helpful comments. This work was partially supported by CNPq, Brazil (#406308/2016-0 and #309032/2019-9); and by the Swiss National Science Foundation (SNSF) grant Hi-Fi (#200021_182060).
References (48)
- et al., 2018. Mining fix patterns for FindBugs violations. IEEE Trans. Softw. Eng.
- et al., 2016. Reviewer recommendation for pull-requests in GitHub: What can we learn from code review and bug assignment? Inf. Softw. Technol.
- et al. Building useful program analysis tools using an extensible Java compiler.
- et al., 2006. Compilers: Principles, Techniques, and Tools.
- et al., 2008. Using static analysis to find bugs. IEEE Softw.
- et al., 2019. Getafix: Learning to fix bugs automatically. Proc. ACM Program. Lang.
- et al. From quick fixes to slow fixes: Reimagining static analysis resolutions to enable design space exploration.
- et al. Phoenix: Automated data-driven synthesis of repairs for static analysis violations.
- et al. Analyzing the state of static analysis: A large-scale evaluation in open source software.
- Brito, A., Xavier, L., Hora, A., Valente, M.T., 2018. Why and how Java developers break APIs. In: 25th International...
- Practical memory leak detection using guarded value-flow analysis.
- Reconciling the past and the present: An empirical study on the application of source code transformations to automatically rejuvenate Java programs.
- How do developers fix issues and pay back technical debt in the Apache ecosystem?
- Automatic software repair: A survey. IEEE Trans. Softw. Eng.
- Statistically rigorous Java performance evaluation.
- How many of all bugs do we find? A study of static bug detectors.
- Why don’t software developers use static analysis tools to find bugs?
- Defects4J: A database of existing faults to enable controlled testing studies for Java programs.
- An in-depth study of the promises and perils of mining GitHub. Empir. Softw. Eng.
- Data Flow Analysis: Theory and Practice.
- Improving refactoring speed by 10x.
- Automatic patch generation learned from human-written patches.