
Using code reviews to automatically configure static analysis tools

Empirical Software Engineering

Abstract

Developers often use Static Code Analysis Tools (SCATs) to automatically detect various kinds of quality flaws in their source code. Since many warnings raised by SCATs may be irrelevant to a given project or organization, it is possible to leverage information from the project's development history to automatically configure which warnings a SCAT should raise and which it should suppress. In this paper, we propose an automated approach (Auto-SCAT) that leverages (statement-level) code review comments to recommend the SCAT warnings, or warning categories, to be enabled. To this aim, we trace code review comments onto SCAT warnings by leveraging the warnings' descriptions and messages, as well as review comments made in other projects. We apply Auto-SCAT to study how CheckStyle, a well-known SCAT, can be configured in the context of six Java open-source projects, all of which use Gerrit to handle code reviews. Our results show that Auto-SCAT classifies code review comments into CheckStyle checks with a precision of 61% and a recall of 52%; when also considering code review comments unrelated to CheckStyle warnings, Auto-SCAT achieves a precision and a recall of ≈75%. Furthermore, Auto-SCAT can configure CheckStyle with a precision of 72.7% at the check level and 96.3% at the category level. Finally, our findings highlight that Auto-SCAT outperforms state-of-the-art baselines based on default CheckStyle configurations or on the history of previously-removed warnings.
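To make the tracing step concrete, the sketch below shows one plausible way to match a review comment to CheckStyle checks by comparing it against the checks' documentation text with TF-IDF and cosine similarity, in the spirit of classic IR pipelines. This is a minimal illustration, not the authors' implementation: the check descriptions, the similarity threshold, the function name match_comment, and the use of scikit-learn are all assumptions made for the example.

```python
# Hedged sketch (NOT the paper's exact pipeline): rank CheckStyle checks by
# the textual similarity of their documentation to a review comment.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical corpus: a few CheckStyle check names with illustrative
# snippets of their documentation text.
checks = {
    "LineLength": "checks for long lines, line length exceeds the maximum",
    "MagicNumber": "checks that there are no magic numbers, numeric literals",
    "JavadocMethod": "checks the javadoc of a method or constructor",
}

def match_comment(comment, checks, threshold=0.1):
    """Return (check, score) pairs whose documentation is textually similar
    to the review comment, sorted by decreasing cosine similarity.
    The 0.1 threshold is arbitrary and chosen only for this example."""
    names = list(checks)
    corpus = [checks[n] for n in names] + [comment]
    # English stop-word removal stands in for the usual IR preprocessing;
    # the paper's pipeline may differ (e.g., it may also apply stemming).
    tfidf = TfidfVectorizer(stop_words="english").fit_transform(corpus)
    scores = cosine_similarity(tfidf[-1], tfidf[:-1]).ravel()
    ranked = sorted(zip(names, scores), key=lambda p: -p[1])
    return [(n, s) for n, s in ranked if s >= threshold]

print(match_comment("this line is too long, please wrap it", checks))
# Expected: LineLength ranks first, since it shares terms such as
# "line" and "long" with the comment; the other checks score near zero.
```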



Notes

  1. http://checkstyle.sourceforge.net

  2. https://www.gerritcodereview.com

  3. http://review.couchbase.org/#/c/21805/ line 241, author's name hidden for privacy reasons.

  4. http://checkstyle.sourceforge.net/checks.html

  5. https://www.eclipse.org/mylyn/

  6. https://wiki.eclipse.org/Development_Conventions_and_Guidelines

  7. https://checkstyle.sourceforge.io/sun_style.html

  8. https://checkstyle.sourceforge.io/google_style.html

  9. core/org.eclipse.cdt.ui/src/org/eclipse/cdt/internal/ui/refactoring/pullup/PullUpInformation.java+refs-changes-77-22177-3

  10. client/src/com/vaadin/client/widgets/Escalator.java+refs-changes-28-7628-1


Author information

Corresponding author: Fiorella Zampetti.

Additional information

Communicated by: Lin Tan

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Zampetti, F., Mudbhari, S., Arnaoudova, V. et al. Using code reviews to automatically configure static analysis tools. Empir Software Eng 27, 28 (2022). https://doi.org/10.1007/s10664-021-10076-4

