An empirical assessment of machine learning approaches for triaging reports of static analysis tools

Empirical Software Engineering

Abstract

Despite their ability to detect critical bugs in software, static analysis tools’ high false positive rates are a key barrier to their adoption in real-world settings. To improve the usability of these tools, researchers have recently begun to apply machine learning techniques to classify and filter incorrect analysis reports. Although initial results have been promising, the long-term potential and best practices for this line of research are unclear due to the lack of detailed, large-scale empirical evaluation. To partially address this knowledge gap, we present a comparative empirical study of three machine learning techniques—traditional models, recurrent neural networks (RNNs), and graph neural networks (GNNs)—for classifying correct and incorrect results in three static analysis tools—FindSecBugs, CBMC, and JBMC—using multiple datasets. These tools represent different techniques of static analysis, namely taint analysis and model-checking. We also introduce and evaluate new data preparation routines for RNNs and node representations for GNNs. We find that overall classification accuracy reaches a high of 80%–99% for different datasets and application scenarios. We observe that data preparation routines have a positive impact on classification accuracy, with an improvement of up to 5% for RNNs and 16% for GNNs. Overall, our results suggest that neural networks (RNNs or GNNs) that learn over a program’s source code outperform traditional models, although interesting tradeoffs are present among all techniques. Our observations provide insight into the future research needed to speed the adoption of machine learning approaches for static analysis tools in practice.

Data Availability

Five datasets were used in this study. Two of them (CBMC and JBMC) were created from the SV-COMP dataset, available at https://github.com/sosy-lab/sv-benchmarks. Instructions for using the SV-COMP dataset, the other three datasets (OWASP, ICST-Rand, and ICST-PW), and the experimental infrastructure of this study are available in the replication repository at https://bitbucket.org/SaiArrow/emse-replication-package/.

Notes

  1. This was one of 24 configurations of JBMC that produced the exact same distribution of correct/incorrect results on our dataset (Section 4).

  2. Two events A and B are statistically independent iff P(AB) = P(A)P(B).

  3. p(x|y) = (p(x)p(y|x))/p(y)

  4. Joana uses the intermediate representation from the T.J. Watson Libraries for Analysis (WALA) (IBM 2006).

  5. The count-based composite features include the number of variables, ifs, loops, functions defined, functions called, loads, and stores.

  6. A property of the representation which requires that each variable is assigned exactly once, and every variable is defined before it is used (Rosen et al. 1988).

  7. [-1] here refers to an array with a single -1 element (if N = 3 and k = 1, then [-1, -1]).
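
Notes 2 and 3 can be checked numerically. The sketch below is an illustration only: the joint distribution, the event meanings, and all function names are hypothetical and are not taken from the study. It tests the independence condition P(AB) = P(A)P(B) and computes a posterior with Bayes' rule, p(x|y) = p(x)p(y|x)/p(y), using exact fractions to avoid floating-point noise.

```python
from fractions import Fraction

# Hypothetical joint distribution over two binary events (illustrative
# numbers only, not data from the study):
#   A = "the report is a true positive"
#   B = "the flagged code contains a loop"
JOINT = {
    (True, True): Fraction(3, 10),
    (True, False): Fraction(1, 10),
    (False, True): Fraction(2, 10),
    (False, False): Fraction(4, 10),
}


def p_a(a=True):
    """Marginal probability P(A = a), summing the joint over B."""
    return sum(p for (x, _), p in JOINT.items() if x == a)


def p_b(b=True):
    """Marginal probability P(B = b), summing the joint over A."""
    return sum(p for (_, y), p in JOINT.items() if y == b)


def independent():
    """Note 2: A and B are independent iff P(AB) = P(A)P(B)."""
    return JOINT[(True, True)] == p_a() * p_b()


def p_a_given_b():
    """Note 3 (Bayes' rule): P(A|B) = P(A) * P(B|A) / P(B)."""
    p_b_given_a = JOINT[(True, True)] / p_a()
    return p_a() * p_b_given_a / p_b()


if __name__ == "__main__":
    print("P(A) =", p_a(), " P(B) =", p_b())
    print("independent:", independent())
    print("P(A|B) =", p_a_given_b())
```

With these made-up numbers, P(A)P(B) = 1/5 differs from P(AB) = 3/10, so the two events are dependent, and Bayes' rule gives P(A|B) = 3/5, which matches the direct ratio P(AB)/P(B).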

References

  • Abadi M, Agarwal A, Barham P, Brevdo E, Chen Z, Citro C, Corrado G S, Davis A, Dean J, Devin M, Ghemawat S, Goodfellow I, Harp A, Irving G, Isard M, Jia Y, Jozefowicz R, Kaiser L, Kudlur M, Levenberg J, Mané D, Monga R, Moore S, Murray D, Olah C, Schuster M, Shlens J, Steiner B, Sutskever I, Talwar K, Tucker P, Vanhoucke V, Vasudevan V, Viégas F, Vinyals O, Warden P, Wattenberg M, Wicke M, Yu Y, Zheng X (2015) Tensorflow: large-scale machine learning on heterogeneous systems. https://www.tensorflow.org/ Software available from tensorflow.org

  • Allamanis M, Barr ET, Bird C, Sutton C (2015) Suggesting accurate method and class names. In: Proceedings of the 2015 10th joint meeting on foundations of software engineering (ESEC/FSE 2015). ACM, New York, pp 38–49, DOI https://doi.org/10.1145/2786805.2786849

  • Allamanis M, Brockschmidt M, Khademi M (2017) Learning to represent programs with graphs. arXiv:1711.00740 [cs]

  • Allamanis M, Barr ET, Devanbu P, Sutton C (2018) A survey of machine learning for big code and naturalness. ACM Comput Surv 51(4):Article 81, 37 pp. https://doi.org/10.1145/3212695

  • Alon U, Zilberstein M, Levy O, Yahav E (2019) code2vec: learning distributed representations of code. In: Proceedings of the ACM on programming languages 3, POPL, pp 1–29

  • Andres M (2013) Free chat-server: a chatserver written in Java. https://sourceforge.net/projects/freecs

  • Apollo (2018) Apollo: a distributed configuration center. https://github.com/ctripcorp/apollo

  • Arteau Ph, Formánek D, Polešovský T (2018) Find security bugs, version 1.4.6. http://find-sec-bugs.github.io, Accessed: 2022-07-19

  • AutoML (2022) AutoML. https://www.automl.org/automl

  • Beyer D (2018) Results of the competition. https://sv-comp.sosylab.org/2018/results/results-verified/. Accessed: 2021-04-22

  • Beyer D (2019) Results of the competition. https://sv-comp.sosylab.org/2019/results/results-verified/, Accessed: 2021-04-22

  • Biere A, Cimatti A, Clarke E, Zhu Y (1999) Symbolic model checking without BDDs. In: Cleaveland WR (ed) Tools and algorithms for the construction and analysis of systems. Springer, Berlin, pp 193–207

  • Blackburn SM, Garner R, Hoffmann C, Khang AM, McKinley KS, Bentzur R, Diwan A, Feinberg D, Frampton D, Guyer SZ, Hirzel M, Hosking A, Jump M, Lee H, Moss JEB, Phansalkar A, Stefanovic D, VanDrunen T, von Dincklage D, Wiedermann B (2006) The DaCapo benchmarks: Java benchmarking development and analysis. In: Proceedings of the 21st annual ACM SIGPLAN conference on object-oriented programming systems, languages, and applications (OOPSLA ’06). ACM, New York, pp 169–190, DOI https://doi.org/10.1145/1167473.1167488

  • Block, Inc (2022) Okhttp: an HTTP & HTTP/2 client for Android and Java applications. http://square.github.io/okhttp

  • Bravenboer M, Smaragdakis Y (2009) Strictly declarative specification of sophisticated points-to analyses. SIGPLAN Not 44(10):243–262. https://doi.org/10.1145/1639949.1640108

  • Burato E, Ferrara P, Spoto F (2017) Security analysis of the OWASP benchmark with Julia. In: Proceedings of ITASEC17, the first Italian conference on security, Venice, Italy

  • Carrier P-L, Cho K (2018) LSTM networks for sentiment analysis: deeplearning 0.1 documentation. http://deeplearning.net/tutorial/lstm.html

  • Chen Z, Monperrus M (2019) A literature study of embeddings on source code. arXiv:1904.03061

  • Cho K, van Merrienboer B, Bahdanau D, Bengio Y (2014) On the properties of neural machine translation: encoder-decoder approaches. https://doi.org/10.48550/ARXIV.1409.1259

  • Clarke E, Kroening D, Lerda F (2004) A tool for checking ANSI-C programs. In: Jensen K, Podelski A (eds) Tools and algorithms for the construction and analysis of systems (TACAS 2004) (Lecture Notes in Computer Science), vol 2988. Springer, pp 168–176

  • Cordeiro L, Kesseli P, Kroening D, Schrammel P, Marek T (2018) JBMC: a bounded model checking tool for verifying java bytecode. In: Computer aided verification (CAV) (LNCS), vol 10981. Springer International Publishing, Cham, pp 183–190

  • Dam HK, Tran T, Pham TTM (2016) A deep language model for software code. In: FSE 2016: proceedings of the foundations software engineering international symposium, pp 1–4

  • Diamantopoulos T (2020) ASTExtractor v0.5. https://github.com/thdiaman/ASTExtractor

  • Eclipse Foundation (2022a) Eclipse java integrated development environment. https://www.eclipse.org/ide/

  • Eclipse Foundation (2022b) Jetty: lightweight highly scalable java based web server and servlet engine. https://www.eclipse.org/jetty

  • Frank E, Hall MA, Witten IH (2016) The WEKA workbench. Morgan Kaufmann

  • Feng Z, Guo D, Tang D, Duan N, Feng X, Gong M, Shou L, Qin B, Liu T, Jiang D, Zhou M (2020) CodeBERT: a pre-trained model for programming and natural languages. arXiv:cs.CL/2002.08155

  • Ferrante J, Ottenstein KJ, Warren JD (1987) The program dependence graph and its use in optimization. ACM Trans Program Lang Syst 9(3):319–349. https://doi.org/10.1145/24039.24041

  • Fowkes J, Sutton C (2016) Parameter-free probabilistic API mining across GitHub. In: Proceedings of the 2016 24th ACM SIGSOFT international symposium on foundations of software engineering (FSE 2016). ACM, New York, pp 254–265, DOI https://doi.org/10.1145/2950290.2950319

  • Gers FA, Schmidhuber J, Cummins F (2000) Learning to forget: continual prediction with LSTM. Neural Comput 12(10):2451–2471

  • Giraph (2020) Giraph: large-scale graph processing on Hadoop. http://giraph.apache.org

  • Goldberg Y (2017) Neural network methods for natural language processing. Synth Lect Hum Lang Technol 10(1):1–309

  • Goldberg Y, Levy O (2014) word2vec Explained: deriving Mikolov et al.’s negative-sampling word-embedding method. arXiv:1402.3722

  • Gori M, Monfardini G, Scarselli F (2005) A new model for learning in graph domains. In: 2005 IEEE International joint conference on neural networks, 2005. IJCNN’05. Proceedings, vol 2. IEEE, pp 729–734

  • Gros D, Sezhiyan H, Devanbu P, Yu Z (2020) Code to comment “translation”: data, metrics, baselining & evaluation. arXiv:cs.SE/2010.01410

  • Gu X, Zhang H, Zhang D, Kim S (2016) Deep API learning. In: Proceedings of the 2016 24th ACM SIGSOFT International symposium on foundations of software engineering. ACM, pp 631–642

  • h2db (2022) H2 database engine. http://www.h2database.com

  • Haque S, LeClair A, Wu L, McMillan C (2020) Improved automatic summarization of subroutines via attention to file context. In: Proceedings of the 17th international conference on mining software repositories, DOI https://doi.org/10.1145/3379597.3387449

  • Heckman SS (2007) Adaptive probabilistic model for ranking code-based static analysis alerts. In: 29th International conference on software engineering—companion. ICSE 2007 companion, pp 89–90, DOI https://doi.org/10.1109/ICSECOMPANION.2007.16

  • Heckman SS (2009) A systematic model building process for predicting actionable static analysis alerts. North Carolina State University

  • Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9(8):1735–1780

  • IBM (2006) T. J. Watson Libraries for Analysis (WALA). http://wala.sourceforge.net/

  • Joda.org (2021) Joda-Time a quality replacement for the Java date and time classes. http://www.joda.org/joda-time

  • Johnson A, Waye L, Moore S, Chong S (2015) Exploring and enforcing security guarantees via program dependence graphs. In: Proceedings of the 36th ACM SIGPLAN conference on programming language design and implementation (PLDI ’15). ACM, New York, pp 291–302, DOI https://doi.org/10.1145/2737924.2737957

  • Johnson B, Song Y, Murphy-Hill E, Bowdidge R (2013) Why don’t software developers use static analysis tools to find bugs? In: Proceedings of the 2013 international conference on software engineering (ICSE ’13). IEEE Press, Piscataway, pp 672–681. http://dl.acm.org/citation.cfm?id=2486788.2486877

  • Jozefowicz R, Zaremba W, Sutskever I (2015) An empirical exploration of recurrent network architectures. In: Proceedings of the 32nd international conference on international conference on machine learning—volume 37 (ICML’15). JMLR.org, pp 2342–2350

  • Kang HJ, Aw KL, Lo D (2022) Detecting false alarms from automatic static analysis tools: how far are we? In: Proceedings of the 44th international conference on software engineering (ICSE ’22). Association for Computing Machinery, New York, pp 698–709, DOI https://doi.org/10.1145/3510003.3510214

  • Kharkar A, Moghaddam RZ, Jin M, Liu X, Shi X, Clement C, Sundaresan N (2022) Learning to reduce false positives in analytic bug detectors. In: Proceedings of the 44th international conference on software engineering. ACM, DOI https://doi.org/10.1145/3510003.3510153

  • Kingma DP, Ba J (2014) Adam: a method for stochastic optimization. https://doi.org/10.48550/ARXIV.1412.6980

  • Koc U, Saadatpanah P, Foster JS, Porter AA (2017) Learning a classifier for false positive error reports emitted by static code analysis tools. In: Proceedings of the 1st ACM SIGPLAN international workshop on machine learning and programming languages (MAPL 2017). ACM, New York, pp 35–42, DOI https://doi.org/10.1145/3088525.3088675

  • Koc U, Wei S, Foster JS, Carpuat M, Porter AA (2019) An empirical assessment of machine learning approaches for triaging reports of a java static analysis tool. In: 2019 12th IEEE Conference on Software Testing, Validation and Verification (ICST), pp 288–299, DOI https://doi.org/10.1109/ICST.2019.00036

  • Koc U, Mordahl A, Wei S, Foster JS, Porter A (2021) SATune: study-driven auto-tuning approach for configurable software verification tools. In: Proceedings of the 36th IEEE/ACM international conference on automated software engineering (ASE 2021). ACM

  • Kroening D, Tautschnig M (2014) CBMC—C bounded model checker. In: Ábrahám E, Havelund K (eds) Tools and algorithms for the construction and analysis of systems. Springer, Berlin, pp 389–391

  • Kushman N, Barzilay R (2013) Using semantic unification to generate regular expressions from natural language. In: Proceedings of the 2013 conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp 826–836

  • Li H, Kim S, Chandra S (2019) Neural code search evaluation dataset. arXiv:cs.SE/1908.09804

  • Li Y, Tarlow D, Brockschmidt M, Zemel R (2015a) Gated graph sequence neural networks. https://doi.org/10.48550/ARXIV.1511.05493

  • Li Y, Tarlow D, Brockschmidt M, Zemel R (2015b) Gated graph sequence neural networks. arXiv:1511.05493

  • Ling W, Blunsom P, Grefenstette E, Hermann KM, Kociskỳ T, Wang F, Senior A (2016) Latent predictor networks for code generation. In: Proceedings of the 54th annual meeting of the association for computational linguistics (volume 1: long papers), vol 1, pp 599–609

  • LLVM Team (2020) The LLVM compiler infrastructure. https://github.com/llvm/llvm-project.git

  • Lukins SK, Kraft NA, Etzkorn LH (2010) Bug localization using latent Dirichlet allocation. Inf Softw Technol 52(9):972–990. https://doi.org/10.1016/j.infsof.2010.04.002

  • Mandic DP, Chambers J (2001) Recurrent neural networks for prediction: learning algorithms architectures and stability. Wiley, New York

  • Microsoft (2019) Microsoft gated graph neural networks. https://github.com/Microsoft/gated-graph-neural-network-samples

  • Mikolov T, Chen K, Corrado G, Dean J, Sutskever I, Zweig G (2013) word2vec. https://code.google.com/p/word2vec

  • Mohr M, Hecker M, Bischof S, Bechberger J (2021) JOANA (Java Object-sensitive ANALysis)—information flow control framework for java. https://pp.ipd.kit.edu/projects/joana

  • MyBatis (2021) MyBatis: SQL mapper framework for Java. http://www.mybatis.org/mybatis-3

  • Naik M (2020) Petablox: large-scale software analysis and analytics using datalog. Technical Report. Georgia Tech Research Inst Atlanta Atlanta United States

  • NASA Ames Research Center (2022) Java pathfinder. https://github.com/javapathfinder

  • Nguyen TT, Nguyen AT, Nguyen HA, Nguyen TN (2013) A statistical semantic language model for source code. In: Proceedings of the 2013 9th joint meeting on foundations of software engineering (ESEC/FSE 2013). ACM, New York, pp 532–542, DOI https://doi.org/10.1145/2491411.2491458

  • Nie C, Leung H (2011) A survey of combinatorial testing. ACM Comput Surv 43(2):Article 11, 29 pp. https://doi.org/10.1145/1883612.1883618

  • OWASP (2014) The OWASP Benchmark for Security Automation, version 1.1. https://www.owasp.org/index.php/Benchmark. Accessed: 2018-01-04

  • Panthaplackel S, Nie P, Gligoric M, Li JJ, Mooney RJ (2020) Learning to update natural language comments based on code changes. arXiv:cs.CL/2004.12169

  • Prlić A, Yates A, Bliven SE, Rose PW, Jacobsen J, Troshin PV, Chapman M, Gao J, Koh CH, Foisy S et al (2012) Biojava: an open-source framework for bioinformatics in 2012. Bioinformatics 28(20):2693–2695

  • Quinlan J R (2014) C4.5: programs for machine learning. Elsevier

  • Raghothaman M, Kulkarni S, Heo K, Naik M (2018) User-guided program reasoning using bayesian inference. In: Proceedings of the 39th ACM SIGPLAN conference on programming language design and implementation (PLDI 2018). ACM, New York, pp 722–735, DOI https://doi.org/10.1145/3192366.3192417

  • Raychev V, Vechev M, Yahav E (2014) Code completion with statistical language models. In: Proceedings of the 35th ACM SIGPLAN conference on programming language design and implementation (PLDI ’14). ACM, New York, pp 419–428, DOI https://doi.org/10.1145/2594291.2594321

  • Raychev V, Vechev M, Krause A (2015) Predicting program properties from “Big code”. In: Proceedings of the 42nd annual ACM SIGPLAN-SIGACT symposium on principles of programming languages (POPL ’15). ACM, New York, pp 111–124, DOI https://doi.org/10.1145/2676726.2677009

  • Rish I, et al. (2001) An empirical study of the naive Bayes classifier. In: IJCAI 2001 Workshop on empirical methods in artificial intelligence, vol 3, pp 41–46

  • Rosen BK, Wegman MN, Zadeck FK (1988) Global value numbers and redundant computations. In: Proceedings of the 15th ACM SIGPLAN-SIGACT symposium on principles of programming languages (POPL ’88). ACM, New York, pp 12–27, DOI https://doi.org/10.1145/73560.73562

  • Rosenblatt F (1958) The perceptron: a probabilistic model for information storage and organization in the brain. Psychol Rev 65(6):386

  • Russell SJ, Norvig P (2016) Artificial intelligence: a modern approach. Pearson Education Limited, Malaysia

  • Safavian S R, Landgrebe D (1991) A survey of decision tree classifier methodology. IEEE Trans Syst Man Cybern 21(3):660–674. https://doi.org/10.1109/21.97458

  • Sak H, Senior A, Beaufays F (2014) Long short-term memory recurrent neural network architectures for large scale acoustic modeling. In: Fifteenth annual conference of the international speech communication association

  • Scarselli F, Gori M, Tsoi A C, Hagenbuchner M, Monfardini G (2009) The graph neural network model. IEEE Trans Neural Netw 20(1):61–80. https://doi.org/10.1109/TNN.2008.2005605

  • Smith A (2019) Universal password manager. http://upm.sourceforge.net

  • Sureka A, Jalote P (2010) Detecting duplicate bug report using character N-Gram-Based features. In: 2010 Asia pacific software engineering conference, pp 366–374, DOI https://doi.org/10.1109/APSEC.2010.49

  • Susi.ai (2018) api.susi.ai—software and rules for personal assistants. http://susi.ai

  • Tanwar A, Sundaresan K, Ashwath P, Ganesan P, Chandrasekaran SK, Ravi S (2020) Predicting vulnerability in large codebases with deep code representation. https://doi.org/10.48550/ARXIV.2004.12783

  • The Apache Software Foundation (2022) Apache Jackrabbit is a fully conforming implementation of the Content Repository for Java Technology API. http://jackrabbit.apache.org

  • The Clang Team (2021) Clang 12 documentation. https://releases.llvm.org/12.0.0/tools/clang/docs/index.html

  • The HSQL Development Group (2021) HyperSQL DataBase. http://hsqldb.org

  • Theano Development Team (2016) Theano: a python framework for fast computation of mathematical expressions. arXiv:1605.02688

  • Thunes C (2020) javalang: pure Python Java parser and tools. https://pypi.org/project/javalang/. Accessed: 2022-02-13

  • Tripp O, Guarnieri S, Pistoia M, Aravkin A (2014) ALETHEIA: improving the usability of static security analysis. In: Proceedings of the 2014 ACM SIGSAC conference on computer and communications security (CCS ’14). ACM, New York, pp 762–774, DOI https://doi.org/10.1145/2660267.2660339

  • Tu Z, Su Z, Devanbu P (2014) On the localness of software. In: Proceedings of the 22nd ACM SIGSOFT international symposium on foundations of software engineering (FSE 2014). ACM, New York, pp 269–280, DOI https://doi.org/10.1145/2635868.2635875

  • Utture A, Liu S, Kalhauge CG, Palsberg J (2022) Striking a balance: pruning false-positives from static call graphs. In: Proceedings of the 44th international conference on software engineering (ICSE ’22). Association for Computing Machinery, New York, pp 2043–2055, DOI https://doi.org/10.1145/3510003.3510166

  • Wan Y, Shu J, Sui Y, Xu G, Zhao Z, Wu J, Yu PS (2019) Multi-modal attention network learning for semantic source code retrieval. arXiv:cs.SE/1909.13516

  • Wang J, Wang S, Wang Q (2018) Is there a “golden” feature set for static warning identification?: an experimental evaluation. In: Proceedings of the 12th ACM/IEEE international symposium on empirical software engineering and measurement (ESEM ’18). ACM, New York, p Article 17, 10 pp, DOI https://doi.org/10.1145/3239235.3239523

  • Wang W, Zhang Y, Zeng Z, Xu G (2020) Trans^3: a transformer-based framework for unifying code summarization and code search. arXiv:cs.SE/2003.03238

  • Weiser M (1981) Program slicing. In: Proceedings of the 5th international conference on software engineering. IEEE Press, pp 439–449

  • White M, Tufano M, Vendome C, Poshyvanyk D (2016) Deep learning code fragments for code clone detection. In: Proceedings of the 31st IEEE/ACM international conference on automated software engineering (ASE 2016). ACM, New York, pp 87–98, DOI https://doi.org/10.1145/2970276.2970326

  • Xypolytos A, Xu H, Vieira B, Ali-Eldin AMT (2017) A framework for combining and ranking static analysis tool findings based on tool performance statistics. In: 2017 IEEE International conference on software quality, reliability and security companion (QRS-c). IEEE, pp 595–596

  • Ye X, Shen H, Ma X, Bunescu R, Liu C (2016) From word embeddings to document similarities for improved information retrieval in software engineering. In: Proceedings of the 38th international conference on software engineering (ICSE ’16). ACM, New York, pp 404–415, DOI https://doi.org/10.1145/2884781.2884862

  • Yüksel U, Sözer H (2013) Automated classification of static code analysis alerts: a case study. In: 2013 IEEE International conference on software maintenance, pp 532–535

  • Zeiler MD (2012) ADADELTA: an adaptive learning rate method. arXiv:1212.5701

  • Zhou Y, Liu S, Siow J, Du X, Liu Y (2019) Devign: effective vulnerability identification by learning comprehensive program semantics via graph neural networks. https://doi.org/10.48550/ARXIV.1909.03496

Acknowledgments

This work was partly supported by NSF grants CCF-2007314, CCF-2008905, and CCF-2047682, the NSF graduate research fellowship program, and Eugene McDermott Graduate Fellowship 202006.

Author information

Corresponding author

Correspondence to Sai Yerramreddy.

Ethics declarations

Financial Interests

Dr. Ugur Koc is currently employed by Amazon. Dr. Adam A. Porter receives a salary from association Fraunhofer USA CMA, where he is the Executive and Scientific Director.

Additional information

Communicated by: Andrea De Lucia

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Sai Yerramreddy and Austin Mordahl contributed equally to this research.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Yerramreddy, S., Mordahl, A., Koc, U. et al. An empirical assessment of machine learning approaches for triaging reports of static analysis tools. Empir Software Eng 28, 28 (2023). https://doi.org/10.1007/s10664-022-10253-z

  • DOI: https://doi.org/10.1007/s10664-022-10253-z
