research-article

Open Access

Clog: A Declarative Language for C Static Code Checkers

Authors:
Alexandru Dura
Lund University, Lund, Sweden
ORCID: 0000-0002-8420-390X

Christoph Reichenbach
Lund University, Lund, Sweden
ORCID: 0000-0003-0608-7023

CC 2024: Proceedings of the 33rd ACM SIGPLAN International Conference on Compiler Construction, February 2024, Pages 186–197
https://doi.org/10.1145/3640537.3641579

Published: 20 February 2024

ABSTRACT

We present Clog, a declarative language for describing static code checkers for C. Unlike other extensible state-of-the-art checker frameworks, Clog enables powerful interprocedural checkers without exposing the underlying program representation: Clog checkers consist of Datalog-style recursive rules that access the program under analysis via syntactic pattern matching and control flow edges only. We have implemented Clog on top of Clang, using a custom Datalog evaluation strategy that piggy-backs on Clang's AST matching facilities while working around Clang's limitations to achieve our design goal of representation independence.

Our experiments demonstrate that Clog can concisely express a wide variety of checkers for different security vulnerabilities, with performance that is similar to Clang's own analyses and highly competitive on real-world programs.
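
To make the idea of Datalog-style recursive rules over control flow edges concrete, here is a minimal sketch in Python. It is not Clog syntax and not the authors' implementation, since the abstract shows neither. The statement labels, the `freed` and `used` facts, and the use-after-free rule are all hypothetical stand-ins for what syntactic pattern matching might produce. The recursive rule is evaluated with a naive bottom-up fixpoint.

```python
def transitive_closure(edges):
    """Naive bottom-up evaluation of the recursive Datalog rules
         reach(a, b) :- edge(a, b).
         reach(a, c) :- reach(a, b), edge(b, c).
    Iterates until no new facts are derived (a fixpoint)."""
    reach = set(edges)
    while True:
        derived = {(a, c)
                   for (a, b) in reach
                   for (b2, c) in edges
                   if b == b2}
        if derived <= reach:      # nothing new: fixpoint reached
            return reach
        reach |= derived


# Hypothetical control-flow edges between statement labels.
cfg_edges = {("s1", "s2"), ("s2", "s3"), ("s3", "s4")}

# Hypothetical facts that syntactic patterns such as `free(p)` or `*p`
# could yield: (statement label, variable name).
freed = {("s2", "p")}
used = {("s1", "p"), ("s4", "p")}

reach = transitive_closure(cfg_edges)

# use_after_free(f, u, v) :- freed(f, v), used(u, v), reach(f, u).
warnings = sorted((f, u, v)
                  for (f, v) in freed
                  for (u, v2) in used
                  if v == v2 and (f, u) in reach)

print(warnings)  # [('s2', 's4', 'p')]
```

The use at `s1` is not reported because no control-flow path leads from the `free` at `s2` back to `s1`. A real checker would need a more careful rule, for example one that accounts for reassignment of `p` between the two statements.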

Index Terms

Clog: A Declarative Language for C Static Code Checkers
1. Software and its engineering
  1. Software notations and tools
    1. Context specific languages
      1. Domain specific languages
    2. General programming languages
      1. Language types
        Constraint and logic languages
  2. Software organization and properties
    1. Software functional properties
      1. Formal methods
        Automated static analysis
2. Theory of computation
  1. Design and analysis of algorithms
    1. Data structures design and analysis
      1. Pattern matching

Published in
CC 2024: Proceedings of the 33rd ACM SIGPLAN International Conference on Compiler Construction
February 2024
261 pages
ISBN: 9798400705076
DOI: 10.1145/3640537
General Chair: Gabriel Rodríguez (Universidade da Coruña, Spain)
Program Chairs: P. Sadayappan (University of Utah, USA), Aravind Sukumaran-Rajam (Meta, USA)
Copyright © 2024 Owner/Author
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
C
Datalog
Static Analysis Frameworks
Syntactic Patterns