ABSTRACT
Instrumentation is vital to fuzzing: it guides the fuzzer and helps detect covert bugs, yet its overhead greatly reduces fuzzing throughput. To reduce the overhead, compilers either compromise instrumentation correctness for better optimization or rely on convoluted runtime support to remove unused probes during fuzzing.
In this paper, we propose Odin, an on-demand instrumentation framework that instruments C/C++ programs correctly and flexibly. When the instrumentation requirements change during fuzzing, Odin first locates the changed code fragment, then re-instruments, re-optimizes, and re-compiles that small fragment on the fly. Consequently, the runtime overhead of unused probes is reduced at the cost of a minuscule compilation overhead. Odin's architecture ensures correct instrumentation, optimized code generation, and low recompilation latency. Experiments show that Odin delivers the performance of compiler-based static instrumentation while retaining the flexibility of binary-based dynamic instrumentation. When applied to coverage instrumentation, Odin reduces the coverage collection overhead by 3× and 17× compared to LLVM SanitizerCoverage and DynamoRIO, respectively.
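The on-demand workflow described above can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not Odin's implementation: the names `Fragment`, `OnDemandInstrumenter`, `drop_probes`, and `demo` are hypothetical, a "fragment" is assumed to be a single function, and "recompilation" is modeled as recomputing the set of probes that remain in that function. No real compiler is invoked.

```python
from dataclasses import dataclass, field


@dataclass
class Fragment:
    """Hypothetical unit of recompilation: one function and its basic blocks."""
    name: str
    blocks: list
    active_probes: set = field(default_factory=set)
    compile_count: int = 0

    def recompile(self, wanted):
        # Stand-in for "re-instrument, re-optimize, re-compile": only the
        # probes that are still wanted are emitted into this fragment.
        self.active_probes = set(self.blocks) & wanted
        self.compile_count += 1


class OnDemandInstrumenter:
    """Toy model: track wanted probes and rebuild only affected fragments."""

    def __init__(self, fragments):
        self.fragments = {f.name: f for f in fragments}
        # Map each basic block to the fragment that contains it.
        self.owner = {b: f.name for f in fragments for b in f.blocks}
        self.wanted = set(self.owner)
        for f in fragments:
            f.recompile(self.wanted)  # initial full build

    def run(self, executed_blocks):
        """Simulate one execution and return the probes that fired."""
        return {
            b for b in executed_blocks
            if b in self.fragments[self.owner[b]].active_probes
        }

    def drop_probes(self, probes):
        """Requirement change: the given probes are no longer needed."""
        self.wanted -= probes
        # Locate the changed fragments, then rebuild only those.
        dirty = {self.owner[b] for b in probes}
        for name in dirty:
            self.fragments[name].recompile(self.wanted)
        return dirty


def demo():
    inst = OnDemandInstrumenter([
        Fragment("parse", ["p0", "p1", "p2"]),
        Fragment("decode", ["d0", "d1"]),
        Fragment("report", ["r0"]),
    ])
    trace = ["p0", "p1", "d0"]
    fired = inst.run(trace)          # coverage probes hit on the first run
    dirty = inst.drop_probes(fired)  # covered probes are now unused
    refired = inst.run(trace)        # same input, probes already removed
    return fired, dirty, refired, inst
```

In this sketch, replaying the same input after `drop_probes` triggers no probes, and the `report` fragment is never rebuilt because none of its probes changed. That mirrors the idea that only the small affected fragment pays the recompilation cost.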
Index Terms
- Odin: on-demand instrumentation with on-the-fly recompilation