Abstract
We present FoxDec: an approach to C code decompilation that aims at producing sound and recompilable code. Formal methods are used during three phases of the decompilation process: control flow recovery, symbolic execution, and variable analysis. The use of formal methods minimizes the trusted code base and ensures soundness: the extracted C code behaves the same as the original binary. Soundness and recompilability enable C code decompilation to be used in the contexts of binary patching, binary porting, binary analysis and binary improvement, with confidence that the recompiled code’s behavior is consistent with the original program. We demonstrate that FoxDec can be used to improve execution speed by recompiling a binary with different compiler options, to patch a memory leak with a code transformation tool, and to port a binary to a different architecture. FoxDec can also be leveraged to port a binary to run as a unikernel, a minimal and secure virtual machine usually requiring source access for porting.
Acknowledgments
This work is supported in part by the US Office of Naval Research (ONR) under grants N00014-17-1-2297, N00014-16-1-2104, and N00014-18-1-2022.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Verbeek, F., Olivier, P., Ravindran, B. (2020). Sound C Code Decompilation for a Subset of x86-64 Binaries. In: de Boer, F., Cerone, A. (eds) Software Engineering and Formal Methods. SEFM 2020. Lecture Notes in Computer Science, vol 12310. Springer, Cham. https://doi.org/10.1007/978-3-030-58768-0_14
DOI: https://doi.org/10.1007/978-3-030-58768-0_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-58767-3
Online ISBN: 978-3-030-58768-0
eBook Packages: Computer Science, Computer Science (R0)