Abstract
There is increasing interest in applying verification tools to programs that use bitvector operations. SMT solvers, which serve as a foundation for these tools, have accordingly expanded their support for bitvector reasoning through bit-blasting and linear arithmetic approximations.
In this paper we show that a similar linear arithmetic approximation of bitvector operations can be performed at the source level through program transformations. Specifically, we introduce new paths that over-approximate bitvector operations with linear conditions and constraints, increasing branching but allowing us to better exploit the well-developed integer reasoning and interpolation of verification tools. We show that, for reachability of bitvector programs, the increased branching incurs negligible overhead yet, when combined with integer interpolation optimizations, enables more programs to be verified. We further show that exploiting integer interpolation in the common case also enables competitive termination verification of bitvector programs and leads to the first effective technique for linear temporal logic (LTL) verification of bitvector programs. Finally, we provide an in-depth case study of decompiled (“lifted”) binary programs, which emulate x86 execution through frequent use of bitvector operations. We present a new tool, DarkSea, the first tool capable of verifying reachability, termination, and LTL properties of lifted binaries.
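To make the idea of a linear over-approximation concrete, the following is a minimal sketch in Python, not the paper's actual rule set or DarkSea's implementation. It assumes one illustrative rule: for `r = a & b` with both operands non-negative, the result satisfies the linear constraints `0 <= r`, `r <= a`, and `r <= b`; every other case is left unconstrained. The function names `linear_and_constraint` and `is_sound` are hypothetical.

```python
def linear_and_constraint(a: int, b: int, r: int) -> bool:
    """Return True if r is an admissible value for a & b under the
    (assumed, illustrative) linear over-approximation."""
    if a >= 0 and b >= 0:
        # Only linear inequalities over integers, no bit-level reasoning.
        return 0 <= r and r <= a and r <= b
    # Other sign combinations are left unconstrained in this sketch.
    return True


def is_sound(lo: int, hi: int) -> bool:
    """Exhaustively check on [lo, hi) that the true bitwise result is
    always admitted by the linear constraints (over-approximation)."""
    return all(
        linear_and_constraint(a, b, a & b)
        for a in range(lo, hi)
        for b in range(lo, hi)
    )


if __name__ == "__main__":
    print(is_sound(-16, 64))               # the real result is never excluded
    print(linear_and_constraint(6, 5, 3))  # 3 is admitted although 6 & 5 == 4
```

The second call shows the price of the approximation: the linear constraints admit values that the concrete operation never produces, so the transformed program has more behaviors than the original, which is acceptable for proving properties but not for refuting them.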
Acknowledgments
We thank the anonymous reviewers for their helpful feedback. This work is supported by ONR Grant #N00014-17-1-2787.
A Proof of Theorem 1
Proof
The proof is by induction on traces: we first show equality for the expression translation \(T_E\) by structural induction on expressions, and then inclusion for the statement translation \(T_S\). First, we show that \(T_E\) preserves trace equivalence. We proceed by structural induction on \(e\), with the base cases being constants, variables, etc. In the inductive case, for a bitvector operation \(e_1 \otimes e_2\), assume that \(e_1,e_2\) have (potentially) been transformed to \(e_1',e_2'\), respectively, and that Lemma 1 holds for each \(i\in \{1,2\}\): \(\forall \sigma . [\![ e_i ]\!]\sigma =[\![ e_i' ]\!]\sigma \). Since \(\otimes \) is deterministic, \([\![ e_1'\otimes e_2' ]\!]\sigma = [\![ e_1\otimes e_2 ]\!]\sigma \). Finally, applying the transformation to \(\otimes \), we obtain \([\![ T_E\{e_1'\otimes e_2'\} ]\!] = [\![ e_1'\otimes e_2' ]\!]\), again by Lemma 1. Next, for each statement \(s\) or relational condition \(c\), we prove that \(T_S\) preserves trace inclusion: that \([\![ s ]\!] \subseteq [\![ T_S\{s\} ]\!]\) or that \([\![ c ]\!] \subseteq [\![ T_S\{c\} ]\!]\). We do not recursively weaken conditional boolean expressions, since doing so would require alternating strengthening and weakening. Thus, inclusion follows directly from Lemma 1.
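Written pointwise in a state \(\sigma \), the inductive step for expressions can be summarized as the chain below. Applying the final equality of the proof to a specific \(\sigma \) is our reading of the argument, not notation taken from the paper.

\[
[\![ T_E\{e_1'\otimes e_2'\} ]\!]\sigma \;=\; [\![ e_1'\otimes e_2' ]\!]\sigma \;=\; [\![ e_1\otimes e_2 ]\!]\sigma
\]

The first equality is Lemma 1 applied to the transformed operator \(\otimes \); the second follows from the inductive hypothesis on \(e_1, e_2\) together with the determinism of \(\otimes \).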
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Liu, Y.C. et al. (2021). Proving LTL Properties of Bitvector Programs and Decompiled Binaries. In: Oh, H. (eds) Programming Languages and Systems. APLAS 2021. Lecture Notes in Computer Science, vol 13008. Springer, Cham. https://doi.org/10.1007/978-3-030-89051-3_16
DOI: https://doi.org/10.1007/978-3-030-89051-3_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-89050-6
Online ISBN: 978-3-030-89051-3
eBook Packages: Computer Science, Computer Science (R0)