Towards Automating Code-Reuse Attacks Using Synthesized Gadget Chains

  • Conference paper
Computer Security – ESORICS 2021 (ESORICS 2021)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 12972)

Abstract

In the arms race between binary exploitation techniques and mitigation schemes, code-reuse attacks have proven indispensable. Typically, one of the initial hurdles is that an attacker cannot execute their own code due to countermeasures such as data execution prevention (DEP, W^X). While code reuse is powerful, the task of finding and correctly chaining gadgets remains cumbersome. Although various methods automating this task have been proposed, they either rely on hard-coded heuristics or make specific assumptions about the gadgets’ semantics. This not only drastically limits the search space but also sacrifices their capability to find valid chains unless specific gadgets can be located. As a result, they often produce no chain or an incorrect chain that crashes the program. In this paper, we present SGC, the first generic approach to identify gadget chains in an automated manner without imposing restrictions on the gadgets or limiting its applicability to specific exploitation scenarios. Instead of using heuristics to find a gadget chain, we offload this task to an SMT solver. More specifically, we build a logical formula that encodes the CPU and memory state at the time when the attacker can divert execution flow to the gadget chain, as well as the attacker’s desired program state that the gadget chain should construct. In combination with a logical encoding of the data flow between gadgets, we query an SMT solver whether a valid gadget chain exists. If successful, the solver provides a proof of existence in the form of a synthesized gadget chain. This way, we remain fully flexible with respect to the gadgets. In empirical tests, we find that the solver often uses all types of control-flow transfer instructions and even gadgets with side effects. Our evaluation shows that SGC successfully finds working gadget chains for real-world exploitation scenarios within minutes, even when all state-of-the-art approaches fail.

References

  1. Abadi, M., Budiu, M., Erlingsson, U., Ligatti, J.: Control-flow integrity principles, implementations, and applications. ACM Trans. Inf. Syst. Secur. (TISSEC) 13(1) (2009)

  2. angr team: angrop. https://github.com/angr/angrop

  3. Bletsch, T., Jiang, X., Freeh, V.W., Liang, Z.: Jump-oriented programming: a new class of code-reuse attack. In: ACM Conference on Computer and Communications Security (CCS) (2011)

  4. Carlini, N., Wagner, D.: ROP is still dangerous: breaking modern defenses. In: USENIX Security Symposium (2014)

  5. CEA IT Security: Miasm - reverse engineering framework. https://github.com/cea-sec/miasm

  6. Checkoway, S., Davi, L., Dmitrienko, A., Sadeghi, A.R., Shacham, H., Winandy, M.: Return-oriented programming without returns. In: ACM Conference on Computer and Communications Security (CCS) (2010)

  7. Cytron, R., Ferrante, J., Rosen, B.K., Wegman, M.N., Zadeck, F.K.: An efficient method of computing static single assignment form. In: ACM Symposium on Principles of Programming Languages (POPL) (1989)

  8. Follner, A., et al.: PSHAPE: automatically combining gadgets for arbitrary method execution. In: Security and Trust Management Workshop (2016)

  9. Göktas, E., Athanasopoulos, E., Bos, H., Portokalidis, G.: Out of control: overcoming control-flow integrity. In: IEEE Symposium on Security and Privacy (2014)

  10. Hu, H., Shinde, S., Adrian, S., Chua, Z.L., Saxena, P., Liang, Z.: Data-oriented programming: on the expressiveness of non-control data attacks. In: IEEE Symposium on Security and Privacy (2016)

  11. Ispoglou, K.K., AlBassam, B., Jaeger, T., Payer, M.: Block-oriented programming: automating data-only attacks. In: ACM Conference on Computer and Communications Security (CCS) (2018)

  12. Kelley, S.: dnsmasq. https://thekelleys.org.uk/dnsmasq/doc.html

  13. Kornau, T.: Return-Oriented Programming for the ARM Architecture. Master’s thesis, Ruhr-Universität Bochum (2010)

  14. Krahmer, S.: x86-64 buffer overflow exploits and the borrowed code chunks exploitation technique (2005)

  15. Kroening, D., Strichman, O.: Decision Procedures. Springer, Cham (2016). https://doi.org/10.1007/978-3-540-74105-3

  16. Matz, M., Hubicka, J., Jaeger, A., Mitchell, M.: System V application binary interface. AMD64 Architecture Processor Supplement, Draft v0.99 (2013)

  17. Milanov, B.: ROPium. https://github.com/Boyan-MILANOV/ropium

  18. Niemetz, A., Preiner, M., Biere, A.: Boolector 2.0. J. Satisf. Boolean Model. Comput. 9(1), 53–58 (2014)

  19. Roemer, R.G.: Finding the bad in good code: automated return-oriented programming exploit discovery. Master’s thesis, UC San Diego (2009)

  20. Salwan, J.: ROPgadget. https://github.com/JonathanSalwan/ROPgadget

  21. Schirra, S.: Ropper. https://github.com/sashs/Ropper

  22. Schuster, F., Tendyck, T., Liebchen, C., Davi, L., Sadeghi, A.R., Holz, T.: Counterfeit object-oriented programming: on the difficulty of preventing code reuse attacks in C++ applications. In: 2015 IEEE Symposium on Security and Privacy, pp. 745–762. IEEE (2015)

  23. Schwartz, E.J., Avgerinos, T., Brumley, D.: Q: Exploit hardening made easy. In: USENIX Security Symposium (2011)

  24. Schwartz, E.J., Cohen, C.F., Gennari, J.S., Schwartz, S.M.: A generic technique for automatically finding defense-aware code reuse attacks. In: ACM Conference on Computer and Communications Security (CCS) (2020)

  25. Serna, F.J., Linton, M., Stadmeyer, K.: dnsmasq stack-based buffer overflow (CVE-2017-14493). https://security.googleblog.com/2017/10/behind-masq-yet-more-dns-and-dhcp.html

  26. Shacham, H.: The geometry of innocent flesh on the bone: return-into-LIBC without function calls (on the x86). In: ACM Conference on Computer and Communications Security (CCS) (2007)

  27. Sinz, C., Falke, S., Merz, F.: A precise memory model for low-level bounded model checking. In: International Conference on Systems Software Verification (2010)

  28. SMT-LIB: Logics. https://smtlib.cs.uiowa.edu/logics-all.shtml#QF_ABV

  29. Snow, K.Z., Monrose, F., Davi, L., Dmitrienko, A., Liebchen, C., Sadeghi, A.R.: Just-in-time code reuse: on the effectiveness of fine-grained address space layout randomization. In: IEEE Symposium on Security and Privacy (2013)

  30. Solar Designer: Return-to-Libc (1997)

  31. Stump, A., Barrett, C.W., Dill, D.L., Levitt, J.: A decision procedure for an extensional theory of arrays. In: IEEE Symposium on Logic in Computer Science (2001)

  32. Szekeres, L., Payer, M., Wei, T., Song, D.: SoK: eternal war in memory. In: IEEE Symposium on Security and Privacy (2013)

  33. Vanegue, J., Heelan, S., Rolles, R.: SMT solvers in software security. In: USENIX Workshop on Offensive Technologies (WOOT) (2012)

  34. Vector 35 Inc.: Binary Ninja. https://binary.ninja/

  35. van der Veen, V., Andriesse, D., Stamatogiannakis, M., Chen, X., Bos, H., Giuffrida, C.: The dynamics of innocent flesh on the bone: code reuse ten years later. In: ACM Conference on Computer and Communications Security (CCS) (2017)

  36. Weber, T., Conchon, S., Déharbe, D., Heizmann, M., Niemetz, A., Reger, G.: The SMT competition 2015–2018. J. Satisf. Boolean Model. Comput. 11(1) (2019)

  37. Wollgast, P., Gawlik, R., Garmany, B., Kollenda, B., Holz, T.: Automated multi-architectural discovery of CFI-resistant code gadgets. In: European Symposium on Research in Computer Security (ESORICS) (2016)

Acknowledgements

This work was supported by the German Research Foundation (DFG) within the framework of the Excellence Strategy of the Federal Government and the States—EXC 2092 CaSa—39078197.

Author information

Corresponding author

Correspondence to Moritz Schloegel.

Appendices

A Modeling

Byte-wise memory reads and writes are modeled using single select and store operators, respectively. Larger reads are modeled by concatenating multiple select expressions, which we define recursively in terms of smaller read operations. Reads smaller than 64 bits into a 64-bit register are zero-extended by using concat with the zero bit vector \(bv_0\). Larger writes are similarly modeled using the composition of multiple store expressions. Table 6 shows memory accesses of various sizes. Given an array m, an address k, a value v, and a bit size \(n \in \{8, 16, 32, 64\}\), we use the names \(mem\_read_n(m, k)\) and \(mem\_write_n(m, k, v)\) to substitute the longer SMT expressions from this table.

Table 6. Encoding of memory reads and writes (m: memory, k: address, v: value).
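The recursive structure described above can be illustrated with a small concrete model. The following is only a sketch in plain Python, not the SMT encoding of Table 6: a dictionary stands in for the SMT array, `select` and `store` mimic the array operators, and little-endian byte order is assumed because the target is x86-64. Treating unmapped addresses as 0 is an assumption of this sketch.

```python
from typing import Dict

Memory = Dict[int, int]  # address -> byte value (stand-in for an SMT array)


def select(m: Memory, k: int) -> int:
    """Read one byte; unmapped addresses read as 0 (assumption of this sketch)."""
    return m.get(k, 0)


def store(m: Memory, k: int, v: int) -> Memory:
    """Return a new memory with byte k set to the low 8 bits of v."""
    updated = dict(m)  # functional update, like the SMT store operator
    updated[k] = v & 0xFF
    return updated


def mem_read(m: Memory, k: int, n: int) -> int:
    """Read n bits (n in {8, 16, 32, 64}) at address k, little-endian."""
    if n == 8:
        return select(m, k)
    half = n // 2
    low = mem_read(m, k, half)               # lower half at the lower address
    high = mem_read(m, k + half // 8, half)  # upper half right after it
    return (high << half) | low              # corresponds to concat(high, low)


def mem_write(m: Memory, k: int, v: int, n: int) -> Memory:
    """Write the low n bits of v at address k as a composition of byte stores."""
    if n == 8:
        return store(m, k, v)
    half = n // 2
    lower = mem_write(m, k, v & ((1 << half) - 1), half)
    return mem_write(lower, k + half // 8, v >> half, half)
```

For example, `mem_read(mem_write({}, 0x1000, 0x1122334455667788, 64), 0x1004, 32)` yields `0x11223344`. Because Python integers are non-negative here, a read of fewer than 64 bits already behaves like the zero-extended value.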

B dnsmasq CVE-2017-14493

In the following, we analyze the dnsmasq bug in more detail. The stack-based buffer overflow in dnsmasq is caused by the absence of a length check on the data copied to a static buffer on the stack. Figure 2 shows the vulnerable call to memcpy in function dhcp6_maybe_relay. Sending a malicious DHCPv6 packet allows an attacker to gain control over the instruction pointer by overflowing the mac buffer of static size DHCP_CHADDR_MAX (16) in the state structure present on the stack.

Fig. 2. Vulnerable memcpy in file rfc3315.c, which overflows the mac buffer in state.
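The bug class can be illustrated with a toy simulation. This sketch is not the dnsmasq code: the frame layout (a 16-byte mac buffer directly followed by a saved return address), the initial return address, and all function names are hypothetical simplifications chosen for illustration.

```python
import struct

DHCP_CHADDR_MAX = 16  # size of the mac buffer, as stated in the text


def make_frame() -> bytearray:
    """Toy frame: mac buffer directly followed by a saved 64-bit return address."""
    return bytearray(DHCP_CHADDR_MAX) + bytearray(struct.pack("<Q", 0x401000))


def saved_return(frame: bytearray) -> int:
    """Read the saved return address stored right after the mac buffer."""
    return struct.unpack_from("<Q", frame, DHCP_CHADDR_MAX)[0]


def unchecked_copy(frame: bytearray, data: bytes) -> None:
    """Copy without a length check, mimicking the missing check."""
    for i, byte in enumerate(data):
        frame[i] = byte


def checked_copy(frame: bytearray, data: bytes) -> None:
    """Copy with the length check that the vulnerable code lacks."""
    if len(data) > DHCP_CHADDR_MAX:
        raise ValueError("option data exceeds mac buffer")
    frame[: len(data)] = data
```

Copying 24 bytes with `unchecked_copy` replaces the saved return address in this toy frame, whereas `checked_copy` rejects the same input and leaves the frame untouched.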

The proof-of-concept (PoC) provided alongside the bug report [25] builds up such a DHCPv6 packet containing an OPTION6_CLIENT_MAC option holding data of excessive length. While the PoC overwrites the instruction pointer with a dummy value, injecting an arbitrary number of bytes is possible. As long as the stack is not exhausted, the packet’s content is copied and remains untouched until the instruction pointer is overwritten.

To synthesize a gadget chain, we gather the information needed to specify preconditions and postconditions by using GDB to extract the program state just before the control flow is hijacked. Table 7a shows the preconditions set for dnsmasq. The initial ret instruction, which redirects the control flow to the chain’s first gadget (gadget_0), is specified by preconditioning rip. The stack pointer rsp points to the part of the controlled buffer to which the gadget chain will be copied. In the logical formula, this stack area is a free variable.

Table 7. Preconditions and postconditions used for dnsmasq. Registers not mentioned in the preconditions are free variables, i.e., registers an attacker controls and can set to an arbitrary value.

Since we want to execute a system call to execve to spawn a shell, the final register values which the gadget chain needs to reach are specified accordingly. Table 7b shows the postconditions in preparation for calling execve(&"/bin/sh", 0, 0). Here, rip holds the address of a syscall instruction available in the program. Using the default configuration described in Sect. 5.1, SGC finds a gadget chain consisting of four gadgets within approximately 8 minutes. While most gadgets are straightforward, gadget_3 (shown in Fig. 3) writes a value to the stack outside the attacker-controlled buffer, a side effect that does not harm the chain. The arithmetic operations of the first four instructions do not change the value 0 held in register rax. In line 6, the lea instruction is used to add 0x5 to the value present in \(\texttt{rbp} = \texttt{0x55555559a1cb}\). The resulting address, 0x55555559a1d0, is a syscall instruction; the address is placed on the stack at address 0x7fffffffe240 present in register rbx. As this address is writable memory, no harm results from this side effect.
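As an illustration of what such postconditions express, the sketch below writes them as a register map and checks a final state against it. It does not reproduce Table 7b: the register assignment follows the Linux x86-64 system call convention (rax holds the syscall number, 59 for execve, and rdi, rsi, rdx hold the arguments), the syscall address is the one named in the text, and the pointer to "/bin/sh" as well as the helper names are hypothetical.

```python
SYS_EXECVE = 59                # execve syscall number on x86-64 Linux
SYSCALL_ADDR = 0x55555559A1D0  # address of the syscall instruction, from the text
BIN_SH_PTR = 0x7FFFFFFFE300    # hypothetical address of the "/bin/sh" string


def execve_postconditions(path_ptr: int, syscall_addr: int) -> dict:
    """Register values required right before executing the syscall instruction."""
    return {
        "rax": SYS_EXECVE,    # which system call to invoke
        "rdi": path_ptr,      # first argument: pointer to the path string
        "rsi": 0,             # second argument: argv = NULL
        "rdx": 0,             # third argument: envp = NULL
        "rip": syscall_addr,  # control flow must arrive at a syscall instruction
    }


def satisfies(state: dict, conditions: dict) -> bool:
    """True if every constrained register has the required value."""
    return all(state.get(reg) == value for reg, value in conditions.items())
```

Registers that do not appear in the map, such as rbx, are left unconstrained, so `satisfies` ignores them.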

Fig. 3. gadget_3 of the gadget chain used to spawn a shell in dnsmasq.
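The address arithmetic and the side effect described for gadget_3 can be checked with a few lines of Python. This sketch models only the two effects named in the text (the lea addition and the 8-byte store through rbx), not the gadget's actual instruction sequence; the function names and the dictionary-based memory are assumptions for illustration.

```python
MASK64 = (1 << 64) - 1


def lea_add(base: int, displacement: int) -> int:
    """Effective address computed by lea: base + displacement, modulo 2^64."""
    return (base + displacement) & MASK64


def simulate_side_effect(rbp: int, rbx: int) -> tuple:
    """Return the computed address and a memory map holding it at [rbx]."""
    target = lea_add(rbp, 0x5)
    memory = {}
    # Store the 64-bit value byte by byte in little-endian order.
    for offset, byte in enumerate(target.to_bytes(8, "little")):
        memory[rbx + offset] = byte
    return target, memory


target, memory = simulate_side_effect(rbp=0x55555559A1CB, rbx=0x7FFFFFFFE240)
```

With the register values from the text, `target` equals `0x55555559a1d0`, and the eight bytes starting at `0x7fffffffe240` hold that value.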

As mentioned earlier, the PoC crafts a rogue DHCPv6 packet. In order to construct the payload with our synthesized gadget chain, the length parameter is adjusted and the dummy value is replaced with the data of the gadget chain. Sending this packet to the dnsmasq DHCP server successfully spawns the shell.
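Replacing the dummy value with the chain amounts to serializing the chain's stack words as consecutive 64-bit little-endian values. The sketch below shows only this serialization step; the padding length, the fill byte, the helper name, and the second word in the example are hypothetical, and the DHCPv6 framing of the PoC is not modeled.

```python
import struct


def build_payload(padding_len: int, chain_words: list, fill: bytes = b"A") -> bytes:
    """Filler bytes followed by the chain's 64-bit words in little-endian order."""
    packed = b"".join(struct.pack("<Q", word) for word in chain_words)
    return fill * padding_len + packed


# Example with a hypothetical padding length and the syscall address from the text.
payload = build_payload(4, [0x55555559A1D0, 0x0])
```

The length field of the enclosing option would then have to cover `len(payload)` bytes, which is what adjusting the length parameter refers to.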

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Schloegel, M., Blazytko, T., Basler, J., Hemmer, F., Holz, T. (2021). Towards Automating Code-Reuse Attacks Using Synthesized Gadget Chains. In: Bertino, E., Shulman, H., Waidner, M. (eds) Computer Security – ESORICS 2021. ESORICS 2021. Lecture Notes in Computer Science, vol 12972. Springer, Cham. https://doi.org/10.1007/978-3-030-88418-5_11

  • DOI: https://doi.org/10.1007/978-3-030-88418-5_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-88417-8

  • Online ISBN: 978-3-030-88418-5

  • eBook Packages: Computer Science, Computer Science (R0)
