ABSTRACT
With the proliferation of new hardware architectures and ever-evolving user requirements, the software stack is becoming increasingly bloated. In practice, only a limited subset of the supported functionality is used in any particular deployment, which presents an opportunity to eliminate unused features. Program specialization has previously been proposed as a mechanism for automatic software debloating. In this work, we show that existing program specialization techniques lack the analyses required to simplify the code of real-world programs. We present an approach that uses stronger analysis techniques to take advantage of constant configuration data, thereby enabling more effective debloating. We developed Trimmer, an application specialization tool that leverages user-provided configuration data to specialize an application to its deployment context. The specialization process attempts to eliminate the application functionality that is unused in the user-defined context. Our evaluation demonstrates that Trimmer can effectively reduce code bloat. For 13 applications spanning various domains, we observe a mean binary size reduction of 21% and a maximum reduction of 75%. We also show that specialization shrinks the attack surface for code-reuse attacks by reducing the number of exploitable gadgets. For the evaluated programs, we observe a 20% mean reduction in the total gadget count and a maximum reduction of 87%.
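The idea of specializing a program against constant configuration data can be illustrated with a toy sketch. The code below is not Trimmer's implementation; it assumes a hypothetical miniature program representation (nested tuples for `if` and `call` statements) and made-up function and flag names. It shows only the general principle: once a configuration flag is known to be constant, the branch it guards can be folded, and functions that are no longer called become candidates for removal.

```python
# Toy model of configuration-driven specialization (illustrative only).
# A program is a list of statements:
#   ("call", name)                          -> call a function
#   ("if", flag, then_block, else_block)    -> branch on a config flag

def specialize(block, config):
    """Fold branches whose flag is fixed by `config`; keep the others."""
    out = []
    for stmt in block:
        if stmt[0] == "if":
            _, flag, then_block, else_block = stmt
            if flag in config:
                # Flag is a known constant: keep only the taken branch.
                taken = then_block if config[flag] else else_block
                out.extend(specialize(taken, config))
            else:
                # Flag is unknown: keep the branch, specialize both arms.
                out.append(("if", flag,
                            specialize(then_block, config),
                            specialize(else_block, config)))
        else:
            out.append(stmt)
    return out

def called_functions(block):
    """Collect the names of all functions still called in `block`."""
    names = set()
    for stmt in block:
        if stmt[0] == "call":
            names.add(stmt[1])
        elif stmt[0] == "if":
            names |= called_functions(stmt[2]) | called_functions(stmt[3])
    return names

# Hypothetical application with two configurable features.
MAIN = [
    ("call", "parse_request"),
    ("if", "use_tls", [("call", "tls_handshake")], [("call", "plain_connect")]),
    ("if", "verbose", [("call", "log_debug")], []),
    ("call", "send_response"),
]
ALL_FUNCTIONS = {"parse_request", "tls_handshake", "plain_connect",
                 "log_debug", "send_response"}

# Deployment context supplied by the user (assumed values).
config = {"use_tls": False, "verbose": False}

specialized = specialize(MAIN, config)
removed = ALL_FUNCTIONS - called_functions(specialized)
print(specialized)
print(sorted(removed))
```

With both flags fixed to `False`, the specialized program keeps only the calls to `parse_request`, `plain_connect`, and `send_response`, so `tls_handshake` and `log_debug` are unreferenced and can be dropped. A real tool has to establish the same facts for pointers, loops, and library calls, which is where the stronger analyses mentioned above come in.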