DOI: 10.1145/3238147.3238160
research-article

TRIMMER: application specialization for code debloating

Published: 03 September 2018

ABSTRACT

With the proliferation of new hardware architectures and ever-evolving user requirements, the software stack is becoming increasingly bloated. In practice, only a limited subset of the supported functionality is utilized in a particular usage context, thereby presenting an opportunity to eliminate unused features. In the past, program specialization has been proposed as a mechanism for enabling automatic software debloating. In this work, we show how existing program specialization techniques lack the analyses required for providing code simplification for real-world programs. We present an approach that uses stronger analysis techniques to take advantage of constant configuration data, thereby enabling more effective debloating. We developed Trimmer, an application specialization tool that leverages user-provided configuration data to specialize an application to its deployment context. The specialization process attempts to eliminate the application functionality that is unused in the user-defined context. Our evaluation demonstrates Trimmer can effectively reduce code bloat. For 13 applications spanning various domains, we observe a mean binary size reduction of 21% and a maximum reduction of 75%. We also show specialization reduces the surface for code-reuse attacks by reducing the number of exploitable gadgets. For the evaluated programs, we observe a 20% mean reduction in the total gadget count and a maximum reduction of 87%.
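The core idea in the abstract, treating configuration data as constants so that unused functionality can be removed, can be illustrated with a minimal sketch. Everything here is hypothetical: the tuple-based mini-IR, the function names (`fold`, `callees`, `specialize`), and the sample program are assumptions made for illustration and do not reflect Trimmer's actual implementation or analyses.

```python
# Toy illustration only: a hypothetical mini-IR, not Trimmer's representation.
# A statement is ("call", fn_name) or ("if", flag_name, then_body, else_body).

def fold(body, config):
    """Replace branches on known configuration flags with the taken side."""
    out = []
    for stmt in body:
        if stmt[0] == "if":
            _, flag, then_body, else_body = stmt
            if flag in config:
                # Flag is a known constant: keep only the branch that runs.
                out.extend(fold(then_body if config[flag] else else_body, config))
            else:
                # Flag is unknown at specialization time: keep both sides.
                out.append(("if", flag, fold(then_body, config), fold(else_body, config)))
        else:
            out.append(stmt)
    return out

def callees(body):
    """Collect every function name called anywhere in a body."""
    found = set()
    for stmt in body:
        if stmt[0] == "call":
            found.add(stmt[1])
        else:
            found |= callees(stmt[2]) | callees(stmt[3])
    return found

def specialize(program, config, entry="main"):
    """Fold constant branches, then drop functions unreachable from entry."""
    folded = {name: fold(body, config) for name, body in program.items()}
    reachable, worklist = set(), [entry]
    while worklist:
        name = worklist.pop()
        if name in reachable or name not in folded:
            continue
        reachable.add(name)
        worklist.extend(callees(folded[name]))
    return {name: folded[name] for name in folded if name in reachable}

# Hypothetical application: TLS support is only needed when use_tls is set.
program = {
    "main": [("if", "use_tls", [("call", "tls_connect")], [("call", "tcp_connect")]),
             ("if", "verbose", [("call", "log")], [])],
    "tls_connect": [("call", "load_certs")],
    "load_certs": [],
    "tcp_connect": [],
    "log": [],
}

specialized = specialize(program, {"use_tls": False})
print(sorted(specialized))  # → ['log', 'main', 'tcp_connect']
```

With `use_tls` fixed to `False`, the TLS path and its helper become unreachable and are dropped, while the branch on the unknown `verbose` flag is kept intact. Real programs need far stronger analyses than this sketch, which is the gap the abstract says existing specialization techniques leave open.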

    • Published in

      ASE '18: Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering
      September 2018
      955 pages
      ISBN: 9781450359375
      DOI: 10.1145/3238147

      Copyright © 2018 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      Overall Acceptance Rate: 82 of 337 submissions, 24%
