Verification of Functional Correctness of Code Diversification Techniques

  • Conference paper
NASA Formal Methods (NFM 2021)

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 12673)

Abstract

Code diversification techniques are a popular defense against code-reuse attacks. Most code diversification research analyzes non-functional properties, such as whether a technique improves security. This paper provides a methodology to verify functional equivalence between an original binary and its diversified counterpart. We present a formal notion of binary equivalence that is resilient to diversification, together with an algorithm that checks whether two binaries, one original and one diversified, satisfy that notion of equivalence. The purpose of our work is to allow untrusted diversification techniques to be used in a safety-critical context. We apply the methodology to three state-of-the-art diversification techniques applied to the GNU Coreutils package. Overall, the results show that our method can prove functional equivalence for 85,315 functions in the analyzed binaries.

Acknowledgement

Project information can be found at: https://llrm-project.org/. All source code and scripts are available at: https://github.com/jjang3/NFM_2021. This work is supported in part by the US Office of Naval Research (ONR) under grant N00014-17-1-2297 and NSWCDD/NEEC under grant N00174-20-1-0009.

Author information

Correspondence to Jae-Won Jang.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Jang, JW., Verbeek, F., Ravindran, B. (2021). Verification of Functional Correctness of Code Diversification Techniques. In: Dutle, A., Moscato, M.M., Titolo, L., Muñoz, C.A., Perez, I. (eds) NASA Formal Methods. NFM 2021. Lecture Notes in Computer Science, vol 12673. Springer, Cham. https://doi.org/10.1007/978-3-030-76384-8_11

  • DOI: https://doi.org/10.1007/978-3-030-76384-8_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-76383-1

  • Online ISBN: 978-3-030-76384-8

  • eBook Packages: Computer Science, Computer Science (R0)
