Verification of Functional Correctness of Code Diversification Techniques

Jang, Jae-Won; Verbeek, Freek; Ravindran, Binoy

doi:10.1007/978-3-030-76384-8_11

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 12673)

Included in the following conference series:

NASA Formal Methods Symposium

Abstract

Code diversification techniques are a popular defense against code-reuse attacks. The majority of code diversification research focuses on analyzing non-functional properties, such as whether the technique improves security. This paper provides a methodology to verify functional equivalence between an original binary and a diversified binary. We present a formal notion of binary equivalence that is resilient to diversification. Moreover, we present an algorithm that checks whether two binaries – one original and one diversified – satisfy that notion of equivalence. The purpose of our work is to allow untrusted diversification techniques to be used in a safety-critical context. We apply the methodology to three state-of-the-art diversification techniques used on the GNU Coreutils package. Overall, results show that our method can prove functional equivalence for 85,315 functions in the analyzed binaries.
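
As an informal illustration of what functional equivalence between an original and a diversified program means, the following toy sketch compares two straight-line instruction sequences on a made-up register machine. It is not the paper's algorithm or its notion of equivalence: the instruction set, the register names (`a`, `b`, `t`, `u`), and the helper names `run` and `equivalent_on_outputs` are all assumptions introduced here, and the check enumerates a small input space instead of proving anything about real binaries.

```python
from itertools import product

# Toy register machine (hypothetical; not the paper's binary model).
# Instructions: ("mov", dst, src_or_imm), ("add", dst, src_or_imm), ("nop",)

def run(program, regs):
    """Execute a straight-line toy program and return the final register state."""
    state = dict(regs)
    for ins in program:
        op = ins[0]
        if op == "nop":
            continue  # a NOP changes no register
        _, dst, src = ins
        value = state[src] if isinstance(src, str) else src
        if op == "mov":
            state[dst] = value
        elif op == "add":
            state[dst] = (state[dst] + value) & 0xFF  # 8-bit wraparound
        else:
            raise ValueError(f"unknown opcode: {op}")
    return state

def equivalent_on_outputs(p, q, outputs, inputs=("a", "b"), width=4):
    """Compare p and q on every `width`-bit input, looking only at `outputs`."""
    for values in product(range(2 ** width), repeat=len(inputs)):
        init = {"a": 0, "b": 0, "t": 0, "u": 0}
        init.update(zip(inputs, values))
        rp, rq = run(p, init), run(q, init)
        if any(rp[r] != rq[r] for r in outputs):
            return False
    return True

# Original: a := a + b, using scratch register t.
original = [("mov", "t", "a"), ("add", "t", "b"), ("mov", "a", "t")]
# Diversified variant: NOPs inserted and a different scratch register (u).
diversified = [("nop",), ("mov", "u", "a"), ("add", "u", "b"),
               ("nop",), ("mov", "a", "u")]
# A faulty transformation that computes a := a + a instead.
broken = [("mov", "u", "a"), ("add", "u", "a"), ("mov", "a", "u")]

print(equivalent_on_outputs(original, diversified, outputs=("a",)))
print(equivalent_on_outputs(original, broken, outputs=("a",)))
print(equivalent_on_outputs(original, diversified, outputs=("a", "t")))
```

In this toy setting the diversified variant agrees with the original on the output register `a` but not on the scratch register `t`, which hints at why a useful equivalence notion has to tolerate differences that diversification introduces on purpose while still rejecting a transformation that changes the computed result.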

Acknowledgement

Project information can be found at: https://llrm-project.org/. All source code and scripts are available at: https://github.com/jjang3/NFM_2021. This work is supported in part by the US Office of Naval Research (ONR) under grant N00014-17-1-2297 and NSWCDD/NEEC under grant N00174-20-1-0009.

Author information

Authors and Affiliations

Virginia Tech, Blacksburg, VA, USA
Jae-Won Jang, Freek Verbeek & Binoy Ravindran
Open University of The Netherlands, Heerlen, The Netherlands
Freek Verbeek

Corresponding author

Correspondence to Jae-Won Jang.

Editor information

Editors and Affiliations

NASA Langley Research Center, Hampton, VA, USA
Aaron Dutle
National Institute of Aerospace, Hampton, VA, USA
Mariano M. Moscato
National Institute of Aerospace, Hampton, VA, USA
Laura Titolo
NASA Langley Research Center, Hampton, VA, USA
César A. Muñoz
National Institute of Aerospace, Hampton, VA, USA
Ivan Perez

About this paper

Cite this paper

Jang, JW., Verbeek, F., Ravindran, B. (2021). Verification of Functional Correctness of Code Diversification Techniques. In: Dutle, A., Moscato, M.M., Titolo, L., Muñoz, C.A., Perez, I. (eds) NASA Formal Methods. NFM 2021. Lecture Notes in Computer Science, vol 12673. Springer, Cham. https://doi.org/10.1007/978-3-030-76384-8_11

DOI: https://doi.org/10.1007/978-3-030-76384-8_11
Published: 19 May 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-76383-1
Online ISBN: 978-3-030-76384-8
eBook Packages: Computer Science, Computer Science (R0)
