Abstract
Code diversification techniques are popular code-reuse attacks defense. The majority of code diversification research focuses on analyzing non-functional properties, such as whether the technique improves security. This paper provides a methodology to verify functional equivalence between the original and a diversified binary. We present a formal notion of binary equivalence resilient to diversification. Moreover, an algorithm is presented that checks whether two binaries – one original and one diversified – satisfy that notion of equivalence. The purpose of our work is to allow untrusted diversification techniques in a safety-critical context. We apply the methodology to three state-of-the-art diversification techniques used on the GNU Coreutils package. Overall, results show that our method can prove functional equivalence for 85,315 functions in the analyzed binaries.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
References
Baier, C., Katoen, J.P.: Principles of Model Checking. Representation and Mind Series. The MIT Press, Cambridge (2008)
Bhatkar, S., DuVarney, D.C., Sekar, R.: Address obfuscation: an efficient approach to combat a board range of memory error exploits. In: Proceedings of the 12th Conference on USENIX Security Symposium - Volume 12, SSYM 2003, p. 8. USENIX Association, USA (2003)
Bhatkar, S., Sekar, R., DuVarney, D.C.: Efficient techniques for comprehensive protection from memory error exploits. In: Proceedings of the 14th Conference on USENIX Security Symposium - Volume 14, SSYM 2005, p. 17. USENIX Association, USA (2005)
Browne, M., Clarke, E., Grümberg, O.: Characterizing finite Kripke structures in propositional temporal logic. Theor. Comput. Sci. 59(1), 115–131 (1988). https://doi.org/10.1016/0304-3975(88)90098-9
Chae, D.K., Ha, J., Kim, S.W., Kang, B., Im, E.G.: Software plagiarism detection: a graph-based approach. In: Proceedings of the 22nd ACM International Conference on Information & Knowledge Management, CIKM 2013, pp. 1577–1580. ACM, New York (2013). https://doi.org/10.1145/2505515.2507848
Checkoway, S., Davi, L., Dmitrienko, A., Sadeghi, A.R., Shacham, H., Winandy, M.: Return-oriented programming without returns. In: Proceedings of the 17th ACM Conference on Computer and Communications Security, CCS 2010, pp. 559–572. ACM, New York (2010). https://doi.org/10.1145/1866307.1866370
Churchill, B., Padon, O., Sharma, R., Aiken, A.: Semantic program alignment for equivalence checking. In: Proceedings of the 40th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2019, pp. 1027–1040. ACM, New York (2019). https://doi.org/10.1145/3314221.3314596
Clutterbuck, D.L., Carré, B.A.: The verification of low-level code. Softw. Eng. J. 3(3), 97–111 (1988). https://doi.org/10.1049/sej.1988.0012
Cohen, F.B.: Operating System Protection Through Program Evolution, vol. 12, pp. 565–584. Elsevier Advanced Technology Publications, GBR (1993). https://doi.org/10.1016/0167-4048(93)90054-9
Crane, S., Homescu, A., Larsen, P.: Code randomization: haven’t we solved this problem yet? In: 2016 IEEE Cybersecurity Development (SecDev), pp. 124–129 (2016)
Crane, S., et al.: Readactor: practical code randomization resilient to memory disclosure. In: Proceedings of the 2015 IEEE Symposium on Security and Privacy, SP 2015, pp. 763–780. IEEE Computer Society, Washington, DC (2015). https://doi.org/10.1109/SP.2015.52
David, Y., Partush, N., Yahav, E.: Statistical similarity of binaries. In: Proceedings of the 37th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2016, pp. 266–280. ACM, New York (2016) https://doi.org/10.1145/2908080.2908126
Egele, M., Woo, M., Chapman, P., Brumley, D.: Blanket execution: dynamic similarity testing for program binaries and components. In: 23rd USENIX Security Symposium (USENIX Security 2014), San Diego, CA, pp. 303–317. USENIX Association, August 2014
Fernandez, J.C., Mounier, L.: Verifying bisimulations “on the fly”. FORTE. 90, 95–110 (1990)
Forrest, S., Somayaji, A., Ackley, D.: Building diverse computer systems. In: Proceedings of the 6th Workshop on Hot Topics in Operating Systems (HotOS-VI), HOTOS 1997, p. 67. IEEE Computer Society, Washington, DC (1997). https://doi.org/10.1109/hotos.1997.595185
Gao, D., Reiter, M.K., Song, D.: BinHunt: automatically finding semantic differences in binary programs. In: Chen, L., Ryan, M.D., Wang, G. (eds.) ICICS 2008. LNCS, vol. 5308, pp. 238–255. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-88625-9_16
Giuffrida, C., Kuijsten, A., Tanenbaum, A.S.: Enhanced operating system security through efficient and fine-grained address space randomization. In: Proceedings of the 21st USENIX Conference on Security Symposium, Security 2012, p. 40. USENIX Association, USA (2012)
Hiser, J., Nguyen-Tuong, A., Co, M., Hall, M., Davidson, J.W.: ILR: where’d my gadgets go? In: 2012 IEEE Symposium on Security and Privacy, pp. 571–585 (2012)
Hiser, J., Nguyen-Tuong, A., Hawkins, W., McGill, M., Co, M., Davidson, J.: Zipr++: exceptional binary rewriting. In: Proceedings of the 2017 Workshop on Forming an Ecosystem Around Software Transformation, FEAST 2017, pp. 9–15. Association for Computing Machinery, New York (2017). https://doi.org/10.1145/3141235.3141240
Homescu, A., Neisius, S., Larsen, P., Brunthaler, S., Franz, M.: Profile-guided automated software diversity. In: Proceedings of the 2013 IEEE/ACM International Symposium on Code Generation and Optimization (CGO), CGO 2013, pp. 1–11. IEEE Computer Society, Washington, DC (2013). https://doi.org/10.1109/CGO.2013.6494997
Hosseinzadeh, S., et al.: Diversification and obfuscation techniques for software security: a systematic literature review. Inf. Softw. Technol. 104, 72–93 (2018)
Jackson, T., Homescu, A., Crane, S., Larsen, P., Brunthaler, S., Franz, M.: Diversifying the software stack using randomized NOP insertion. In: Jajodia, S., Ghosh, A.K., Subrahmanian, V.S., Swarup, V., Wang, C., Wang, X.S. (eds.) Moving Target Defense. Advances in Information Security, vol. 100, pp. 151–173. Springer, New York (2013). https://doi.org/10.1007/978-1-4614-5416-8_8
Jackson, T., et al.: Compiler-Generated Software Diversity. In: Jajodia, S., Ghosh, A., Swarup, V., Wang, C., Wang, X. (eds.) Moving Target Defense. Advances in Information Security, vol. 54, pp. 77–98. Springer, New York (2011). https://doi.org/10.1007/978-1-4614-0977-9_4
Junod, P., Rinaldini, J., Wehrli, J., Michielin, J.: Obfuscator-LLVM - software protection for the masses. In: Wyseur, B. (ed.) Proceedings of the IEEE/ACM 1st International Workshop on Software Protection, SPRO 2015, Firenze, Italy, 19th May 2015, pp. 3–9. IEEE (2015). https://doi.org/10.1109/SPRO.2015.10
Kil, C., Jun, J., Bookholt, C., Xu, J., Ning, P.: Address space layout permutation (ASLP): towards fine-grained randomization of commodity software. In: 2006 22nd Annual Computer Security Applications Conference (ACSAC 2006), pp. 339–348 (2006)
Komondoor, R., Horwitz, S.: Using slicing to identify duplication in source code. In: Cousot, P. (ed.) SAS 2001. LNCS, vol. 2126, pp. 40–56. Springer, Heidelberg (2001). https://doi.org/10.1007/3-540-47764-0_3
Koo, H., Chen, Y., Lu, L., Kemerlis, V.P., Polychronakis, M.: Compiler-assisted code randomization. In: 2018 IEEE Symposium on Security and Privacy (SP), pp. 461–477, May 2018. https://doi.org/10.1109/SP.2018.00029
Larsen, P., Homescu, A., Brunthaler, S., Franz, M.: SoK: automated software diversity. In: 2014 IEEE Symposium on Security and Privacy, pp. 276–291, May 2014. https://doi.org/10.1109/SP.2014.25
Leroy, X.: Formal verification of a realistic compiler. Commun. ACM 52(7), 107–115 (2009). https://doi.org/10.1145/1538788.1538814
Li, L., Feng, H., Zhuang, W., Meng, N., Ryder, B.: CCLearner: a deep learning-based clone detection approach. In: 2017 IEEE International Conference on Software Maintenance and Evolution (ICSME), pp. 249–260, September 2017. https://doi.org/10.1109/ICSME.2017.46
Liang, Yu., et al.: Stack layout randomization with minimal rewriting of Android binaries. In: Kwon, S., Yun, A. (eds.) ICISC 2015. LNCS, vol. 9558, pp. 229–245. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-30840-1_15
Lim, J.P., Nagarakatte, S.: Automatic equivalence checking for assembly implementations of cryptography libraries, pp. 37–49 (2019). https://doi.org/10.1109/cgo.2019.8661180
Liu, B., et al.: \(\alpha \)diff: cross-version binary code similarity detection with DNN. In: Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering, ASE 2018, pp. 667–678. ACM, New York (2018). https://doi.org/10.1145/3238147.3238199
Liu, C., Chen, C., Han, J., Yu, P.S.: GPLAG: detection of software plagiarism by program dependence graph analysis. In: Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2006, pp. 872–881. ACM, New York (2006). https://doi.org/10.1145/1150402.1150522
Luo, L., Ming, J., Wu, D., Liu, P., Zhu, S.: Semantics-based obfuscation-resilient binary code similarity comparison with applications to software plagiarism detection. In: Proceedings of the 22Nd ACM SIGSOFT International Symposium on Foundations of Software Engineering, FSE 2014, pp. 389–400. ACM, New York (2014). https://doi.org/10.1145/2635868.2635900
Pappas, V., Polychronakis, M., Keromytis, A.D.: Practical software diversification using in-place code randomization. In: Jajodia, S., Ghosh, A., Subrahmanian, V., Swarup, V., Wang, C., Wang, X. (eds.) Moving Target Defense II. Advances in Information Security, vol. 100. Springer, New York (2013). https://doi.org/10.1007/978-1-4614-5416-8_9
Peng, H., Mou, L., Li, G., Liu, Y., Zhang, L., Jin, Z.: Building program vector representations for deep learning. In: Zhang, S., Wirsing, M., Zhang, Z. (eds.) KSEM 2015. LNCS (LNAI), vol. 9403, pp. 547–553. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-25159-2_49
Pewny, J., Garmany, B., Gawlik, R., Rossow, C., Holz, T.: Cross-architecture bug search in binary executables. In: 2015 IEEE Symposium on Security and Privacy, pp. 709–724, May 2015. https://doi.org/10.1109/SP.2015.49
Ramos, D.A., Engler, D.R.: Practical, low-effort equivalence verification of real code. In: Gopalakrishnan, G., Qadeer, S. (eds.) CAV 2011. LNCS, vol. 6806, pp. 669–685. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-22110-1_55
Roy, C.K., Cordy, J.R., Koschke, R.: Comparison and evaluation of code clone detection techniques and tools: a qualitative approach. Sci. Comput. Program. 74(7), 470–495 (2009). https://doi.org/10.1016/j.scico.2009.02.007
Shacham, H.: The geometry of innocent flesh on the bone: return-into-libc without function calls (on the x86). In: Proceedings of the 14th ACM Conference on Computer and Communications Security, CCS 2007, pp. 552–561. ACM, New York (2007). https://doi.org/10.1145/1315245.1315313
Shalev, N., Partush, N.: Binary similarity detection using machine learning. In: Proceedings of the 13th Workshop on Programming Languages and Analysis for Security, PLAS 2018, pp. 42–47. ACM, New York (2018). https://doi.org/10.1145/3264820.3264821
Shoshitaishvili, Y., et al.: SOK: (state of) the art of war: offensive techniques in binary analysis. In: 2016 IEEE Symposium on Security and Privacy (SP), pp. 138–157 (2016)
Siegel, S.F., Mironova, A., Avrunin, G.S., Clarke, L.A.: Using model checking with symbolic execution to verify parallel numerical programs. In: Proceedings of the 2006 International Symposium on Software Testing and Analysis, ISSTA 2006, pp. 157–168. ACM, New York (2006). https://doi.org/10.1145/1146238.1146256
Wang, R., et al.: Ramblr: making reassembly great again. In: The Network and Distributed System Security Symposium, NDSS 2017 (2017). https://doi.org/10.14722/ndss.2017.23225
Wang, S., Wang, P., Wu, D.: Composite software diversification. In: 2017 IEEE International Conference on Software Maintenance and Evolution (ICSME), pp. 284–294 (2017)
Wartell, R., Mohan, V., Hamlen, K.W., Lin, Z.: Binary stirring: self-randomizing instruction addresses of legacy x86 binary code. In: Proceedings of the 2012 ACM Conference on Computer and Communications Security, CCS 2012, pp. 157–168. Association for Computing Machinery, New York (2012). https://doi.org/10.1145/2382196.2382216
Xu, X., Liu, C., Feng, Q., Yin, H., Song, L., Song, D.: Neural network-based graph embedding for cross-platform binary code similarity detection. In: Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, CCS 2017, pp. 363–376. ACM, New York (2017). https://doi.org/10.1145/3133956.3134018
Xu, Z., Miller, B.P., Reps, T.: Safety checking of machine code. In: Proceedings of the ACM SIGPLAN 2000 Conference on Programming Language Design and Implementation, PLDI 2000, pp. 70–82. ACM, New York (2000). https://doi.org/10.1145/349299.349313
Zuo, F., Li, X., Zhang, Z., Young, P., Luo, L., Zeng, Q.: Neural machine translation inspired binary code similarity comparison beyond function pairs. CoRR abs/1808.04706 (2018). https://arxiv.org/abs/1808.04706
Acknowledgement
Project information can be found at: https://llrm-project.org/. All source codes and scripts are available at: https://github.com/jjang3/NFM_2021. This work is supported in part by the US Office of Naval Research (ONR) under grant N00014-17-1-2297 and NSWCDD/NEEC under grant N00174-20-1-0009.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Jang, JW., Verbeek, F., Ravindran, B. (2021). Verification of Functional Correctness of Code Diversification Techniques. In: Dutle, A., Moscato, M.M., Titolo, L., Muñoz, C.A., Perez, I. (eds) NASA Formal Methods. NFM 2021. Lecture Notes in Computer Science(), vol 12673. Springer, Cham. https://doi.org/10.1007/978-3-030-76384-8_11
Download citation
DOI: https://doi.org/10.1007/978-3-030-76384-8_11
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-76383-1
Online ISBN: 978-3-030-76384-8
eBook Packages: Computer ScienceComputer Science (R0)