AttnCall: Refining Indirect Call Targets in Binaries with Attention

Sun, Rui; Guo, Yinggang; Wang, Zicheng; Zeng, Qingkai

doi:10.1007/978-3-031-51482-1_20

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14347)

Included in the following conference series:

European Symposium on Research in Computer Security

Abstract

Accurate control flow graphs are crucial for effective binary program analysis, and resolving indirect function call targets is the major challenge in constructing them. Existing static analysis methods rely heavily on domain-specific patterns, which produces an abundance of false positive edges because expert knowledge is limited. Learning-based approaches, in turn, often depend on heuristic analysis during the code representation stage, which prevents the model from fully comprehending program semantics.

To address these limitations, this paper presents AttnCall, a novel neural network learning framework that leverages the attention mechanism to automatically learn the matching relationship between function callsites and callees’ context semantics. AttnCall refines the identification of indirect call targets through the learned matching patterns, eliminating the drawbacks of existing techniques. Additionally, we propose an end-to-end code representation scheme that effectively embeds the semantics of callsites and callees without relying on heuristic rules.
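The matching idea described above can be illustrated with a small sketch. This is not the AttnCall architecture, which the abstract does not detail; it only shows generic scaled dot-product attention in which a callsite embedding attends over the token embeddings of a candidate callee, followed by a sigmoid score. The names `attend` and `match_score`, the two-dimensional toy vectors, and the scoring rule are all assumptions made for illustration.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attend(query, keys, values):
    # Scaled dot-product attention for a single query vector.
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, k) / scale for k in keys])
    dim = len(values[0])
    context = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return context, weights

def match_score(callsite_vec, callee_token_vecs):
    # Hypothetical scoring rule (an assumption, not the paper's):
    # sigmoid of the dot product between the callsite embedding
    # and the attention-weighted callee context.
    context, weights = attend(callsite_vec, callee_token_vecs, callee_token_vecs)
    score = 1.0 / (1.0 + math.exp(-dot(callsite_vec, context)))
    return score, weights

# Toy embeddings; a real system would learn these from binary code.
callsite = [1.0, 0.0]
callee_a = [[1.0, 0.0], [0.0, 1.0]]   # first token aligns with the callsite
callee_b = [[-1.0, 0.0], [0.0, 1.0]]  # first token opposes the callsite

score_a, weights_a = match_score(callsite, callee_a)
score_b, weights_b = match_score(callsite, callee_b)
print(score_a > score_b)  # candidate A should rank above candidate B
```

The attention weights returned alongside the score show which callee tokens contributed most to the match, which is the kind of signal an attention-based model can expose for interpretation.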

The evaluation of AttnCall focuses on the task of predicting indirect function call targets. The results demonstrate that AttnCall surpasses state-of-the-art approaches, achieving 31.4% higher precision and 5% higher recall. Moreover, AttnCall enhances model interpretability, allowing for a better understanding of the underlying analysis process.
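For readers unfamiliar with the two metrics, the following sketch shows how precision and recall are conventionally computed over predicted indirect call edges. The edge representation as `(callsite, callee)` pairs, the function name `precision_recall`, and the sample data are assumptions for illustration; the paper's actual evaluation setup is not described in the abstract.

```python
def precision_recall(predicted, actual):
    # Edges are (callsite, callee) pairs; duplicates are ignored.
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    # Precision: fraction of predicted edges that are real targets.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: fraction of real targets that were predicted.
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall

# Made-up example data, not results from the paper.
predicted_edges = [("cs1", "f"), ("cs1", "g"), ("cs2", "h")]
actual_edges = [("cs1", "f"), ("cs2", "h"), ("cs2", "k")]

precision, recall = precision_recall(predicted_edges, actual_edges)
print(precision, recall)
```

A false positive edge, such as `("cs1", "g")` here, lowers precision, while a missed target, such as `("cs2", "k")`, lowers recall.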

Notes

1. https://github.com/anonmai/AttnCall

Author information

Authors and Affiliations

State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China
Rui Sun, Yinggang Guo, Zicheng Wang & Qingkai Zeng

Authors

Rui Sun
Yinggang Guo
Zicheng Wang
Qingkai Zeng

Corresponding author

Correspondence to Rui Sun.

Editor information

Editors and Affiliations

University of California, Irvine, CA, USA
Gene Tsudik
University of Padua, Padua, Italy
Mauro Conti
Delft University of Technology, Delft, The Netherlands
Kaitai Liang
Delft University of Technology, Delft, The Netherlands
Georgios Smaragdakis

About this paper

Cite this paper

Sun, R., Guo, Y., Wang, Z., Zeng, Q. (2024). AttnCall: Refining Indirect Call Targets in Binaries with Attention. In: Tsudik, G., Conti, M., Liang, K., Smaragdakis, G. (eds) Computer Security – ESORICS 2023. ESORICS 2023. Lecture Notes in Computer Science, vol 14347. Springer, Cham. https://doi.org/10.1007/978-3-031-51482-1_20

DOI: https://doi.org/10.1007/978-3-031-51482-1_20
Published: 11 January 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-51481-4
Online ISBN: 978-3-031-51482-1
eBook Packages: Computer Science, Computer Science (R0)
