ABSTRACT
Predicting function names in stripped binaries is an extremely useful but challenging task, as it requires summarizing the execution behavior and semantics of a function in human language. Machine learning has recently enabled significant progress in this direction. However, existing approaches fail to model the full behavior of a function and thus generalize poorly to unseen binaries. To advance the state of the art, we present a function Symbol name prediction and binary Language Modeling (SymLM) framework, with a novel neural architecture that learns comprehensive function semantics by jointly modeling the execution behavior of the calling context and the function's instructions via a novel fusing encoder. We evaluated SymLM on 1,431,169 binary functions from 27 popular open source projects, compiled with 4 optimization levels (O0-O3) for 4 architectures (x64, x86, ARM, and MIPS) and with 4 obfuscations. SymLM outperforms state-of-the-art function name prediction tools by up to 15.4%, 59.6%, and 35.0% in precision, recall, and F1 score, respectively, with significantly better generalizability and obfuscation resistance. Ablation studies also show that our design choices (e.g., fusing components of the calling context and execution behavior) substantially boost the performance of function name prediction. Finally, our case studies demonstrate practical use cases of SymLM in analyzing firmware images.