A lightweight framework for function name reassignment based on large-scale stripped binaries

Authors:
Han Gao, University of Science and Technology of China, China (ORCID: 0000-0001-9798-6985)
Shaoyin Cheng, University of Science and Technology of China, China
Yinxing Xue, University of Science and Technology of China, China
Weiming Zhang, University of Science and Technology of China, China (ORCID: 0000-0001-5576-6108)

ISSTA 2021: Proceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis, July 2021, Pages 607–619. https://doi.org/10.1145/3460319.3464804

Published: 11 July 2021

ABSTRACT

Software in the wild is usually released as stripped binaries that contain no debug information (e.g., function names). This paper studies the problem of reassigning descriptive names to functions to facilitate reverse engineering. Since this problem is essentially a data-driven prediction task, persuasive research should be based on sufficiently large-scale and diverse data. However, prior studies have been limited to small-scale datasets because their techniques depend on heavyweight binary analysis, which makes them impractical for large binaries and for large collections of binaries.

This paper presents the Neural Function Rename Engine (NFRE), a lightweight framework for function name reassignment that utilizes both sequential and structural information of assembly code. NFRE models assembly code with fine-grained, easily acquired features, making it more effective and efficient than existing techniques. In addition, we construct a large-scale dataset and present two data-preprocessing approaches that improve its usability. Thanks to its lightweight design, NFRE can be trained efficiently on the large-scale dataset and thus generalizes better to unknown functions. Comparative experiments show that NFRE outperforms two existing techniques by relative improvements of 32% and 16%, respectively, while spending much less time on binary analysis.

Index Terms

Social and professional topics > Computing / technology policy > Intellectual property > Software reverse engineering
Theory of computation > Semantics and reasoning > Program reasoning > Program analysis

Published in
ISSTA 2021: Proceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis
July 2021, 685 pages
ISBN: 9781450384599
DOI: 10.1145/3460319
General Chair: Cristian Cadar, Imperial College London, UK
Program Chair: Xiangyu Zhang, Purdue University, USA
Copyright © 2021 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Badges
- Distinguished Paper
Author Tags
Binary Analysis
Neural Networks
Reverse Engineering
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 58 of 213 submissions, 27%