
BinDNN: Resilient Function Matching Using Deep Learning

  • Conference paper
Security and Privacy in Communication Networks (SecureComm 2016)

Abstract

Determining if two functions taken from different compiled binaries originate from the same function in the source code has many applications to malware reverse engineering. Namely, this process allows an analyst to filter large swaths of code, removing functions that have been previously observed or those that originate in shared or trusted libraries. However, this task is challenging due to the myriad factors that influence the translation between source code and assembly instructions—the instruction stream created by a compiler is heavily influenced by a number of factors including optimizations, target platforms, and runtime constraints. In this paper, we seek to advance methods for reliably testing the equivalence of functions found in different executables. By leveraging advances in deep learning and natural language processing, we design and evaluate a novel algorithm, BinDNN, that is resilient to variations in compiler, compiler optimization level, and architecture. We show that BinDNN is effective both in isolation and in conjunction with existing approaches. In the case of the latter, we boost performance by 109% when combining BinDNN with BinDiff to compare functions across architectures. This result—an improvement of 32% for BinDNN and 185% for BinDiff—demonstrates the utility of employing multiple orthogonal approaches to function matching.


Notes

  1. We paired function representations from gcc -O0 against gcc -O1, -O2, and -O3. See Sect. 4.1 for a description of the data set.

  2. http://deeplearning.net/software/theano/library/tensor/nnet/nnet.html#tensor.nnet.binary_crossentropy.

  3. This complicates the process of deciding when BinDiff has correctly or incorrectly identified a function. To make this decision, we first provided BinDiff with unstripped binaries, where it would successfully match all functions via name hashing, and saved the effective addresses of each matched pair of functions. Using these effective addresses, we were then able to verify matches made by BinDiff on the stripped binaries.
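The binary cross-entropy loss linked in footnote 2 above can be sketched in plain NumPy. This is an illustrative reimplementation, not the paper's code; the `eps` clipping constant is a common numerical-stability convention and is assumed here.

```python
import numpy as np

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Binary cross-entropy over predicted probabilities, as used for
    two-class outputs such as a match/no-match decision. `eps` clips
    predictions away from 0 and 1 to keep the logs finite (an assumed
    convention, not taken from the paper)."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

# A confident correct prediction yields a small loss,
# while a confident wrong prediction yields a large one.
low = binary_crossentropy(np.array([1.0]), np.array([0.99]))
high = binary_crossentropy(np.array([1.0]), np.array([0.01]))
```

Here `low` is roughly -log(0.99) ≈ 0.01, while `high` is roughly -log(0.01) ≈ 4.6, which is what drives the network toward confident, correct match decisions during training.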

References

  1. Bao, T., Burket, J., Woo, M., Turner, R., Brumley, D.: ByteWeight: learning to recognize functions in binary code. In: USENIX Security Symposium (2014)

  2. Bengio, Y., Simard, P., Frasconi, P.: Learning long-term dependencies with gradient descent is difficult. IEEE Trans. Neural Netw. 5(2), 157–166 (1994)

  3. Bourquin, M., King, A., Robbins, E.: BinSlayer: accurate comparison of binary executables. In: Proceedings of the 2nd ACM SIGPLAN Program Protection and Reverse Engineering Workshop, p. 4. ACM (2013)

  4. Ciregan, D., Meier, U., Schmidhuber, J.: Multi-column deep neural networks for image classification. In: 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3642–3649. IEEE (2012)

  5. Collobert, R., Weston, J.: A unified architecture for natural language processing: deep neural networks with multitask learning. In: Proceedings of the 25th International Conference on Machine Learning, pp. 160–167. ACM (2008)

  6. Dullien, T., Rolles, R.: Graph-based comparison of executable objects (English version). SSTIC 5, 1–3 (2005)

  7. Duygulu, P., Barnard, K., Freitas, J.F.G., Forsyth, D.A.: Object recognition as machine translation: learning a lexicon for a fixed image vocabulary. In: Heyden, A., Sparr, G., Nielsen, M., Johansen, P. (eds.) ECCV 2002. LNCS, vol. 2353, pp. 97–112. Springer, Heidelberg (2002). doi:10.1007/3-540-47979-1_7

  8. Egele, M., Woo, M., Chapman, P., Brumley, D.: Blanket execution: dynamic similarity testing for program binaries and components. In: USENIX Security Symposium (2014)

  9. Gao, D., Reiter, M.K., Song, D.: BinHunt: automatically finding semantic differences in binary programs. In: Chen, L., Ryan, M.D., Wang, G. (eds.) ICICS 2008. LNCS, vol. 5308, pp. 238–255. Springer, Heidelberg (2008). doi:10.1007/978-3-540-88625-9_16

  10. Gers, F.A., Schmidhuber, J., Cummins, F.: Learning to forget: continual prediction with LSTM. Neural Comput. 12(10), 2451–2471 (2000)

  11. Hex-Rays: IDA Pro disassembler and debugger (2016). https://www.hex-rays.com/products/ida/

  12. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)

  13. Jacobson, E.R., Rosenblum, N., Miller, B.P.: Labeling library functions in stripped binaries. In: Proceedings of the 10th ACM SIGPLAN-SIGSOFT Workshop on Program Analysis for Software Tools, pp. 1–8. ACM (2011)

  14. Kingma, D., Ba, J.: Adam: a method for stochastic optimization (2014). arXiv preprint arXiv:1412.6980

  15. Lageman, N., Lindsey, M., Glodek, W.: Detecting malicious android applications from runtime behavior. In: 2015 IEEE Military Communications Conference (MILCOM), pp. 324–329. IEEE (2015)

  16. LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998)

  17. Mikolov, T., Yih, W.-T., Zweig, G.: Linguistic regularities in continuous space word representations. In: HLT-NAACL, pp. 746–751 (2013)

  18. Mukkamala, S., Janoski, G., Sung, A.: Intrusion detection using neural networks and support vector machines. In: Proceedings of the 2002 International Joint Conference on Neural Networks, IJCNN 2002, vol. 2, pp. 1702–1707. IEEE (2002)

  19. Pontil, M., Verri, A.: Support vector machines for 3D object recognition. IEEE Trans. Pattern Anal. Mach. Intell. 20(6), 637–646 (1998)

  20. Rosenblum, N.E., Zhu, X., Miller, B.P., Hunt, K.: Learning to analyze binary computer code. In: AAAI, pp. 798–804 (2008)

  21. Saad, S., Traore, I., Ghorbani, A., Sayed, B., Zhao, D., Lu, W., Felix, J., Hakimian, P.: Detecting P2P botnets through network behavior analysis and machine learning. In: 2011 Ninth Annual International Conference on Privacy, Security and Trust (PST), pp. 174–180. IEEE (2011)

  22. Sainath, T.N., Vinyals, O., Senior, A., Sak, H.: Convolutional, long short-term memory, fully connected deep neural networks. In: 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 4580–4584. IEEE (2015)

  23. Shin, E.C.R., Song, D., Moazzezi, R.: Recognizing functions in binaries with neural networks. In: 24th USENIX Conference on Security Symposium (SEC). USENIX Association, Washington, DC (2015)

  24. Sinclair, C., Pierce, L., Matzner, S.: An application of machine learning to network intrusion detection. In: Proceedings of 15th Annual Computer Security Applications Conference (ACSAC 1999), pp. 371–377. IEEE (1999)

  25. Werbos, P.J.: Backpropagation through time: what it does and how to do it. Proc. IEEE 78(10), 1550–1560 (1990)

  26. Zynamics: zynamics BinDiff (2016). https://www.zynamics.com/bindiff.html


Acknowledgments

Research was sponsored by the Army Research Laboratory and was accomplished under Cooperative Agreement Number W911NF-13-2-0045 (ARL Cyber Security CRA). The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Army Research Laboratory or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation hereon.

Additionally, this material is based upon work supported by the National Science Foundation under Grant Nos. CNS-1228700 and CNS-1064900. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

Author information

Corresponding author: Correspondence to Nathaniel Lageman.


A Network Architecture

We design what is essentially an 8-layer network. The first layer is an embedding layer, which learns mappings from global vocabulary indexes into dense vectors. This layer is especially important for our first goal, the ability to recognize similar instructions that have different names: it allows the model to map instructions that appear to have similar meaning to nearby real-valued vectors.

Next, we pass the output from the embedding layer to two one-dimensional convolutional layers, each using 64 kernels with a filter size of 3. These layers allow the model to learn small groups of instructions, so that it can classify not only on the exact sequence of instructions that makes up the function representation, but also on the sequence of meaningful instruction subsequences. We then downscale the convolutional output by a factor of 2 using max pooling.

Next, we use two long short-term memory (LSTM) layers with 70 cells each. These layers are the heart of the model: they learn the temporal relationships between instructions. By using LSTM layers, we are better able to overcome the vanishing and exploding gradient problems associated with standard RNNs [2], which in turn allows the model to more easily learn long-term dependencies within the functions.

Lastly, we incorporate dropout throughout the model to help it resist overfitting. Specifically, we include 25% dropout between the two convolutional layers and 50% dropout between the final two dense layers. The model uses a sigmoid activation function for its output. We provide a diagram of the network architecture in Fig. 12.
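The layer stack described above can be sketched with the Keras API. This is an illustrative reconstruction, not the authors' code: the vocabulary size, embedding width, sequence length, hidden activations, dense-layer width, and optimizer are assumptions, since the text fixes only the layer types, the 64 kernels of filter size 3, the pooling factor of 2, the 70 LSTM cells, the dropout rates, and the sigmoid output.

```python
# Sketch of the 8-layer BinDNN-style architecture (assumed hyperparameters
# are marked below; only the layer structure follows the paper's description).
from tensorflow import keras
from tensorflow.keras import layers

VOCAB_SIZE = 10000  # assumed size of the global instruction vocabulary
EMBED_DIM = 128     # assumed embedding width
SEQ_LEN = 500       # assumed maximum function length in tokens

model = keras.Sequential([
    keras.Input(shape=(SEQ_LEN,)),
    # Layer 1: embedding maps vocabulary indexes to dense vectors
    layers.Embedding(VOCAB_SIZE, EMBED_DIM),
    # Layers 2-3: 1-D convolutions, 64 kernels of filter size 3,
    # with 25% dropout in between (hidden activations are assumed)
    layers.Conv1D(64, 3, padding="same", activation="relu"),
    layers.Dropout(0.25),
    layers.Conv1D(64, 3, padding="same", activation="relu"),
    # Downscale by a factor of 2 via max pooling
    layers.MaxPooling1D(pool_size=2),
    # Layers 4-5: LSTMs with 70 cells each; the first returns the full
    # sequence so the second LSTM still sees per-timestep inputs
    layers.LSTM(70, return_sequences=True),
    layers.LSTM(70),
    # Final dense layers with 50% dropout between them; sigmoid output
    # yields the match / no-match probability
    layers.Dense(70, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(loss="binary_crossentropy", optimizer="adam")
```

Note the `return_sequences=True` on the first LSTM: without it, the second LSTM would receive only the final hidden state rather than the instruction-by-instruction sequence.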

Fig. 12. Network architecture. We use an 8-layer deep learning model, built primarily around the LSTM layers, which develop the temporal relationships between instructions. The CNN layers vastly increase the stability of the model while also helping to prevent overfitting. The DNN layers at the end combine everything from the previous layers into a classification value stating whether the inputs were matching function representations.


Copyright information

© 2017 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering

Cite this paper

Lageman, N., Kilmer, E.D., Walls, R.J., McDaniel, P.D. (2017). BinDNN: Resilient Function Matching Using Deep Learning. In: Deng, R., Weng, J., Ren, K., Yegneswaran, V. (eds) Security and Privacy in Communication Networks. SecureComm 2016. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 198. Springer, Cham. https://doi.org/10.1007/978-3-319-59608-2_29


  • DOI: https://doi.org/10.1007/978-3-319-59608-2_29


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-59607-5

  • Online ISBN: 978-3-319-59608-2

  • eBook Packages: Computer Science (R0)
