Abstract
Static analysis tools need information about the ISA and about the boundaries of the code and data sections of the binary they analyze. This information is often not readily available for embedded systems firmware, which is frequently distributed in a non-standard format or as a raw memory dump. This paper proposes a novel methodology for ISA identification and code/data separation that extends and improves the state of the art. We identify the main shortcoming of state-of-the-art approaches and add the capability to classify the architecture of packed binaries by means of an entropy-based method. Then, we implement an LSTM-based model with heuristics to recognize the section boundaries inside a binary, showing that it outperforms state-of-the-art methods. Finally, we evaluate our approach on a dataset of binaries extracted from real-world firmware.
References
Cova, M., Felmetsger, V., Banks, G., Vigna, G.: Static detection of vulnerabilities in x86 executables. In: Proceedings 22nd Annual Computer Security Applications Conference, ACSAC, pp. 269–278. IEEE (2006)
Song, D., et al.: BitBlaze: a new approach to computer security via binary analysis. In: Sekar, R., Pujari, A.K. (eds.) ICISS 2008. LNCS, vol. 5352, pp. 1–25. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-89862-7_1
Brumley, D., Jager, I., Avgerinos, T., Schwartz, E.J.: BAP: a binary analysis platform. In: Gopalakrishnan, G., Qadeer, S. (eds.) CAV 2011. LNCS, vol. 6806, pp. 463–469. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-22110-1_37
Shoshitaishvili, Y., et al.: SoK: (state of) the art of war: offensive techniques in binary analysis. In: Proceedings of 2016 IEEE Symposium on Security and Privacy, SP, pp. 138–157 (2016)
Shoshitaishvili, Y., Wang, R., Hauser, C., Kruegel, C., Vigna, G.: Firmalice-automatic detection of authentication bypass vulnerabilities in binary firmware. In: Proceedings of 2015 Network and Distributed System Security Symposium, NDSS (2015)
Haller, I., Slowinska, A., Neugschwandtner, M., Bos, H.: Dowsing for overflows: a guided fuzzer to find buffer boundary violations. In: Proceedings 22nd USENIX Security Symposium, USENIX Security 2013, pp. 49–64 (2013)
Corina, J., et al.: Difuze: interface aware fuzzing for kernel drivers. In: Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, CCS 2017, pp. 2123–2138 (2017)
Stephens, N., et al.: Driller: augmenting fuzzing through selective symbolic execution. In: Proceedings of 2016 Network and Distributed System Security Symposium, NDSS, vol. 16, pp. 1–16 (2016)
Wartell, R., Zhou, Y., Hamlen, K.W., Kantarcioglu, M., Thuraisingham, B.: Differentiating code from data in x86 binaries. In: Gunopulos, D., Hofmann, T., Malerba, D., Vazirgiannis, M. (eds.) ECML PKDD 2011. LNCS (LNAI), vol. 6913, pp. 522–536. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-23808-6_34
De Nicolao, P., Pogliani, M., Polino, M., Carminati, M., Quarta, D., Zanero, S.: ELISA: ELiciting ISA of raw binaries for fine-grained code and data separation. In: Giuffrida, C., Bardin, S., Blanc, G. (eds.) DIMVA 2018. LNCS, vol. 10885, pp. 351–371. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-93411-2_16
McDaniel, M., Heydari, M.H.: Content based file type detection algorithms. In: Proceedings of 36th Annual Hawaii International Conference on System Sciences (2003)
Li, W.-J., Wang, K., Stolfo, S.J., Herzog, B.: Fileprints: identifying file types by n-gram analysis. In: Proceedings of the 6th Annual IEEE SMC Information Assurance Workshop, IAW 2005, pp. 64–71. IEEE (2005)
Sportiello, L., Zanero, S.: Context-based file block classification. In: Peterson, G., Shenoi, S. (eds.) DigitalForensics 2012. IAICT, vol. 383, pp. 67–82. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-33962-2_5
Penrose, P., Macfarlane, R., Buchanan, W.J.: Approaches to the classification of high entropy file fragments. Digit. Investig. 10(4), 372–384 (2013)
Clemens, J.: Automatic classification of object code using machine learning. Digit. Invest. 14, S156–S162 (2015)
Lyda, R., Hamrock, J.: Using entropy analysis to find encrypted and packed malware. IEEE Secur. Priv. 5, 40–45 (2007)
Granboulan, L.: cpu_rec: recognize CPU instructions in an arbitrary binary file (2017). https://github.com/airbus-seclab/cpu_rec
ReFirmLabs: Binwalk: firmware analysis tool. https://github.com/ReFirmLabs/binwalk
Kairajärvi, S., Costin, A., Hämäläinen, T.: Isadetect: usable automated detection of CPU architecture and endianness for executable binary files and object code. In: Proceedings of the Tenth ACM Conference on Data and Application Security and Privacy, CODASPY 2020, pp. 376–380. Association for Computing Machinery, New York (2020)
Andriesse, D., Chen, X., Van Der Veen, V., Slowinska, A., Bos, H.: An in-depth analysis of disassembly on full-scale x86/x64 binaries. In: Proceedings of 25th USENIX Security Symposium, USENIX Security 2016, pp. 583–600 (2016)
Linn, C., Debray, S.: Obfuscation of executable code to improve resistance to static disassembly. In: Proceedings of 10th ACM Conference on Computer and Communications Security, CCS 2003, pp. 290–299. ACM (2003)
Kruegel, C., Robertson, W., Valeur, F., Vigna, G.: Static disassembly of obfuscated binaries. In: Proceedings of 13th USENIX Security Symposium (2004)
Chen, J.-Y., Shen, B.-Y., Ou, Q.-H., Yang, W., Hsu, W.-C.: Effective code discovery for ARM/Thumb mixed ISA binaries in a static binary translator. In: Proceedings of 2013 International Conference on Compilers, Architectures and Synthesis for Embedded Systems, CASES 2013, pp. 1–10 (2013)
Karampatziakis, N.: Static analysis of binary executables using structural SVMs. In: Lafferty, J.D., Williams, C.K.I., Shawe-Taylor, J., Zemel, R.S., Culotta, A. (eds.) Advances in Neural Information Processing Systems, vol. 23, pp. 1063–1071. Curran Associates Inc. (2010)
Rosenblum, N., Zhu, X., Miller, B., Hunt, K.: Learning to analyze binary computer code. In: Proceedings of 23th AAAI Conference on Artificial Intelligence, AAAI 2008, pp. 798–804. AAAI Press (2008)
Lafferty, J.D., McCallum, A., Pereira, F.C.N.: Conditional random fields: probabilistic models for segmenting and labeling sequence data. In: Proceedings of 18th International Conference on Machine Learning, ICML 2001, pp. 282–289. Morgan Kaufmann Publishers Inc. (2001)
Bao, T., Burket, J., Woo, M., Turner, R., Brumley, D.: ByteWeight: learning to recognize functions in binary code. In: Proceedings of 23rd USENIX Security Symposium, pp. 845–860 (2014)
Shin, E.C.R., Song, D., Moazzezi, R.: Recognizing functions in binaries with neural networks. In: Proceedings of 24th USENIX Security Symposium, pp. 611–626 (2015)
Shannon, C.E.: A mathematical theory of communication. Bell Syst. Tech. J. 27(3), 379–423 (1948)
Huffman, D.A.: A method for the construction of minimum-redundancy codes. Proc. IRE 40(9), 1098–1101 (1952)
Shin, E.C.R., Song, D., Moazzezi, R.: Recognizing functions in binaries with neural networks. In: Proceedings of 24th USENIX Security Symposium, USENIX Security 2015, Washington, D.C., pp. 611–626. USENIX Association (2015)
Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Bao, T., Burket, J., Woo, M., Turner, R., Brumley, D.: ByteWeight: learning to recognize functions in binary code. In: Proceedings of 23rd USENIX Security Symposium, USENIX Security 2014, San Diego, CA, pp. 845–860. USENIX Association (2014)
Mkhatvari, N.: Towards big scale firmware analysis (2018)
Chen, D., Egele, M., Woo, M., Brumley, D.: Towards automated dynamic analysis for Linux-based embedded firmware. In: Proceedings of 2016 Network and Distributed System Security Symposium, NDSS (2016)
A Hyperparameter Tuning
In this appendix, we describe how we found the optimal values for some of the hyperparameters used in our approach.
A.1 Architecture Classifier
In our approach, we have to set two parameters to extract the uncompressed blocks from the binary. The first one is the size of the blocks on which we compute the entropy. We tried different block sizes (128, 256, 512, 1024, 2048, 3072) and found that the best value is 256. From 256 to 1024 the performance of the model remains the same, while from 2048 onward the performance of the classifier tends to decrease. We chose a size of 256 because we want to extract compressed blocks in a more fine-grained way. This result is consistent with the value used by Lyda et al. [16]. The second parameter is the entropy threshold used to classify a block as compressed or not. To find its optimal value, we examine the entropy of the blocks inside the binaries. First, we compute the maximum entropy over the code blocks of each binary. Then, for each architecture, we average these per-binary maxima. In Table 7 we report the mean maximum entropy for the different architectures. The entropy value is almost homogeneous across architectures, which gives us an estimate of the entropy of the blocks we want to extract. Starting from an entropy threshold of 6, we test higher values and find that the optimal value is 6.3. This value is also consistent with the confidence interval computed by Lyda et al. [16].
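For illustration, the sketch below (a minimal example with hypothetical helper names, not the authors' implementation) shows how the two parameters interact: the binary is split into 256-byte blocks, each block's Shannon entropy is computed, and blocks above the 6.3 bits-per-byte threshold are discarded as likely compressed or packed before architecture classification.

```python
# Minimal sketch of the entropy-based block filter described in A.1.
# BLOCK_SIZE and ENTROPY_THRESHOLD are the values selected in this appendix;
# the helper names are illustrative, not the authors' code.
import math
import os

BLOCK_SIZE = 256          # block size on which entropy is computed
ENTROPY_THRESHOLD = 6.3   # bits per byte; blocks above this are treated as compressed

def shannon_entropy(block: bytes) -> float:
    """Shannon entropy of a byte block, in bits per byte (range 0 to 8)."""
    if not block:
        return 0.0
    counts = [0] * 256
    for b in block:
        counts[b] += 1
    total = len(block)
    return -sum((c / total) * math.log2(c / total) for c in counts if c)

def uncompressed_blocks(data: bytes):
    """Yield only the blocks whose entropy stays at or below the threshold."""
    for i in range(0, len(data), BLOCK_SIZE):
        block = data[i:i + BLOCK_SIZE]
        if shannon_entropy(block) <= ENTROPY_THRESHOLD:
            yield block

# Example: low-entropy padding is kept, pseudo-random (compressed-like) data is discarded.
print(len(list(uncompressed_blocks(b"\x00" * 1024))))    # 4 blocks kept
print(len(list(uncompressed_blocks(os.urandom(1024)))))  # most likely 0 blocks kept
```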
A.2 Section Identification
Bidirectional LSTM Hyperparameters. The first two hyperparameters are the dimension of the input and the dimension of the output vector of the LSTM cell. To tune them, we perform a grid search over different values, taking as a starting point the values used by Shin et al. for their model [28]. We define a vector of candidate values for the input dimension ([500, 1000, 2000]) and for the LSTM output dimension ([8, 16, 24, 32, 40]). The number of models to evaluate is the Cartesian product of the two vectors, since we check every combination. Each model is evaluated, with the new metrics, on a validation set of 120 binaries taken from the Firmware dataset. Since we use custom metrics, we cannot rely on existing libraries to decide which hyperparameter value is better than another; instead, we evaluate each model on the dataset and inspect the results by hand. The results show that performance tends to decrease for large input dimensions, so we train and test the model again with smaller input dimensions ([25, 50, 75, 100]). With these values, we see a great improvement in the performance of the model. Each model seems to have different best values for these hyperparameters; however, these values fall within a limited range that can be explored easily. Table 8 shows the hyperparameter values used. We do not perform a grid search for the batch size: we take the same value used by Shin et al. [28]. The batch size can affect the time the model takes to converge to an optimal solution, but it should not have a great impact on performance compared to the previous hyperparameters. The number of training epochs is set to 5: we run a training phase and observe that after 5 epochs the loss is low and remains almost constant in the subsequent epochs.
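To make the two tuned hyperparameters concrete, the following is a minimal sketch, assuming a Keras-style setup, of a bidirectional LSTM that labels each byte of a fixed-length chunk as code or data. It is not the authors' implementation: the byte embedding, the specific layer sizes, and the binary code/data labeling are illustrative assumptions; the chunk length and LSTM output size correspond to the two hyperparameters explored by the grid search above.

```python
# Illustrative per-byte code/data classifier (not the authors' exact model).
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

CHUNK_LEN = 50     # input dimension: bytes per sequence (tuned over [25, 50, 75, 100])
LSTM_UNITS = 16    # LSTM output dimension (tuned over [8, 16, 24, 32, 40])

def build_model(chunk_len: int = CHUNK_LEN, lstm_units: int = LSTM_UNITS):
    inputs = layers.Input(shape=(chunk_len,), dtype="int32")
    # Embed each raw byte value (0-255) into a small dense vector.
    x = layers.Embedding(input_dim=256, output_dim=16)(inputs)
    # The bidirectional LSTM returns one hidden state per byte position.
    x = layers.Bidirectional(layers.LSTM(lstm_units, return_sequences=True))(x)
    # Per-byte binary prediction: code (1) vs. data (0).
    outputs = layers.TimeDistributed(layers.Dense(1, activation="sigmoid"))(x)
    model = models.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model

if __name__ == "__main__":
    model = build_model()
    # Dummy example: one chunk of random bytes labeled entirely as data.
    chunk = np.random.randint(0, 256, size=(1, CHUNK_LEN))
    labels = np.zeros((1, CHUNK_LEN, 1))
    model.fit(chunk, labels, epochs=1, batch_size=1, verbose=0)
```

In a grid search, `build_model` would simply be called once per (chunk_len, lstm_units) pair and each resulting model evaluated on the validation set.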
Postprocessing Parameters. For the postprocessing phase, we have to tune the parameters of the ELISA and ByteWeight approaches. The postprocessing phase of ELISA takes two parameters as input: the minimum number of sections that the prediction vector must contain, and the maximum size of a chunk that can be eliminated, expressed as a percentage of the size of the biggest chunk. These two parameters were already optimized by the authors of ELISA, so we use the same values: 4 as the minimum number of sections and 0.1 for the chunk size. The postprocessing phase that we implemented takes two parameters as input, which define the range around the beginning of a section in which we search for a function prologue: a positive offset and a negative offset from the section boundary. To find the optimal value for the positive offset, we used the ByteWeight dataset to estimate the positions of function prologues. We discovered that in some binaries, if the section does not begin with a function, the first function is found at an offset of 4 bytes. To avoid wrongly adjusting the start of a code section, we set the positive offset parameter to 3. For the negative offset parameter, we manually tested several values; beyond a certain point the results do not change, and only for large offsets (e.g., 500 bytes) does performance decrease. We do not expect to find functions in a data section, but the ByteWeight model can wrongly predict a function prologue there. After some tests, we set this parameter to 40 so that a function wrongly predicted inside the data section does not pull the section boundary.
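The sketch below is a hypothetical illustration of this boundary-adjustment heuristic (the function name and data layout are our own assumptions, not the authors' code): around each predicted section start, we look for a detected function prologue within the [-40, +3] byte window and, if one is found, snap the boundary to it.

```python
# Illustrative boundary-adjustment heuristic using the offsets chosen in A.2.
POS_OFFSET = 3    # bytes after the predicted boundary
NEG_OFFSET = 40   # bytes before the predicted boundary

def adjust_boundary(boundary, prologue_offsets):
    """Return the adjusted section start, given the byte offsets of function
    prologues detected by a ByteWeight-style model."""
    candidates = [off for off in prologue_offsets
                  if boundary - NEG_OFFSET <= off <= boundary + POS_OFFSET]
    # Snap to the prologue closest to the original prediction, if any.
    return min(candidates, key=lambda off: abs(off - boundary)) if candidates else boundary

# Example: a boundary predicted at 0x1004 with a prologue detected at 0x1000.
print(hex(adjust_boundary(0x1004, [0x0800, 0x1000, 0x2000])))  # prints 0x1000
```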