Abstract—
Mixed Boolean-arithmetic (MBA) expressions in t integer n-bit variables are often used for program obfuscation. Obfuscation replaces short expressions with longer equivalent ones that are intended to take an analyst more time to understand. This paper shows that a technique similar to information set decoding of linear codes can be applied to simplify linear MBA expressions, that is, to reduce their number of terms. Based on this technique, two algorithms for simplifying linear MBA expressions are constructed: one that finds an expression of minimum length and one that reduces the length of an expression. The length reduction algorithm in turn yields an algorithm for estimating the resistance of an MBA expression to simplification. We experimentally estimate how the average number of terms in the linear MBA expression returned by the simplification algorithms depends on n, on the number of decoding iterations, and on the cardinality of the set of Boolean functions over which a linear combination with the minimum number of nonzero coefficients is sought. For all considered t and n, the experiments show that if the linear MBA expression contained r = 1, 2, 3 terms before obfuscation, then, with probability close to one, the developed simplification algorithms find from its obfuscated version an equivalent expression with no more than r terms. This is the main difference between the information set decoding technique and the known techniques for simplifying linear MBA expressions, whose goal is to reduce the number of terms to no more than 2^t. We also found that, for randomly generated linear MBA expressions, the average number of terms in the returned expression tends to 2^t as n increases and does not differ from the average number of terms in the linear expression returned by known simplification algorithms.
In particular, the results make it possible to determine the values of t and n for which the average number of terms in the simplified linear MBA expression is at least a given value.
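
The idea of searching for a representation with few nonzero coefficients can be illustrated with a minimal sketch. It is not the algorithm of the paper: the candidate set `CANDIDATES`, the word size `N_BITS`, and the helper names `signature`, `solve_mod`, and `simplify` are assumptions made for illustration, t is fixed to 2, and the 2^t-element subsets of candidate functions are enumerated exhaustively rather than sampled at random over a number of decoding iterations. The sketch relies on the standard fact that a linear MBA identity holds for all n-bit inputs if it holds on inputs whose bits are all equal (0 or -1).

```python
from itertools import combinations, product

N_BITS = 8                      # assumed word size n
MASK = (1 << N_BITS) - 1        # arithmetic is done modulo 2^n

# Hypothetical candidate set of bitwise functions in t = 2 variables.
CANDIDATES = [
    ("x", lambda x, y: x),
    ("y", lambda x, y: y),
    ("x&y", lambda x, y: x & y),
    ("x|y", lambda x, y: x | y),
    ("x^y", lambda x, y: x ^ y),
    ("~(x|y)", lambda x, y: ~(x | y)),
]

# The 2^t inputs whose bits are all equal: each variable is 0 or -1.
POINTS = list(product((0, -1), repeat=2))


def signature(f):
    """Values of f on POINTS, reduced modulo 2^n."""
    return [f(x, y) & MASK for x, y in POINTS]


def solve_mod(cols, rhs):
    """Solve sum_j c_j * cols[j] == rhs (mod 2^n) for a square system.

    Returns None when the chosen columns are not invertible modulo 2^n,
    i.e. when no odd pivot can be found.
    """
    k = len(rhs)
    rows = [[cols[j][i] for j in range(k)] + [rhs[i]] for i in range(k)]
    for c in range(k):
        piv = next((r for r in range(c, k) if rows[r][c] & 1), None)
        if piv is None:
            return None
        rows[c], rows[piv] = rows[piv], rows[c]
        inv = pow(rows[c][c], -1, MASK + 1)   # odd values are units mod 2^n
        rows[c] = [(v * inv) & MASK for v in rows[c]]
        for r in range(k):
            if r != c and rows[r][c]:
                f = rows[r][c]
                rows[r] = [(a - f * b) & MASK for a, b in zip(rows[r], rows[c])]
    return [rows[i][k] for i in range(k)]


def simplify(target):
    """Return the sparsest list of (coefficient mod 2^n, name) found."""
    sig = signature(target)
    sigs = [signature(f) for _, f in CANDIDATES]
    best = None
    # Each invertible choice of 2^t candidates plays the role of an information set.
    for idx in combinations(range(len(CANDIDATES)), len(POINTS)):
        coeffs = solve_mod([sigs[i] for i in idx], sig)
        if coeffs is None:
            continue
        terms = [(c, CANDIDATES[i][0]) for c, i in zip(coeffs, idx) if c]
        if best is None or len(terms) < len(best):
            best = terms
    return best


# An obfuscated form of x + y, chosen here only as an example.
print(simplify(lambda x, y: 2 * (x | y) - (x ^ y)))
```

For this example the search returns the two-term expression x + y. Coefficients are reported as residues modulo 2^n, so a value of 255 with n = 8 would stand for -1.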

Funding
This work was supported by ongoing institutional funding. No additional grants to carry out or direct this particular research were obtained.
Ethics declarations
The author of this work declares that he has no conflicts of interest.
Additional information
Translated by V. Tereshchenko
AI tools may have been used in the translation or editing of this article.
About this article
Cite this article
Kosolapov, Y.V. On Simplifying Mixed Boolean-Arithmetic Expressions. Aut. Control Comp. Sci. 58, 836–852 (2024). https://doi.org/10.3103/S0146411624700299
DOI: https://doi.org/10.3103/S0146411624700299