Obfuscating LLVM Intermediate Representation Source Code with NSGA-II

Torre, Juan Carlos de la; Aragón-Jurado, José Miguel; Jareño, Javier; Varrette, Sébastien; Dorronsoro, Bernabé

doi:10.1007/978-3-031-18409-3_18

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 532)

Abstract

With the generalisation of distributed computing paradigms to sustain the surging demands for massive processing and data-analytic capabilities, protecting the intellectual property tied to programs that are transferred to and executed on these remote shared platforms becomes critical. An increasingly popular solution to this problem consists in applying obfuscation techniques, in particular at the source code level. Informally, the goal of obfuscation is to conceal the purpose of a program or its logic without altering its functionality, thus preventing reverse engineering of the program even with the help of computing resources. This protects software against plagiarism, tampering, and the discovery of vulnerabilities that could be used for different kinds of attacks. The many advantages of code obfuscation, together with its low cost, make it a popular technique. This paper proposes a novel methodology for source code obfuscation relying on the reference LLVM compiler infrastructure that can be used together with other traditional obfuscation techniques, making the code more robust against reverse engineering attacks. The problem is defined as a Multi-Objective Combinatorial Optimization (MOCO) problem, where the goal is to find sequences of LLVM optimizations that lead to highly obfuscated versions of the original code. These transformations are applied to the back-end pseudo-assembly code (i.e., LLVM Intermediate Representation), thus avoiding any further optimizations by the compiler. Three different problem flavours are defined and solved with the popular NSGA-II genetic algorithm. The promising results show the potential of the proposed technique.
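
As a rough illustration of the multi-objective formulation described in the abstract, the sketch below encodes candidate solutions as sequences of LLVM pass names and keeps only the non-dominated ones, which is the Pareto-dominance notion NSGA-II builds on. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not name the objectives, so the two objective values attached to each candidate are made-up placeholders (both maximised here), and the helper names `dominates` and `first_front` are hypothetical. The pass names are only examples of passes that exist in LLVM.

```python
from typing import List, Sequence, Tuple

# A candidate pairs a sequence of LLVM pass names with its objective values.
# ASSUMPTION: the objective values below are invented for illustration and
# are both treated as "larger is better"; the abstract does not specify them.
Candidate = Tuple[Tuple[str, ...], Tuple[float, ...]]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a is no worse than b in every objective and strictly
    better in at least one (maximisation)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def first_front(population: List[Candidate]) -> List[Candidate]:
    """Keep the candidates that no other candidate dominates."""
    return [
        p for p in population
        if not any(dominates(q[1], p[1]) for q in population)
    ]


# Hypothetical population of pass sequences with placeholder objective values.
population: List[Candidate] = [
    (("mem2reg", "loop-unroll"), (3.0, 1.0)),
    (("instcombine", "simplifycfg"), (1.0, 3.0)),
    (("gvn",), (1.0, 1.0)),
    (("loop-unroll", "loop-unroll", "gvn"), (2.0, 2.0)),
]

front = first_front(population)
for passes, objectives in front:
    print(" -> ".join(passes), objectives)
```

In this toy population, the single-pass candidate is dominated by the three-pass one, so it is dropped, while the remaining three candidates represent different trade-offs and are all kept. A full NSGA-II run would additionally rank later fronts, apply crowding distance, and evolve the sequences with crossover and mutation.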

Notes

1. Since December 2011, “LLVM” is officially no longer an acronym and simply a brand that applies to the LLVM umbrella project. For more information, see www.llvm.org.
2. Available in: github.com/jctor/Obfuscating-LLVM-Intermediate-Representation-Source-Code-with-NSGA-II—CISIS_2022.

References

1. Adelson-Velskii, G.M., Landis, E.M.: An algorithm for the organization of information. Sov. Math. Dokl. 3, 1259–1263 (1962)
2. Al-Rashed, A.A., Alsarraf, J., Alnaqi, A.A.: Exergy optimization of a novel hydrogen production plant with fuel cell, heat recovery, and MED using NSGAII genetic algorithm. Int. J. Hydrogen Energy (2022)
3. Behera, C.K., Bhaskari, D.L.: Different obfuscation techniques for code protection. Procedia Computer Science 70, 757–763 (2015). https://doi.org/10.1016/j.procs.2015.10.114
4. Benítez-Hidalgo, A., Nebro, A.J., García-Nieto, J., Oregi, I., Del Ser, J.: jMetalPy: a Python framework for multi-objective optimization with metaheuristics. Swarm Evol. Comput. 51, 100598 (2019). https://doi.org/10.1016/j.swevo.2019.100598
5. Bertholon, B., Varrette, S., Bouvry, P.: JShadObf: a Javascript obfuscator based on multi-objective optimization algorithms. In: Lopez, J., Huang, X., Sandhu, R. (eds.) NSS 2013. LNCS, vol. 7873, pp. 336–349. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-38631-2_25
6. Bertholon, B., Varrette, S., Martinez, S.: ShadObf: A C-source Obfuscator based on multi-objective Optimization Algorithms. In: 27th IEEE/ACM Intl. Parallel and Distributed Processing Symposium (IPDPS 2013), pp. 435–444 (2013)
7. Coello, C., Lamont, G.B., van Veldhuizen, D.A.: Evolutionary Algorithms for Solving Multi-Objective Problems. Springer, Cham (2007)
8. Collberg, C., Nagra, J.: Surreptitious Software: Obfuscation, Watermarking, and Tamperproofing for Software Protection, 1st edn. Addison-Wesley Professional, Boston (2009)
9. Dang, B., Gazet, A., Bachaalany, E., Josse, S.: Practical Reverse Engineering: x86, x64, ARM, Windows Kernel Reversing Tools, and Obfuscation. Wiley, Hoboken (2014)
10. Deb, K.: Multi-Objective Optimization Using Evolutionary Algorithms. Wiley, Hoboken (2009)
11. Deb, K., Pratap, A., Agarwal, S., Meyarivan, T.: A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Trans. Evol. Comput. 6(2), 182–197 (2002)
12. Dorronsoro, B., Ruiz, P., Danoy, G., Pigné, Y., Bouvry, P.: Evolutionary Algorithms for Mobile Ad Hoc Networks. Wiley/IEEE Computer Society, Nature-Inspired Computing series (2014)
13. Goldberg, D.E.: Genetic Algorithms in Search, Optimization, and Machine Learning. Addison Wesley, Boston (1989)
14. Harrison, W., Magel, K.: A complexity measure based on nesting level. SIGPLAN Notices 16(3), 63–74 (1981)
15. He, C., Ge, D., Yang, M., Yong, N., Wang, J., Yu, J.: A data-driven adaptive fault diagnosis methodology for nuclear power systems based on NSGAII-CNN. Ann. Nucl. Energy 159, 108326 (2021)
16. Hosseinzadeh, S., et al.: Diversification and obfuscation techniques for software security: a systematic literature review. Inf. Softw. Technol. 104, 72–93 (2018)
17. Huffman, D.A.: A method for the construction of minimum-redundancy codes. Proc. IRE 40(9), 1098–1101 (1952). https://doi.org/10.1109/JRPROC.1952.273898
18. Junod, P., Rinaldini, J., Wehrli, J., Michielin, J.: Obfuscator-LLVM - software protection for the masses. In: International Workshop on Software Protection, pp. 3–9. IEEE (2015)
19. Kar, M.B., Kar, S., Guo, S., Li, X., Majumder, S.: A new bi-objective fuzzy portfolio selection model and its solution through evolutionary algorithms. Soft. Comput. 23(12), 4367–4381 (2018). https://doi.org/10.1007/s00500-018-3094-0
20. Kim, J.I., Lee, E.J.: A technique to apply inlining for code obfuscation based on genetic algorithm. J. Inf. Technol. Serv. 10(3), 167–177 (2011)
21. Linn, C., Debray, S.: Obfuscation of executable code to improve resistance to static disassembly. In: 10th ACM Conference on Computer and Communications Security, pp. 290–299. ACM (2003)
22. LLVM: The LLVM Compiler Infrastructure. https://llvm.org/
23. McCabe, T.: A complexity measure. IEEE Trans. Softw. Eng. SE-2(4), 308–320 (1976)
24. Mohsen, R.: Quantitative measures for code obfuscation security. Ph.D. thesis, ICL (2016)
25. Petke, J.: Genetic improvement for code obfuscation. In: Genetic and Evolutionary Computation Conference Companion, pp. 1135–1136. ACM (2016)
26. Santiago, A., Dorronsoro, B., Fraire, H.J., Ruiz, P.: Micro-genetic algorithm with fuzzy selection of operators for multi-objective optimization: $\mu$FAME. Swarm Evol. Comput. 61, 100818 (2021)
27. de la Torre, J.C., Ruiz, P., Dorronsoro, B., Galindo, P.L.: Analyzing the influence of LLVM code optimization passes on software performance. In: Medina, J., Ojeda-Aciego, M., Verdegay, J.L., Perfilieva, I., Bouchon-Meunier, B., Yager, R.R. (eds.) IPMU 2018. CCIS, vol. 855, pp. 272–283. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-91479-4_23
28. Varrette, S., Cartiaux, H., Peter, S., Kieffer, E., Valette, T., Olloh, A.: Management of an academic HPC & research computing facility: the ULHPC experience 2.0. In: Proceedings of the 6th ACM HPC and Cluster Technologies Conference (HPCCT 2022) (2022)

Acknowledgement

This work was supported by Junta de Andalucía and ERDF under contract P18-2399 (GENIUS), the Ministerio de Ciencia, Innovación y Universidades and the ERDF (iSUN – RTI2018-100754-B-I00), and ERDF (OPTIMALE – FEDER-UCA18-108393). J.C. de la Torre acknowledges the Ministerio de Ciencia, Innovación y Universidades for the support through FPU grant (FPU17/00563). B. Dorronsoro acknowledges “ayuda de recualificación” funding by Ministerio de Universidades and the European Union-Next GenerationEU. The experiments presented in this paper were carried out using the HPC facilities of the University of Luxembourg [28] (see hpc.uni.lu).

Author information

Authors and Affiliations

Superior School of Engineering, University of Cádiz, Cádiz, Spain
Juan Carlos de la Torre, José Miguel Aragón-Jurado, Javier Jareño & Bernabé Dorronsoro
Faculty of Science, Technology and Medicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
Sébastien Varrette
School of Computer Science, The University of Sydney, Sydney, Australia
Bernabé Dorronsoro

Corresponding author

Correspondence to Bernabé Dorronsoro.

Editor information

Editors and Affiliations

Faculty of Engineering, University of Deusto, Bilbao, Spain
Pablo García Bringas
University of León, León, Spain
Hilde Pérez García
University of La Rioja, Logroño, Spain
Francisco Javier Martínez de Pisón
University of Oviedo, Oviedo, Spain
José Ramón Villar Flecha
Data Science and Big Data Lab, Pablo de Olavide University, Sevilla, Spain
Alicia Troncoso Lora
University of Oviedo, Oviedo, Spain
Enrique A. de la Cal
Departamento de Ingeniería Informática, Escuela Politécnica Superior, University of Burgos, Burgos, Spain
Álvaro Herrero
Pablo de Olavide University, Seville, Spain
Francisco Martínez Álvarez
University of Bergamo, Bergamo, Italy
Giuseppe Psaila
Department of Industrial Engineering, University of A Coruña, Ferrol, Spain
Héctor Quintián
University of Salamanca, Salamanca, Spain
Emilio Corchado

About this paper

Cite this paper

Torre, J.C.d.l., Aragón-Jurado, J.M., Jareño, J., Varrette, S., Dorronsoro, B. (2023). Obfuscating LLVM Intermediate Representation Source Code with NSGA-II. In: García Bringas, P., et al. International Joint Conference 15th International Conference on Computational Intelligence in Security for Information Systems (CISIS 2022) 13th International Conference on EUropean Transnational Education (ICEUTE 2022). CISIS ICEUTE 2022. Lecture Notes in Networks and Systems, vol 532. Springer, Cham. https://doi.org/10.1007/978-3-031-18409-3_18

DOI: https://doi.org/10.1007/978-3-031-18409-3_18
Published: 05 November 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-18408-6
Online ISBN: 978-3-031-18409-3
eBook Packages: Intelligent Technologies and Robotics (R0)