Abstract

With the generalisation of distributed computing paradigms to sustain the surging demands for massive processing and data-analytic capabilities, protecting the intellectual property tied to the programs transferred onto and executed on these remote shared platforms becomes critical. An increasingly popular solution to this problem consists of applying obfuscation techniques, in particular at the source code level. Informally, the goal of obfuscation is to conceal the purpose of a program or its logic without altering its functionality, thus preventing reverse engineering of the program, even with the help of computing resources. This protects software against plagiarism, tampering, and the discovery of vulnerabilities that could be used for different kinds of attacks. The many advantages of code obfuscation, together with its low cost, make it a popular technique. This paper proposes a novel methodology for source code obfuscation relying on the reference LLVM compiler infrastructure. It can be used together with other traditional obfuscation techniques, making the code more robust against reverse engineering attacks. The problem is defined as a Multi-Objective Combinatorial Optimization (MOCO) problem, where the goal is to find sequences of LLVM optimizations that lead to highly obfuscated versions of the original code. These transformations are applied to the back-end pseudo-assembly code (i.e., LLVM Intermediate Representation), thus avoiding any further optimizations by the compiler. Three different problem flavours are defined and solved with the popular NSGA-II genetic algorithm. The promising results show the potential of the proposed technique.
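
As an illustration of the formulation described in the abstract, the following minimal Python sketch shows one way a candidate solution could be encoded as a sequence of LLVM pass names, and how the non-dominated (Pareto-optimal) candidates can be extracted, which is the ranking idea at the core of NSGA-II. The pass list, the `toy_objectives` function, and all helper names are assumptions made for this example. They are not the objectives or the implementation used in the paper, where a real evaluation would apply the passes to the LLVM IR and measure the result.

```python
import random
from typing import List, Sequence, Tuple

# Illustrative subset of LLVM pass names (assumption: the paper's pass set is not listed here).
PASSES = ["mem2reg", "instcombine", "simplifycfg", "gvn", "licm", "loop-unroll"]


def random_sequence(max_len: int, rng: random.Random) -> List[int]:
    """A candidate solution: indices into PASSES, repeats allowed, length 1..max_len."""
    length = rng.randint(1, max_len)
    return [rng.randrange(len(PASSES)) for _ in range(length)]


def decode(seq: Sequence[int]) -> List[str]:
    """Translate an index sequence into the pass names it stands for."""
    return [PASSES[i] for i in seq]


def toy_objectives(seq: Sequence[int]) -> Tuple[float, float]:
    """Hypothetical stand-in for real measurements; both values are minimized.

    f1: negated number of distinct passes (a made-up proxy for code change).
    f2: sequence length (a made-up proxy for transformation cost).
    """
    return (-float(len(set(seq))), float(len(seq)))


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Pareto dominance for minimization: a is no worse everywhere and better somewhere."""
    pairs = list(zip(a, b))
    return all(x <= y for x, y in pairs) and any(x < y for x, y in pairs)


def first_front(points: Sequence[Sequence[float]]) -> List[int]:
    """Indices of the points that no other point dominates."""
    return [
        i
        for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    population = [random_sequence(6, rng) for _ in range(8)]
    scores = [toy_objectives(s) for s in population]
    for i in first_front(scores):
        print(decode(population[i]), scores[i])
```

A full NSGA-II run would repeat this ranking over successive fronts, add crowding-distance selection, and apply crossover and mutation to the sequences; those steps are omitted here to keep the sketch short.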

Notes

  1. Since December 2011, “LLVM” is officially no longer an acronym and simply a brand that applies to the LLVM umbrella project. For more information, see www.llvm.org.

  2. Available at: github.com/jctor/Obfuscating-LLVM-Intermediate-Representation-Source-Code-with-NSGA-II—CISIS_2022.

References

  1. Adelson-Velskii, G.M., Landis, E.M.: An algorithm for the organization of information. Sov. Math. Dokl. 3, 1259–1263 (1962)

  2. Al-Rashed, A.A., Alsarraf, J., Alnaqi, A.A.: Exergy optimization of a novel hydrogen production plant with fuel cell, heat recovery, and MED using NSGAII genetic algorithm. Int. J. Hydrogen Energy (2022)

  3. Behera, C.K., Bhaskari, D.L.: Different obfuscation techniques for code protection. Procedia Computer Science 70, 757–763 (2015). https://doi.org/10.1016/j.procs.2015.10.114

  4. Benítez-Hidalgo, A., Nebro, A.J., García-Nieto, J., Oregi, I., Del Ser, J.: jMetalPy: a Python framework for multi-objective optimization with metaheuristics. Swarm Evol. Comput. 51, 100598 (2019). https://doi.org/10.1016/j.swevo.2019.100598

  5. Bertholon, B., Varrette, S., Bouvry, P.: JShadObf: a Javascript obfuscator based on multi-objective optimization algorithms. In: Lopez, J., Huang, X., Sandhu, R. (eds.) NSS 2013. LNCS, vol. 7873, pp. 336–349. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-38631-2_25

  6. Bertholon, B., Varrette, S., Martinez, S.: ShadObf: A C-source Obfuscator based on multi-objective Optimization Algorithms. In: 27th IEEE/ACM Intl. Parallel and Distributed Processing Symposium (IPDPS 2013), pp. 435–444 (2013)

  7. Coello, C., Lamont, G.B., van Veldhuizen, D.A.: Evolutionary Algorithms for Solving Multi-Objective Problems. Springer, Cham (2007)

  8. Collberg, C., Nagra, J.: Surreptitious Software: Obfuscation, Watermarking, and Tamperproofing for Software Protection, 1st edn. Addison-Wesley Professional, Boston (2009)

  9. Dang, B., Gazet, A., Bachaalany, E., Josse, S.: Practical Reverse Engineering: x86, x64, ARM, Windows Kernel Reversing Tools, and Obfuscation. Wiley, Hoboken (2014)

  10. Deb, K.: Multi-Objective Optimization Using Evolutionary Algorithms. Wiley, Hoboken (2009)

  11. Deb, K., Pratap, A., Agarwal, S., Meyarivan, T.: A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Trans. Evol. Comput. 6(2), 182–197 (2002)

  12. Dorronsoro, B., Ruiz, P., Danoy, G., Pigné, Y., Bouvry, P.: Evolutionary Algorithms for Mobile Ad Hoc Networks. Wiley/IEEE Computer Society, Nature-Inspired Computing series (2014)

  13. Goldberg, D.E.: Genetic Algorithms in Search, Optimization, and Machine Learning. Addison Wesley, Boston (1989)

  14. Harrison, W., Magel, K.: A complexity measure based on nesting level. SIGPLAN Notices 16(3), 63–74 (1981)

  15. He, C., Ge, D., Yang, M., Yong, N., Wang, J., Yu, J.: A data-driven adaptive fault diagnosis methodology for nuclear power systems based on NSGAII-CNN. Ann. Nucl. Energy 159, 108326 (2021)

  16. Hosseinzadeh, S., et al.: Diversification and obfuscation techniques for software security: a systematic literature review. Inf. Softw. Technol. 104, 72–93 (2018)

  17. Huffman, D.A.: A method for the construction of minimum-redundancy codes. Proc. IRE 40(9), 1098–1101 (1952). https://doi.org/10.1109/JRPROC.1952.273898

  18. Junod, P., Rinaldini, J., Wehrli, J., Michielin, J.: Obfuscator-LLVM - software protection for the masses. In: International Workshop on Software Protection, pp. 3–9. IEEE (2015)

  19. Kar, M.B., Kar, S., Guo, S., Li, X., Majumder, S.: A new bi-objective fuzzy portfolio selection model and its solution through evolutionary algorithms. Soft. Comput. 23(12), 4367–4381 (2018). https://doi.org/10.1007/s00500-018-3094-0

  20. Kim, J.I., Lee, E.J.: A technique to apply inlining for code obfuscation based on genetic algorithm. J. Inf. Technol. Serv. 10(3), 167–177 (2011)

  21. Linn, C., Debray, S.: Obfuscation of executable code to improve resistance to static disassembly. In: 10th ACM Conference on Computer and Communications Security, pp. 290–299. ACM (2003)

  22. LLVM: The LLVM Compiler Infrastructure. https://llvm.org/

  23. McCabe, T.: A complexity measure. IEEE Trans. Softw. Eng. SE-2(4), 308–320 (1976)

  24. Mohsen, R.: Quantitative measures for code obfuscation security. Ph.D. thesis, ICL (2016)

  25. Petke, J.: Genetic improvement for code obfuscation. In: Genetic and Evolutionary Computation Conference Companion, pp. 1135–1136. ACM (2016)

  26. Santiago, A., Dorronsoro, B., Fraire, H.J., Ruiz, P.: Micro-genetic algorithm with fuzzy selection of operators for multi-objective optimization: μFAME. Swarm Evol. Comput. 61, 100818 (2021)

  27. de la Torre, J.C., Ruiz, P., Dorronsoro, B., Galindo, P.L.: Analyzing the influence of LLVM code optimization passes on software performance. In: Medina, J., Ojeda-Aciego, M., Verdegay, J.L., Perfilieva, I., Bouchon-Meunier, B., Yager, R.R. (eds.) IPMU 2018. CCIS, vol. 855, pp. 272–283. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-91479-4_23

  28. Varrette, S., Cartiaux, H., Peter, S., Kieffer, E., Valette, T., Olloh, A.: Management of an academic HPC & research computing facility: the ULHPC experience 2.0. In: Proceedings of the 6th ACM HPC and Cluster Technologies Conference (HPCCT 2022) (2022)


Acknowledgement

This work was supported by Junta de Andalucía and ERDF under contract P18-2399 (GENIUS), the Ministerio de Ciencia, Innovación y Universidades and the ERDF (iSUN – RTI2018-100754-B-I00), and ERDF (OPTIMALE – FEDER-UCA18-108393). J.C. de la Torre acknowledges the Ministerio de Ciencia, Innovación y Universidades for its support through an FPU grant (FPU17/00563). B. Dorronsoro acknowledges “ayuda de recualificación” funding from the Ministerio de Universidades and the European Union-NextGenerationEU. The experiments presented in this paper were carried out using the HPC facilities of the University of Luxembourg [28] (hpc.uni.lu).

Author information

Corresponding author

Correspondence to Bernabé Dorronsoro.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Torre, J.C.d.l., Aragón-Jurado, J.M., Jareño, J., Varrette, S., Dorronsoro, B. (2023). Obfuscating LLVM Intermediate Representation Source Code with NSGA-II. In: García Bringas, P., et al. International Joint Conference 15th International Conference on Computational Intelligence in Security for Information Systems (CISIS 2022) 13th International Conference on EUropean Transnational Education (ICEUTE 2022). CISIS ICEUTE 2022 2022. Lecture Notes in Networks and Systems, vol 532. Springer, Cham. https://doi.org/10.1007/978-3-031-18409-3_18
