Metamorphic code generation from LLVM bytecode

  • Original Paper
Journal of Computer Virology and Hacking Techniques

Abstract

Metamorphic software changes its internal structure across generations with its functionality remaining unchanged. Metamorphism has been employed by malware writers as a means of evading signature detection and other advanced detection strategies. However, code morphing also has potential security benefits, since it can serve to increase the “genetic diversity” of software. We have created a metamorphic code generator within the LLVM compiler framework. LLVM is a three-phase compiler that supports multiple source languages and target architectures. It uses a common intermediate representation (IR) bytecode in its optimizer. Consequently, any supported high-level programming language is transformed to this IR bytecode as part of the LLVM compilation process. Our metamorphic generator functions at the IR bytecode level, which provides many advantages over morphing at the assembly or source code level. The morphing techniques that we employ include dead code insertion and transposition, where the dead code is actually executed within the morphed code, making its detection and removal more challenging. We have verified the effectiveness of our code morphing using hidden Markov model analysis.
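
The two morphing techniques named in the abstract, dead code insertion and transposition, can be illustrated with a minimal sketch. This is not the authors' generator and it does not use LLVM: the three-address instruction format, the register names, and the helper functions (`run`, `independent`, `morph`) are all assumptions made up for illustration. The inserted "dead" instructions are executed like any other instruction, but they write only to fresh registers that the original code never reads, and two adjacent instructions are transposed only when neither depends on the other.

```python
import random

# Toy three-address "IR" (hypothetical, not LLVM IR): (dest, op, lhs, rhs).
# Operands are register names (str) or integer literals.
OPS = {
    "add": lambda x, y: x + y,
    "sub": lambda x, y: x - y,
    "mul": lambda x, y: x * y,
}

def run(program, inputs, result="ret"):
    """Interpret the program and return the value left in the result register."""
    env = dict(inputs)
    for dest, op, lhs, rhs in program:
        a = env[lhs] if isinstance(lhs, str) else lhs
        b = env[rhs] if isinstance(rhs, str) else rhs
        env[dest] = OPS[op](a, b)
    return env[result]

def reads(ins):
    """Registers that an instruction reads."""
    return {v for v in ins[2:] if isinstance(v, str)}

def independent(a, b):
    """True if adjacent instructions a, b can be swapped safely
    (no write-write, read-after-write, or write-after-read conflict)."""
    return a[0] != b[0] and a[0] not in reads(b) and b[0] not in reads(a)

def morph(program, input_names, rng, n_dead=3, n_swaps=10):
    """Return a functionally equivalent variant of the program."""
    out = list(program)
    # Dead code insertion: each new instruction reads a register that is
    # already defined at that point and writes to a fresh "_deadN" register
    # (assumed not to be used by the original program).
    for k in range(n_dead):
        pos = rng.randrange(len(out) + 1)
        defined = sorted(set(input_names) | {ins[0] for ins in out[:pos]})
        dead = (f"_dead{k}", rng.choice(sorted(OPS)),
                rng.choice(defined), rng.randint(1, 9))
        out.insert(pos, dead)
    # Transposition: swap adjacent instructions only when they are independent.
    for _ in range(n_swaps):
        i = rng.randrange(len(out) - 1)
        if independent(out[i], out[i + 1]):
            out[i], out[i + 1] = out[i + 1], out[i]
    return out

# Example program computing (x + y) * (x - y).
ORIGINAL = [
    ("t0", "add", "x", "y"),
    ("t1", "sub", "x", "y"),
    ("ret", "mul", "t0", "t1"),
]
```

Calling `morph(ORIGINAL, ["x", "y"], random.Random(seed))` with different seeds yields variants that differ in instruction sequence while `run` returns the same result for the same inputs. A real generator working on IR bytecode would also have to handle control flow, memory, and side effects, which this sketch ignores.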


Notes

  1. “LLVM” originated as an acronym for Low Level Virtual Machine. However, LLVM is now the official name and is no longer an acronym.

Author information

Correspondence to Mark Stamp.

About this article

Cite this article

Tamboli, T., Austin, T.H. & Stamp, M. Metamorphic code generation from LLVM bytecode. J Comput Virol Hack Tech 10, 177–187 (2014). https://doi.org/10.1007/s11416-013-0194-3

  • DOI: https://doi.org/10.1007/s11416-013-0194-3
