Abstract
Metamorphic software changes its internal structure from one generation to the next while its functionality remains unchanged. Malware writers have used metamorphism to evade signature detection and other, more advanced detection strategies. However, code morphing also has potential security benefits, since it can increase the “genetic diversity” of software. We have created a metamorphic code generator within the LLVM compiler framework. LLVM is a three-phase compiler that supports multiple source languages and target architectures, and its optimizer uses a common intermediate representation (IR) bytecode. Consequently, any supported high-level programming language is transformed to this IR bytecode as part of the LLVM compilation process. Our metamorphic generator operates at the IR bytecode level, which provides many advantages over morphing at the assembly or source code level. The morphing techniques that we employ include dead code insertion and transposition. The dead code is actually executed within the morphed code, which makes its detection and removal more challenging. We have verified the effectiveness of our code morphing using hidden Markov model analysis.
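To make the two morphing techniques named above concrete, the following is a minimal sketch on a toy three-address IR. It is not LLVM IR and not the generator described in the article; the instruction format, the `junk` register names, and the helper functions `insert_dead_code`, `transpose`, and `morph` are all assumptions introduced for illustration. The sketch inserts instructions that do execute but only write to scratch registers, swaps adjacent instructions that do not depend on each other, and leaves the computed result unchanged.

```python
import random

# Toy three-address IR (hypothetical, NOT LLVM IR): (op, dst, src1, src2).
# A source is either a register name (str) or an integer constant.
OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
}


def run(program, inputs, result="r"):
    """Interpret the program and return the value left in the result register."""
    regs = dict(inputs)
    for op, dst, a, b in program:
        va = regs[a] if isinstance(a, str) else a
        vb = regs[b] if isinstance(b, str) else b
        regs[dst] = OPS[op](va, vb)
    return regs[result]


def insert_dead_code(program, rng, input_names, count=3):
    """Insert instructions that execute but only write scratch registers.

    Assumes no register in the original program is named junk0, junk1, ...
    and that the program never overwrites its input registers.
    """
    morphed = list(program)
    for i in range(count):
        pos = rng.randrange(len(morphed) + 1)
        junk = (rng.choice(sorted(OPS)), f"junk{i}",
                rng.choice(input_names), rng.randrange(1, 100))
        morphed.insert(pos, junk)
    return morphed


def independent(x, y):
    """True if two adjacent instructions can be swapped safely."""
    reads_x = {s for s in x[2:] if isinstance(s, str)}
    reads_y = {s for s in y[2:] if isinstance(s, str)}
    return x[1] != y[1] and x[1] not in reads_y and y[1] not in reads_x


def transpose(program, rng, attempts=10):
    """Randomly swap adjacent instructions that have no data dependence."""
    morphed = list(program)
    for _ in range(attempts):
        if len(morphed) < 2:
            break
        i = rng.randrange(len(morphed) - 1)
        if independent(morphed[i], morphed[i + 1]):
            morphed[i], morphed[i + 1] = morphed[i + 1], morphed[i]
    return morphed


def morph(program, seed, input_names):
    """Produce one morphed 'generation' of the program from a seed."""
    rng = random.Random(seed)
    return transpose(insert_dead_code(program, rng, input_names), rng)


# r = (x + y) * (x - y)
base = [("add", "t1", "x", "y"), ("sub", "t2", "x", "y"), ("mul", "r", "t1", "t2")]

if __name__ == "__main__":
    inputs = {"x": 7, "y": 3}
    for seed in range(3):
        variant = morph(base, seed, ["x", "y"])
        # Every generation differs structurally but computes the same value.
        assert run(variant, inputs) == run(base, inputs)
        print(variant)
```

Each seed yields a different instruction sequence with the same result. Because the inserted instructions are executed rather than skipped, a tool cannot discard them simply by finding unreachable code; it would have to show that the scratch registers are never read.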
Notes
“LLVM” was initially derived as an acronym for Low Level Virtual Machine. However, LLVM is now the official name—it is no longer an acronym.
Cite this article
Tamboli, T., Austin, T.H. & Stamp, M. Metamorphic code generation from LLVM bytecode. J Comput Virol Hack Tech 10, 177–187 (2014). https://doi.org/10.1007/s11416-013-0194-3
DOI: https://doi.org/10.1007/s11416-013-0194-3