On callgraphs and generative mechanisms

  • Original Paper
  • Journal in Computer Virology

An Erratum to this article was published on 06 September 2007

Abstract

This paper examines the structural features of callgraphs. The sample consisted of 120 malicious and 280 non-malicious executables. Pareto models were fitted to the indegree, outdegree and basic block count distributions, and a statistically significant difference was shown for the derived power law exponent. A two-step optimization process involving human designers and code compilers is proposed to account for these structural features of executables.
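As a rough illustration of what fitting a Pareto model to a degree distribution involves, the sketch below computes indegrees from a toy callgraph and estimates a Pareto shape parameter with the standard maximum-likelihood (Hill) estimator. This is not the paper's fitting procedure, which the abstract does not specify; the adjacency-list representation, the function names `indegrees` and `pareto_shape_mle`, the cutoff `x_min`, and the toy data are all assumptions made for the example.

```python
import math
from collections import Counter


def indegrees(callgraph):
    """Count incoming call edges per function.

    `callgraph` is a hypothetical adjacency list: caller name -> list of callees.
    """
    counts = Counter({fn: 0 for fn in callgraph})
    for callees in callgraph.values():
        for callee in callees:
            counts[callee] += 1
    return dict(counts)


def pareto_shape_mle(samples, x_min):
    """Maximum-likelihood (Hill) estimate of the Pareto shape parameter.

    Uses only samples >= x_min; the estimate is n / sum(ln(x_i / x_min)).
    """
    tail = [x for x in samples if x >= x_min]
    log_sum = sum(math.log(x / x_min) for x in tail)
    if log_sum == 0:
        raise ValueError("need at least one sample strictly above x_min")
    return len(tail) / log_sum


# Toy callgraph (illustrative only, not data from the paper).
toy_callgraph = {
    "main": ["parse", "log"],
    "parse": ["log", "read"],
    "read": ["log"],
    "log": [],
}

degrees = indegrees(toy_callgraph)
# Assumed cutoff x_min = 1: functions with no callers are left out of the tail.
shape = pareto_shape_mle(list(degrees.values()), x_min=1)
```

With real executables the degree sequence would come from a disassembler's callgraph export, and the choice of `x_min` strongly affects the estimate, so it should be reported alongside the fitted exponent.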

Author information

Correspondence to Daniel Bilar.

Additional information

An erratum to this article can be found at http://dx.doi.org/10.1007/s11416-007-0061-1

About this article

Cite this article

Bilar, D. On callgraphs and generative mechanisms. J Comput Virol 3, 285–297 (2007). https://doi.org/10.1007/s11416-007-0057-x

  • DOI: https://doi.org/10.1007/s11416-007-0057-x
