DOI: 10.1145/3029806.3029824

Scalable Function Call Graph-based Malware Classification

Published: 22 March 2017

Abstract

To preserve the structural information in malware binaries during feature extraction, function call graph-based features have been used in various research works on malware classification. However, classification on these graphs usually relies on computing graph similarity with computationally intensive techniques. As a result, much of the previous work in this area incurs large performance overhead and does not scale well. In this paper, we propose a linear-time function call graph (FCG) vector representation based on function clustering that yields significant performance gains in addition to improved classification accuracy. We also show how this representation enables graph features to be used together with other non-graph features.
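
The idea of a fixed-length FCG vector built from function clusters can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify the clustering method or the vector layout, so the hash-based bucketing in `cluster_id`, the opcode-list input format, and the node-count plus edge-count layout are all assumptions made for illustration. The sketch only shows why such a representation can be built in one pass over the functions and one pass over the call edges.

```python
import hashlib
from collections import Counter


def cluster_id(opcodes, num_clusters):
    """Hypothetical stand-in for function clustering (an assumption, not the
    paper's method): bucket a function by a stable hash of its opcode set."""
    key = " ".join(sorted(set(opcodes))).encode("utf-8")
    digest = hashlib.md5(key).digest()
    return int.from_bytes(digest[:8], "big") % num_clusters


def fcg_vector(functions, calls, num_clusters=4):
    """Map a function call graph to a fixed-length vector.

    functions: dict mapping function name -> list of opcode strings
    calls:     iterable of (caller, callee) name pairs
    Layout (assumed): num_clusters node counts followed by
    num_clusters * num_clusters cluster-to-cluster edge counts.
    """
    # One pass over the functions: assign each one a cluster label.
    label = {name: cluster_id(ops, num_clusters) for name, ops in functions.items()}
    node_counts = Counter(label.values())
    # One pass over the call edges: count edges between cluster pairs.
    edge_counts = Counter((label[a], label[b]) for a, b in calls)

    vec = [node_counts.get(c, 0) for c in range(num_clusters)]
    vec += [
        edge_counts.get((i, j), 0)
        for i in range(num_clusters)
        for j in range(num_clusters)
    ]
    return vec


# Toy call graph with made-up function names and opcodes.
functions = {
    "main": ["push", "call", "ret"],
    "init": ["mov", "ret"],
    "init_copy": ["ret", "mov"],
    "send": ["push", "mov", "call"],
}
calls = [("main", "init"), ("main", "init_copy"), ("main", "send"), ("send", "init")]
vec = fcg_vector(functions, calls)
```

Because every sample maps to a vector of the same length, a vector like `vec` could be concatenated with other per-sample features before being passed to a standard classifier, which is one way to read the abstract's remark about combining graph and non-graph features.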

Published In

CODASPY '17: Proceedings of the Seventh ACM on Conference on Data and Application Security and Privacy
March 2017
382 pages
ISBN: 9781450345231
DOI: 10.1145/3029806
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. graph classification
  2. malware classification

Qualifiers

  • Research-article

Conference

CODASPY '17
Acceptance Rates

CODASPY '17 Paper Acceptance Rate 21 of 134 submissions, 16%;
Overall Acceptance Rate 149 of 789 submissions, 19%

Cited By

  • (2025) Multimodal Deep Learning for Android Malware Classification. Machine Learning and Knowledge Extraction 7:1 (23). DOI: 10.3390/make7010023. Online publication date: 28-Feb-2025
  • (2025) Semantics-Preserving Node Injection Attacks Against GNN-Based ACFG Malware Classifiers. IEEE Transactions on Dependable and Secure Computing 22:1 (549-560). DOI: 10.1109/TDSC.2024.3409410. Online publication date: Jan-2025
  • (2024) A Malicious Program Behavior Detection Model Based on API Call Sequences. Electronics 13:6 (1092). DOI: 10.3390/electronics13061092. Online publication date: 15-Mar-2024
  • (2024) GAGE: Genetic Algorithm-Based Graph Explainer for Malware Analysis. 2024 IEEE 40th International Conference on Data Engineering (ICDE) (2258-2270). DOI: 10.1109/ICDE60146.2024.00179. Online publication date: 13-May-2024
  • (2024) Graph Based Analysis Technique for Identification of Key Functionalities in Malicious IoT Binaries. 2024 International Conference on Emerging Smart Computing and Informatics (ESCI) (1-8). DOI: 10.1109/ESCI59607.2024.10497453. Online publication date: 5-Mar-2024
  • (2024) Cimalir: Cross-Platform IoT Malware Clustering using Intermediate Representation. 2024 IEEE 14th Annual Computing and Communication Workshop and Conference (CCWC) (0460-0466). DOI: 10.1109/CCWC60891.2024.10427663. Online publication date: 8-Jan-2024
  • (2024) Android Malware Detection Method Based on Graph Convolutional Networks. 2024 International Conference on Artificial Intelligence and Power Systems (AIPS) (95-98). DOI: 10.1109/AIPS64124.2024.00027. Online publication date: 19-Apr-2024
  • (2024) Multi-class Malware Detection via Deep Graph Convolutional Networks Using TF-IDF-Based Attributed Call Graphs. Information Security Applications (188-200). DOI: 10.1007/978-981-99-8024-6_15. Online publication date: 11-Jan-2024
  • (2023) Humans vs. machines in malware classification. Proceedings of the 32nd USENIX Conference on Security Symposium (1145-1162). DOI: 10.5555/3620237.3620302. Online publication date: 9-Aug-2023
  • (2023) Proposing A New Approach for Detecting Malware Based on the Event Analysis Technique. International Journal of Innovative Technology and Exploring Engineering 12:8 (21-27). DOI: 10.35940/ijitee.H9651.0712823. Online publication date: 30-Jul-2023
