DOI: 10.1145/2402599.2402604

Malware classification based on extracted API sequences using static analysis

Published: 14 November 2012

Abstract

In this paper, we propose a highly accurate, automatic malware-classification method, which extracts features by conducting static analysis of malware samples and the structure of malware source code. In the proposed extraction method, the presence and absence of particular pairs of consecutive Application Program Interface function calls (APIs) in the API-sequence graph are compared with those in the executable code for a sample within which malware features have been identified. To determine the degree of similarity between samples, Dice's coefficient is applied. To visualize the grouping of samples with similar features, we use hierarchical cluster analysis based on the extracted features. The results of the analysis are presented as a dendrogram with colored nodes for each family name. To evaluate the proposed method, we set up a malware-analysis system comprising a combination of disassembler, control-flow analyzer, API-sequence extractor, similarity calculator and hierarchical cluster analyzer. We acquired 4,684 malware samples, from 1,821 of which we successfully extracted API sequences to which we applied our proposed classification method. We found that the automatic hierarchical cluster analysis was processed rapidly, with significant clusters of variant groups obtained.
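The similarity step described in the abstract can be illustrated with a minimal sketch. It assumes each sample has already been reduced to a linear list of API names, whereas the paper extracts the pairs from an API-sequence graph built by control-flow analysis. The sample names, the call sequences, and the helper names `api_pairs` and `dice` are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def api_pairs(sequence):
    """Return the set of consecutive API-call pairs in a linear call sequence."""
    return set(zip(sequence, sequence[1:]))


def dice(a, b):
    """Dice's coefficient 2|A ∩ B| / (|A| + |B|); defined here as 1.0 if both sets are empty."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


# Hypothetical samples: names and call sequences are invented for illustration.
samples = {
    "sample_a": ["CreateFileA", "WriteFile", "CloseHandle", "RegOpenKeyExA"],
    "sample_b": ["CreateFileA", "WriteFile", "CloseHandle", "WinExec"],
    "sample_c": ["socket", "connect", "send", "closesocket"],
}

# Feature per sample: presence or absence of each consecutive API pair.
features = {name: api_pairs(seq) for name, seq in samples.items()}

# Pairwise similarity between every two samples, keyed by sorted name pair.
similarity = {
    (x, y): dice(features[x], features[y])
    for x, y in combinations(sorted(features), 2)
}

for pair, score in similarity.items():
    print(pair, round(score, 3))
```

Converting each score to a distance (for example `1 - score`) would give a matrix that a hierarchical clustering routine such as `scipy.cluster.hierarchy.linkage` could take as input to produce a dendrogram. Which linkage criterion the authors used is not stated in the abstract, so that choice is left open here.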

Published In

AINTEC '12: Proceedings of the 8th Asian Internet Engineering Conference
November 2012
93 pages
ISBN: 9781450318143
DOI: 10.1145/2402599
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. API sequence
  2. classification
  3. control-flow analysis
  4. malware
  5. static analysis

Qualifiers

  • Research-article

Conference

AINTEC '12: Asian Internet Engineering Conference
November 14 - 16, 2012
Bangkok, Thailand

Acceptance Rates

Overall Acceptance Rate 15 of 38 submissions, 39%

Cited By

  • (2025) Convolutional Neural Network for Classification of Image-Based Malware: A Deep Learning Approach. Proceedings of the 1st International Conference on Intelligent Healthcare and Computational Neural Modelling, pp. 469-480. DOI: 10.1007/978-981-99-2832-3_56. Online publication date: 5-Jan-2025
  • (2024) AMN: Attention-based Multimodal Network for Android Malware Classification. 2024 IEEE International Conference on Cybernetics and Intelligent Systems (CIS) and IEEE International Conference on Robotics, Automation and Mechatronics (RAM), pp. 7-13. DOI: 10.1109/CIS-RAM61939.2024.10672730. Online publication date: 8-Aug-2024
  • (2024) Evading Userland API Hooking, Again: Novel Attacks and a Principled Defense Method. Detection of Intrusions and Malware, and Vulnerability Assessment, pp. 150-173. DOI: 10.1007/978-3-031-64171-8_8. Online publication date: 9-Jul-2024
  • (2023) Analysis of Program Representations Based on Abstract Syntax Trees and Higher-Order Markov Chains for Source Code Classification Task. Future Internet 15:9 (314). DOI: 10.3390/fi15090314. Online publication date: 18-Sep-2023
  • (2023) Malware detection using Explainable ML models based on Feature Extraction using API calls. 2023 International Conference on Artificial Intelligence, Big Data, Computing and Data Communication Systems (icABCD), pp. 1-7. DOI: 10.1109/icABCD59051.2023.10220515. Online publication date: 3-Aug-2023
  • (2023) Ransomware Extraction Using Static Portable Executable (PE) Feature-Based Approach. 2023 6th International Conference of Computer and Informatics Engineering (IC2IE), pp. 70-74. DOI: 10.1109/IC2IE60547.2023.10331246. Online publication date: 14-Sep-2023
  • (2022) Malware Classification Using Convolutional Fuzzy Neural Networks Based on Feature Fusion and the Taguchi Method. Applied Sciences 12:24 (12937). DOI: 10.3390/app122412937. Online publication date: 16-Dec-2022
  • (2022) Sequential Embedding-based Attentive (SEA) classifier for malware classification. 2022 International Conference on Cyber Warfare and Security (ICCWS), pp. 28-35. DOI: 10.1109/ICCWS56285.2022.9998431. Online publication date: 7-Dec-2022
  • (2022) Big Data Privacy and Security Using Abundant Data Recovery Techniques and Data Obliviousness Methodologies. IEEE Access 10, pp. 105458-105484. DOI: 10.1109/ACCESS.2022.3211304. Online publication date: 2022
  • (2022) DACN: Malware Classification Based on Dynamic Analysis and Capsule Networks. Frontiers in Cyber Security, pp. 3-13. DOI: 10.1007/978-981-19-0523-0_1. Online publication date: 1-Mar-2022
