research-article

Malware classification based on extracted API sequences using static analysis

Authors:

Kazuki Iwamoto,

Katsumi Wasaki

AINTEC '12: Proceedings of the 8th Asian Internet Engineering Conference

Pages 31 - 38

https://doi.org/10.1145/2402599.2402604

Published: 14 November 2012

Abstract

In this paper, we propose a highly accurate, automatic malware-classification method that extracts features through static analysis of malware samples and of the structure of malware source code. In the proposed extraction method, the presence or absence of particular pairs of consecutive Application Program Interface (API) function calls in the API-sequence graph is compared with that in the executable code of a sample in which malware features have already been identified. Dice's coefficient is applied to determine the degree of similarity between samples. To visualize the grouping of samples with similar features, we use hierarchical cluster analysis based on the extracted features, and we present the results as a dendrogram whose nodes are colored by family name. To evaluate the proposed method, we set up a malware-analysis system comprising a disassembler, a control-flow analyzer, an API-sequence extractor, a similarity calculator, and a hierarchical cluster analyzer. We acquired 4,684 malware samples and successfully extracted API sequences from 1,821 of them, to which we applied the proposed classification method. The automatic hierarchical cluster analysis ran rapidly and yielded significant clusters of variant groups.
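The similarity step described in the abstract can be illustrated with a minimal sketch. It assumes that each sample is reduced to a flat list of API names and that features are the set of consecutive pairs in that list; the paper itself works on an API-sequence graph, so this is a simplification. The sample names, API sequences, and helper names `api_pairs` and `dice` are hypothetical and not taken from the paper.

```python
from itertools import combinations


def api_pairs(sequence):
    """Return the set of consecutive API-call pairs in a call sequence."""
    return set(zip(sequence, sequence[1:]))


def dice(a, b):
    """Dice's coefficient between two sets: 2|A & B| / (|A| + |B|)."""
    if not a and not b:
        return 1.0  # assumption: two empty feature sets count as identical
    return 2 * len(a & b) / (len(a) + len(b))


# Hypothetical API sequences; illustrative only, not data from the paper.
samples = {
    "sample_a": ["CreateFileA", "WriteFile", "CloseHandle", "WinExec"],
    "sample_b": ["CreateFileA", "WriteFile", "CloseHandle", "ExitProcess"],
    "sample_c": ["RegOpenKeyExA", "RegSetValueExA", "RegCloseKey"],
}

features = {name: api_pairs(seq) for name, seq in samples.items()}

# Pairwise distances (1 - similarity), the usual input to hierarchical clustering.
distances = {
    (x, y): 1.0 - dice(features[x], features[y])
    for x, y in combinations(sorted(features), 2)
}

for pair, d in distances.items():
    print(pair, round(d, 3))
```

A distance table of this kind could then be passed to any agglomerative clustering routine to produce a dendrogram. The linkage criterion is left open here because the abstract does not state which one the authors used.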

Index Terms

Malware classification based on extracted API sequences using static analysis
1. Security and privacy
  1. Intrusion/anomaly detection and malware mitigation
  2. Systems security
    1. Operating systems security
2. Social and professional topics
  1. Computing / technology policy
    1. Computer crime

Published In

AINTEC '12: Proceedings of the 8th Asian Internet Engineering Conference

November 2012

93 pages

ISBN:9781450318143

DOI:10.1145/2402599

General Chairs: Kanchana Kanchanasut (Asian Institute of Technology, Thailand), Sukumal Kitisin (Kasetsart University, Thailand)

Program Chairs: Keith W. Ross (Polytechnic Institute of NYU), Marcelo Dias Amorim (UPMC, France), Rodney Van Meter (Keio University, Japan)

Copyright © 2012 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGCOMM: ACM Special Interest Group on Data Communication

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

AINTEC '12

Sponsor:

SIGCOMM

AINTEC '12: Asian Internet Engineering Conference

November 14 - 16, 2012

Bangkok, Thailand

Acceptance Rates

Overall Acceptance Rate 15 of 38 submissions, 39%
