DOI: 10.1145/3340531.3417467
research-article

A Human-in-the-Loop Approach to Malware Author Classification

Published: 19 October 2020

Abstract

For the past few decades, malware has been a major concern in cyber security. Recently, a number of "author groups" have been generating large amounts of new malware by sharing source code within a group and exploiting evasive schemes such as polymorphism and metamorphism. This motivates us to study the problem of identifying the author group of a given malware sample, which could help not only in blocking malware but also in legally punishing suspected malware authors. In this paper, we propose a human-machine collaborative approach for accurately classifying the author groups of malware. We also propose a visualization method that helps human experts make decisions easily. We verify the superiority of our framework through extensive experiments using real-world malware data.

Cited By

  • (2024) What do malware analysts want from academia? A survey on the state-of-the-practice to guide research developments. Proceedings of the 27th International Symposium on Research in Attacks, Intrusions and Defenses, 77-96. DOI: 10.1145/3678890.3678892. Online publication date: 30-Sep-2024
  • (2022) Binary code traceability of multigranularity information fusion from the perspective of software genes. Computers and Security 114:C. DOI: 10.1016/j.cose.2022.102607. Online publication date: 6-May-2022

      Published In

      CIKM '20: Proceedings of the 29th ACM International Conference on Information & Knowledge Management
      October 2020
      3619 pages
      ISBN:9781450368599
      DOI:10.1145/3340531

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. human-in-the-loop approach
      2. malware author groups
      3. malware classification

      Qualifiers

      • Research-article

      Conference

      CIKM '20

      Acceptance Rates

      Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

      Article Metrics

      • Downloads (Last 12 months)17
      • Downloads (Last 6 weeks)0
      Reflects downloads up to 17 Jan 2025
