Abstract:
Constructing malware families from the behavioral properties of malware in the wild is challenging. Understanding the behavioral properties of malware families is key to detecting malware attacks. In this work, a methodology is described for the unsupervised construction of malware families using a two-phase approach. In the first phase, natural language processing techniques such as term frequency and inverse document frequency are applied to trace data to compute similarities, and a graph of textual similarities between trace sequences is constructed. The second phase consists of applying minimum spanning tree and community detection algorithms to construct malware families. Experiments employing the proposed methodology are conducted on a published dataset and the results are reported. Machine learning algorithms are evaluated on the constructed malware families. The results are promising for the automated detection of malware variants from malware families.
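The two phases described in the abstract can be illustrated with a minimal, standard-library-only Python sketch. It is not the authors' implementation. The toy traces, the smoothed IDF formula, the function names (`tfidf_vectors`, `cosine`, `build_families`) and the `min_sim` threshold are all assumptions made for illustration. In particular, the abstract does not name a community detection algorithm, so the sketch substitutes a much simpler step: it drops weak spanning-tree edges and treats the remaining connected components as families.

```python
import math
from collections import Counter


def tfidf_vectors(traces):
    """Map each trace (a list of tokens such as API call names) to a TF-IDF dict."""
    n = len(traces)
    df = Counter(tok for trace in traces for tok in set(trace))
    vectors = []
    for trace in traces:
        tf = Counter(trace)
        # Smoothed IDF (an assumption): tokens seen in every trace keep a positive weight.
        vectors.append({t: (c / len(trace)) * (math.log((1 + n) / (1 + df[t])) + 1)
                        for t, c in tf.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


class DisjointSet:
    """Union-find structure used for Kruskal's algorithm and for grouping."""

    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        self.parent[ra] = rb
        return True


def build_families(traces, min_sim):
    # Phase 1: TF-IDF vectors and a complete graph of pairwise similarities.
    vecs = tfidf_vectors(traces)
    n = len(vecs)
    edges = [(cosine(vecs[i], vecs[j]), i, j)
             for i in range(n) for j in range(i + 1, n)]

    # Phase 2a: Kruskal's algorithm, taking the most similar edges first.
    # This equals a minimum spanning tree on the distance 1 - similarity.
    mst_sets = DisjointSet(n)
    tree = [e for e in sorted(edges, reverse=True) if mst_sets.union(e[1], e[2])]

    # Phase 2b: stand-in for community detection (an assumption, see lead-in):
    # keep only tree edges with similarity >= min_sim and take the components.
    family_sets = DisjointSet(n)
    for sim, i, j in tree:
        if sim >= min_sim:
            family_sets.union(i, j)
    groups = {}
    for i in range(n):
        groups.setdefault(family_sets.find(i), []).append(i)
    return sorted(groups.values())


if __name__ == "__main__":
    # Hypothetical traces; real trace data would come from sandbox logs.
    traces = [
        ["CreateFile", "WriteFile", "RegSetValue", "CloseHandle"],
        ["CreateFile", "WriteFile", "RegSetValue", "RegSetValue"],
        ["connect", "send", "recv", "closesocket"],
        ["connect", "send", "send", "recv"],
    ]
    print(build_families(traces, min_sim=0.3))
```

On these four toy traces the file-oriented and network-oriented samples share no tokens, so the single spanning-tree edge joining the two groups has similarity zero and is cut at any positive threshold. A real pipeline would likely replace the thresholding step with an actual community detection algorithm and tune it on labeled data.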
Date of Conference: 17-20 December 2022
Date Added to IEEE Xplore: 26 January 2023