Abstract:
Network encryption is becoming the standard in communication technologies because of its privacy-preserving capability. However, hiding data from unintended parties introduces another problem: malware authors use the same technique to deploy malicious code and infiltrate systems. In this study, we used machine learning algorithms to detect and classify encrypted malware. We analyzed raw encrypted traffic from different sources to study and extract feature sets that can discriminate malicious flows from benign flows. These include handshake features, certificate features, packet inter-arrival times and lengths, statistical features, meta connection features, and the cipher suite used. We preprocessed 14152 unique flows and extracted 304 features. Two ensemble learning models, namely extreme gradient boosting (XGBoost) and random forest (RF), were used for detection and classification. The detection model achieved 99.78% precision, 0.467% false detection rate, 99.39% precision, 99.33% recall, and 99.63% F1 score. The family classification model achieved 98.65% accuracy, 98.28% F1 score, 98.39% precision, and 98.16% recall.
Published in: 2023 International Conference on Information and Communication Technology for Development for Africa (ICT4DA)
Date of Conference: 26-28 October 2023
Date Added to IEEE Xplore: 06 November 2023