Exploring Dataset Manipulation via Machine Learning for Botnet Traffic

https://doi.org/10.1016/j.procs.2021.11.082
Under a Creative Commons license
open access

Abstract

Botnets are responsible for much of the major malicious traffic on the Internet: DDoS attacks, mail spam, brute-force attacks, port scans, and others. Their danger stems from the large number of infected hosts coordinated against a single target. More contributions are needed, considering that (A) ML has been used to identify cyberattacks with better accuracy than standard NIDS equipment, (B) Botnet attacks are among the most dangerous threats on the Internet, (C) representative datasets for some Botnets are difficult to obtain, and (D) Botnet traffic can be confused with that of its underlying infrastructure protocol.

In this paper, we focus on identifying Botnet traffic, with the aim of preventing communication from the Botmaster to the infected hosts and, consequently, the Botnet cyberattacks. CICFlowMeter and Machine Learning algorithms were used to analyse the Botnet2014 public dataset in four different scenarios: all Botnet traffic in a single class, one class per Botnet, and the influence of the IP address fields on Botnet traffic detection.
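The scenarios described above can be sketched as a small labelling and column-selection step applied to flow records before training. This is a minimal illustration, not the authors' pipeline: it assumes the four scenarios form a 2x2 grid (single class vs. one class per Botnet, with vs. without address fields), and the column names (`Src IP`, `Dst IP`, `Src Port`, `Dst Port`, `Label`), the family labels (`BotnetA`, `BotnetB`), and the helper `build_scenario` are all hypothetical.

```python
from itertools import product

# Assumed column names; a real CICFlowMeter export may name them differently.
ADDRESS_FIELDS = ("Src IP", "Dst IP", "Src Port", "Dst Port")
BENIGN = "Benign"


def to_binary_label(label):
    """Collapse every botnet family into a single 'Botnet' class."""
    return BENIGN if label == BENIGN else "Botnet"


def build_scenario(flows, single_class, keep_addresses):
    """Return a new list of flow dicts prepared for one scenario."""
    prepared = []
    for flow in flows:
        if keep_addresses:
            row = dict(flow)  # copy so the input is not mutated
        else:
            row = {k: v for k, v in flow.items() if k not in ADDRESS_FIELDS}
        if single_class:
            row["Label"] = to_binary_label(row["Label"])
        prepared.append(row)
    return prepared


# Toy flow records with made-up values, for illustration only.
flows = [
    {"Src IP": "10.0.0.5", "Dst IP": "192.0.2.10", "Src Port": 49152,
     "Dst Port": 6667, "Flow Duration": 1200, "Label": "BotnetA"},
    {"Src IP": "10.0.0.6", "Dst IP": "192.0.2.20", "Src Port": 49153,
     "Dst Port": 80, "Flow Duration": 300, "Label": "BotnetB"},
    {"Src IP": "10.0.0.7", "Dst IP": "198.51.100.1", "Src Port": 49154,
     "Dst Port": 443, "Flow Duration": 950, "Label": "Benign"},
]

# Assumed 2x2 grid: (single class or per Botnet) x (with or without addresses).
scenarios = {
    (single_class, keep_addresses): build_scenario(flows, single_class, keep_addresses)
    for single_class, keep_addresses in product((True, False), repeat=2)
}
```

Each entry in `scenarios` could then be passed to the same classifier, so that any difference in the metrics is attributable to the labelling or to the presence of the address fields rather than to the model.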

The results show that Random Forest (RF) and Decision Tree (CART) achieved similar accuracies on Botnet traffic classification. Notably, CART obtained similar results with 10-20% of the machine time. The metrics show that the per-Botnet analysis has higher accuracy than the Any Botnet Traffic analysis. Also, the scenario with IP addresses and L4 ports has higher accuracy but a lower F1-Score than the equivalent scenario without IP addresses or L4 ports. Finally, the Feature Importance results confirm the literature: Botnet traffic is not a single uniform protocol, but a collection of very different forms of communication between the botmaster and the infected hosts.
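A higher accuracy paired with a lower F1-Score may look contradictory, but it can arise when the classes are imbalanced, because accuracy rewards correct predictions on the majority class while F1 ignores true negatives. The sketch below shows this with two confusion matrices; the counts are invented for illustration and are not results from the paper.

```python
def accuracy(tp, fp, fn, tn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + fp + fn + tn)


def f1_score(tp, fp, fn, tn=0):
    """Harmonic mean of precision and recall; true negatives play no role."""
    return 2 * tp / (2 * tp + fp + fn)


# Hypothetical counts for 1000 flows, of which 100 are Botnet (positive class).
with_addresses = dict(tp=50, fp=10, fn=50, tn=890)     # cautious: misses half the Botnet flows
without_addresses = dict(tp=80, fp=60, fn=20, tn=840)  # finds more Botnet flows, more false alarms

for name, counts in (("with addresses", with_addresses),
                     ("without addresses", without_addresses)):
    print(f"{name}: accuracy={accuracy(**counts):.3f}, F1={f1_score(**counts):.3f}")
```

In this toy case the first model is more accurate overall (0.940 vs. 0.920) yet has the lower F1-Score (0.625 vs. 0.667), since it misses half of the positive class.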

Keywords

CICFlowMeter
Botnet2014
Botnet Traffic
Machine Learning
Random Forest Classifier
Decision Tree Classifier
