Authors:
Laurens D’hooge; Tim Wauters; Bruno Volckaert and Filip De Turck
Affiliation:
Ghent University - imec, IDLab, Department of Information Technology, Technologiepark-Zwijnaarde 126, Gent, Belgium
Keyword(s):
Intrusion Detection, CICIDS2017, Supervised Machine Learning, Binary Classification.
Abstract:
This paper describes the process and results of analyzing CICIDS2017, a modern, labeled data set for testing intrusion detection systems. The data set is divided into several days, each pertaining to different attack classes (DoS, DDoS, infiltration, botnet, etc.). A pipeline has been created that includes nine supervised learning algorithms. The goal was binary classification of benign versus attack traffic. Cross-validated parameter optimization, using a voting mechanism that includes five classification metrics, was employed to select optimal parameters. These results were interpreted to discover whether certain parameter choices were dominant for most (or all) of the attack classes. Ultimately, every algorithm was retested with optimal parameters to obtain the final classification scores. During the review of these results, execution time, both on consumer- and corporate-grade equipment, was taken into account as an additional requirement. The work detailed in this paper establishes a novel supervised machine learning performance baseline for CICIDS2017. Graphics of the results as well as the raw tables are publicly available at https://gitlab.ilabt.imec.be/lpdhooge/cicids2017-ml-graphics.
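The abstract does not say which five metrics vote or how ties are resolved, so the following is only a minimal sketch of one plausible reading of metric-based voting: each metric casts one vote for the candidate parameter set with the best mean cross-validated score, and the candidate with the most votes wins. The metric names, the tie-break rule, the function name `vote_for_best_params`, and all scores are illustrative assumptions, not values or code from the paper.

```python
from collections import Counter


def vote_for_best_params(cv_scores):
    """Pick a parameter set by plurality vote across metrics.

    cv_scores maps a candidate label to {metric name: mean CV score}.
    Assumption: higher is better for every metric.
    """
    candidates = list(cv_scores)
    metrics = list(next(iter(cv_scores.values())))

    # Each metric votes for the candidate on which it scores highest.
    votes = Counter()
    for metric in metrics:
        best = max(candidates, key=lambda c: cv_scores[c][metric])
        votes[best] += 1

    # Assumed tie-break: most votes first, then highest mean score.
    def rank(candidate):
        mean_score = sum(cv_scores[candidate].values()) / len(metrics)
        return (votes[candidate], mean_score)

    winner = max(candidates, key=rank)
    return winner, dict(votes)


# Made-up placeholder scores for three hypothetical tree depths.
example_scores = {
    "max_depth=3": {"accuracy": 0.95, "balanced_accuracy": 0.90,
                    "precision": 0.97, "recall": 0.85, "f1": 0.91},
    "max_depth=5": {"accuracy": 0.97, "balanced_accuracy": 0.94,
                    "precision": 0.96, "recall": 0.92, "f1": 0.94},
    "max_depth=7": {"accuracy": 0.96, "balanced_accuracy": 0.93,
                    "precision": 0.95, "recall": 0.93, "f1": 0.93},
}

winner, votes = vote_for_best_params(example_scores)
print(winner, votes)
```

In a full pipeline the score table would come from cross-validation over a parameter grid for each of the nine algorithms and each attack class; the voting step itself stays independent of how those scores are produced.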