Abstract
Cybersecurity is a critical area in computer systems, especially when dealing with sensitive data. Because modern society depends on these systems, it is increasingly important to ensure that they are secured against attacks. To prevent such attacks, most organizations now use anomaly-based intrusion detection systems (IDS). An IDS usually contains machine learning algorithms that help predict or detect anomalous patterns in computer systems. Most of these algorithms are supervised techniques, which have gaps in detecting unknown patterns or zero-day exploits, since these are not present in the algorithm's learning phase. To address this problem, we present an empirical study of several unsupervised learning algorithms used to detect unknown attacks. We evaluated and compared the performance of different types of anomaly detection techniques on two publicly available datasets: NSL-KDD and ISCX. This evaluation allows us to understand the behavior of these techniques and how they could be fitted into an IDS to fill the gap described above. It could also serve in the future as a point of comparison for other unsupervised algorithms applied in the cybersecurity field. The results show that the techniques used are capable of anomaly detection with acceptable performance, making them suitable candidates for future integration into intrusion detection tools.
Notes
Monitoring system that collects data from network communications in real time through network sensors.
Acknowledgements
This work was supported by SASSI Project (ANI|P2020 17775) and has received funding from FEDER Funds through P2020 program and from National Funds through FCT-Fundação para a Ciência e a Tecnologia (Portuguese Foundation for Science and Technology) under the project UID/EEA/00760/2019. This work has also received financial support from MINECO (Grant TIN2015-65069), the Xunta de Galicia (Grants ED431C 2018/34, and Centro Singular de Investigación de Galicia, accreditation 2016–2019, Ref. ED431G/01) and the European Union (European Regional Development Fund—ERDF).
Cite this article
Meira, J., Andrade, R., Praça, I. et al. Performance evaluation of unsupervised techniques in cyber-attack anomaly detection. J Ambient Intell Human Comput 11, 4477–4489 (2020). https://doi.org/10.1007/s12652-019-01417-9
DOI: https://doi.org/10.1007/s12652-019-01417-9