Performance evaluation of unsupervised techniques in cyber-attack anomaly detection

Meira, Jorge; Andrade, Rui; Praça, Isabel; Carneiro, João; Bolón-Canedo, Verónica; Alonso-Betanzos, Amparo; Marreiros, Goreti

doi:10.1007/s12652-019-01417-9

Original Research
Published: 07 August 2019

Volume 11, pages 4477–4489, (2020)

Journal of Ambient Intelligence and Humanized Computing

Jorge Meira ORCID: orcid.org/0000-0002-1502-780X²,
Rui Andrade¹,
Isabel Praça¹,
João Carneiro¹,
Verónica Bolón-Canedo²,
Amparo Alonso-Betanzos² &
Goreti Marreiros¹

Abstract

Cyber security is a critical area in computer systems, especially when dealing with sensitive data. It is becoming increasingly important to ensure that computer systems are secured against attacks, given modern society's dependence on those systems. To prevent these attacks, most organizations now make use of anomaly-based intrusion detection systems (IDS). IDS usually contain machine learning algorithms that aid in predicting or detecting anomalous patterns in computer systems. Most of these algorithms are supervised techniques, which have gaps in the detection of unknown patterns or zero-day exploits, since these are not present in the algorithm's learning phase. To address this problem, we present in this paper an empirical study of several unsupervised learning algorithms used in the detection of unknown attacks. In this study we evaluated and compared the performance of different types of anomaly detection techniques on two publicly available datasets: NSL-KDD and ISCX. The aim of this evaluation is to understand the behavior of these techniques and how they could be fitted into an IDS to fill the mentioned gap. The present evaluation could also be used in the future as a basis for comparing results with other unsupervised algorithms applied in the cybersecurity field. The results obtained show that the techniques used are capable of carrying out anomaly detection with acceptable performance, making them suitable candidates for future integration in intrusion detection tools.

Notes

https://archive.ics.uci.edu/ml/datasets/KDD+Cup+1999+Data.
https://cran.r-project.org/web/packages/h2o/index.html.
https://shiring.github.io/machine_learning/2017/05/02/fraud_2.
Monitoring system that collects data from network communications in real time through network sensors.

Acknowledgements

This work was supported by the SASSI Project (ANI|P2020 17775) and has received funding from FEDER Funds through the P2020 program and from National Funds through FCT-Fundação para a Ciência e a Tecnologia (Portuguese Foundation for Science and Technology) under the project UID/EEA/00760/2019. This work has also received financial support from MINECO (Grant TIN2015-65069), the Xunta de Galicia (Grants ED431C 2018/34, and Centro Singular de Investigación de Galicia, accreditation 2016–2019, Ref. ED431G/01) and the European Union (European Regional Development Fund, ERDF).

Author information

Authors and Affiliations

GECAD-Research Group on Intelligent Engineering and Computing for Advanced Innovation and Development, Institute of Engineering, Polytechnic of Porto (ISEP/IPP), Porto, Portugal
Rui Andrade, Isabel Praça, João Carneiro & Goreti Marreiros
LIDIA-Laboratory for Research and Development in Artificial Intelligence, Department of Computer Science, University of A Coruña, A Coruña, Spain
Jorge Meira, Verónica Bolón-Canedo & Amparo Alonso-Betanzos

Corresponding author

Correspondence to Jorge Meira.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Meira, J., Andrade, R., Praça, I. et al. Performance evaluation of unsupervised techniques in cyber-attack anomaly detection. J Ambient Intell Human Comput 11, 4477–4489 (2020). https://doi.org/10.1007/s12652-019-01417-9

Received: 28 November 2018
Accepted: 01 August 2019
Published: 07 August 2019
Issue Date: November 2020
DOI: https://doi.org/10.1007/s12652-019-01417-9
