Abstract
Intrusion detection systems (IDSs) must be capable of detecting new and unknown attacks, or anomalies. We study the problem of building detection models for both pure anomaly detection and combined misuse and anomaly detection (i.e., detection of both known and unknown intrusions). We show why artificial anomalies are necessary by discussing how conventional inductive learning methods fail to detect anomalies. We propose an algorithm that generates artificial anomalies to coerce the inductive learner into discovering an accurate boundary between known classes (normal connections and known intrusions) and anomalies. Empirical studies show that our pure anomaly-detection model, trained using normal data and artificial anomalies, is capable of detecting more than 77% of all unknown intrusion classes with more than 50% accuracy per intrusion class. The combined misuse and anomaly-detection models are as accurate as a pure misuse-detection model in detecting known intrusions and are capable of detecting at least 50% of unknown intrusion classes with accuracy between 75% and 100% per class.
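The abstract does not spell out how the artificial anomalies are generated, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes records are tuples of categorical features, and it creates "near miss" points by changing a single feature of a sampled normal record to another value seen for that feature, discarding any result that coincides with a known record. The function name, record format, and sample connection data are all hypothetical.

```python
import random


def generate_artificial_anomalies(records, n_anomalies, seed=0):
    """Perturb one feature of sampled records to create artificial anomalies.

    Assumption: `records` is a list of equal-length tuples of categorical
    values, all belonging to known classes (here, normal connections).
    """
    rng = random.Random(seed)
    known = set(records)
    n_features = len(records[0])
    # Values observed for each feature in the training data.
    domains = [sorted({r[i] for r in records}) for i in range(n_features)]

    anomalies = []
    attempts = 0
    max_attempts = 100 * n_anomalies  # guard against an exhausted search space
    while len(anomalies) < n_anomalies and attempts < max_attempts:
        attempts += 1
        base = rng.choice(records)
        i = rng.randrange(n_features)
        alternatives = [v for v in domains[i] if v != base[i]]
        if not alternatives:
            continue  # this feature has only one observed value
        candidate = base[:i] + (rng.choice(alternatives),) + base[i + 1:]
        if candidate in known:
            continue  # collides with a known record, so it is not an anomaly
        anomalies.append(candidate)
    return anomalies


# Hypothetical normal connections: (protocol, service, flag).
normal = [
    ("tcp", "http", "SF"),
    ("tcp", "smtp", "SF"),
    ("udp", "dns", "SF"),
    ("tcp", "http", "REJ"),
]
anomalies = generate_artificial_anomalies(normal, n_anomalies=5)

# Labeled training set that an inductive learner could consume.
training_set = [(r, "normal") for r in normal] + [(a, "anomaly") for a in anomalies]
```

Because each artificial point differs from a known record in only one feature, a learner trained on `training_set` is pushed to draw its boundary close to the known data rather than treating everything unseen as normal.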
Cite this article
Fan, W., Miller, M., Stolfo, S. et al. Using artificial anomalies to detect unknown and known network intrusions. Know. Inf. Sys. 6, 507–527 (2004). https://doi.org/10.1007/s10115-003-0132-7