Abstract
Anomaly-based intrusion detection has been pursued as an alternative to standard signature-based methods since the seminal work of Denning in 1987. Despite the length of time for which it has been studied, the high level of activity in this area, and the remarkable success of machine learning techniques in other areas, anomaly-based IDSs remain rarely used in practice, and none appear to have the same widespread popularity as more common misuse detectors such as Bro and Snort. We examine a potential cause of this observation, the “semantic gap” identified by Sommer and Paxson in 2010, in some detail, with reference to several common building blocks for anomaly-based intrusion detection systems. Finally, we revisit tree-based structures for rule construction similar to those first discussed by Vaccaro and Liepins in 1989 in light of modern results in ensemble learning, and suggest how such constructions could be used to generate anomaly-based intrusion detection systems that retain acceptable performance while producing output that is more actionable for human analysts.
Notes
- 1.
It is also worth noting that [6] use and provide access to a “KDD-like” set of data—that is, data aggregated on flow-like structures containing labels—that contains real data gathered from honeypots between 2006 and 2009 rather than synthetic attacks that predate 1999; this may provide a more useful and realistic alternative to the KDD’99 set.
- 2.
A possibly instructive exercise for the reader: obscure the labels and ask a knowledgeable colleague to attempt to divine which two packets are ‘normal’ and which is ‘anomalous’, and why.
- 3.
Note another critical feature: if adversarial actors are capable of crafting their traffic to approximate \( f_{0} \), such that \( \left| 1 - \frac{f_{1}(x)}{f_{0}(x)} \right| \le \epsilon \) for some small \( \epsilon > 0 \), and can control the rate of malicious traffic they send and hence \( P(I) \), then they may craft their traffic such that the defenders have no \( x^{\star} \) that satisfies the above relationship and so cannot perform cost-effective anomaly detection. We do not discuss this problem in detail, but reserve it for future work.
- 4.
In this case, any outgoing traffic to a relatively high destination port was deemed by an analyst to be unusual, but “certainly not a red flag”; the fact that it was non-TCP and did not originate from the lower end of the range of registered ports suggested a UDP streaming protocol, and such protocols often communicate across ephemeral ports. The analyst volunteered that if the traffic were in fact UDP it would likely not warrant further analysis. When the same analyst was presented with the outputs given in Fig. 1 through Fig. 3, they were of the opinion that the output was not terribly useful, and that it did not provide them with any guidance as to why the traffic appeared suspicious; the semantic gap in action.
- 5.
Due to the large size of the test set, it was not loaded into memory all at once, but instead was read sequentially from disk. Total time elapsed was 1023.3 s, of which profiling indicated that roughly 88% was consumed by disk I/O operations. As our interest was in offhand comparison and not production use, we did not attempt to optimize this further.
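The mimicry condition in Note 3 can be illustrated numerically. The following sketch is our own illustration, not from the chapter; the prior \( P(I) \) and the bound \( \epsilon \) are arbitrary values chosen for the example. It shows that when an adversary pins the likelihood ratio \( f_{1}(x)/f_{0}(x) \) within \( 1 \pm \epsilon \), the posterior probability of intrusion can never move appreciably away from the prior, so no alarm threshold \( x^{\star} \) is cost-effective.

```python
def posterior(prior, ratio):
    # Bayes' rule written in terms of the likelihood ratio f1(x)/f0(x):
    # P(I|x) = P(I)*f1 / (P(I)*f1 + (1-P(I))*f0)
    return prior * ratio / (prior * ratio + (1 - prior))

prior = 1e-4   # P(I): rate of malicious traffic, here attacker-controlled
eps = 0.01     # mimicry bound on |1 - f1(x)/f0(x)|

# At both extremes of the allowed likelihood ratio, the posterior stays
# within a factor of roughly (1 + eps) of the prior: alarms remain
# dominated by the base rate of normal traffic.
for ratio in (1 - eps, 1 + eps):
    p = posterior(prior, ratio)
    assert abs(p - prior) / prior < 2 * eps
```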
References
R. Sommer, V. Paxson, Outside the closed world: on using machine learning for network intrusion detection, in 2010 IEEE Symposium on Security and Privacy (SP), 2010
P. Laskov, P. Düssel, C. Schäfer, K. Rieck, Learning intrusion detection: supervised or unsupervised?, ed. by F. Roli, S. Vitulano (Springer, Berlin, 2005), pp. 50–57
M. Roesch, Snort – lightweight intrusion detection for networks, in Proceedings of the 13th USENIX Conference on System Administration, 1999, pp. 229–238
V. Paxson, Bro: a system for detecting network intruders in real time. Comput. Netw. 31(23–24), 2435–2463 (1999)
J. Long, D. Schwartz, S. Stoecklin, Distinguishing false from true alerts in Snort by data mining patterns of alerts, in Proceedings of 2006 SPIE Defense and Security Symposium, 2006
M. Sato, H. Yamaki, H. Takakura, Unknown attacks detection using feature extraction from anomaly-based IDS alerts, in 2012 IEEE/IPSJ 12th International Symposium on Applications and the Internet (SAINT), 2012
Y. Song, M.E. Locasto, A. Stavrou, A.D. Keromytis, S.J. Stolfo, On the infeasibility of modeling polymorphic shellcode – Re-thinking…, Mach. Learn., 2009
H. Debar, M. Dacier, A. Wespi, Towards a taxonomy of intrusion-detection systems. Comput. Netw. 31(8), 805–822 (1999)
O. Depren, M. Topallar, E. Anarim, M.K. Ciliz, An intelligent intrusion detection system (IDS) for anomaly and misuse detection in computer networks. Expert Syst. Appl. 29(4), 713–722 (2005)
J. Zhang, M. Zulkernine, A. Haque, Random-forests-based network intrusion detection systems. IEEE Trans. Syst. Man Cybern. C Appl. Rev. 38(5), 649–659 (2008)
N. Abe, B. Zadrozny, J. Langford, Outlier detection by active learning, in Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, 2006
S. Axelsson, The base-rate fallacy and the difficulty of intrusion detection. ACM Trans. Inf. Syst. Secur. 3(3), 186–205 (2000)
A. Koufakou, E.G. Ortiz, M. Georgiopoulos, G.C. Anagnostopoulos, K.M. Reynolds, A scalable and efficient outlier detection strategy for categorical data, in 19th IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2007), 2007
M.E. Otey, A. Ghoting, S. Parthasarathy, Fast distributed outlier detection in mixed-attribute data sets. Data Min. Knowl. Discov. 12(2–3), 203–228 (2006)
X. Song, M. Wu, C. Jermaine, S. Ranka, Conditional anomaly detection. IEEE Trans. Knowl. Data Eng. 19(5), 631–645 (2007)
C. Cortes, V. Vapnik, Support-vector networks. Mach. Learn. 20(3), 273–297 (1995)
K. Wang, S. Stolfo, One-class training for Masquerade detection, in Workshop on Data Mining for Computer Security, 2003
R. Perdisci, G. Gu, W. Lee, Using an ensemble of one-class SVM classifiers to harden payload-based anomaly detection systems, in Sixth International Conference on Data Mining (ICDM’06), 2006
S. Mukkamala, G. Janoski, A. Sung, Intrusion detection using neural networks and support vector machines, in Proceedings of the 2002 International Joint Conference on Neural Networks, 2002
J. Weston, C. Watkins, Multi-class support vector machines, Technical Report CSD-TR-98-04, Department of Computer Science, Royal Holloway, University of London, 1998
R. Chen, K. Cheng, Y. Chen, C. Hsieh, Using rough set and support vector machine for network intrusion detection system, in First Asian Conference on Intelligent Information and Database Systems, 2009
T. Shon, Y. Kim, C. Lee, J. Moon, A machine learning framework for network anomaly detection using SVM and GA, in Proceedings from the Sixth Annual IEEE SMC Information Assurance Workshop (IAW’05), 2005
K. Wang, S. Stolfo, Anomalous payload-based network intrusion detection, in Recent Advances in Intrusion Detection, 2004
B. Sangster, T. O’Connor, T. Cook, R. Fanelli, E. Dean, J. Adams, C. Morrell, G. Conti, Toward instrumenting network warfare competitions to generate labeled datasets, in USENIX Security’s Workshop on Cyber Security Experimentation and Test (CSET), 2009
F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, E. Duchesnay, Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011)
P. Biondi, Scapy: a powerful interactive packet manipulation program, 2011, http://www.secdev.org/projects/scapy/
V. Frias-Martinez, J. Sherrick, S.J. Stolfo, A.D. Keromytis, A network access control mechanism based on behavior profiles, in Annual Computer Security Applications Conference (ACSAC’09), 2009
L. Breiman, Random forests. Mach. Learn. 45(1), 5–32 (2001)
A. Criminisi, J. Shotton, E. Konukoglu, Decision Forests for Classification, Regression, Density Estimation, Manifold Learning and Semi-Supervised Learning, Microsoft Technical Report, 2011
D.S. Kim, S.M. Lee, J.S. Park, Building Lightweight Intrusion Detection System Based on Random Forest, ed. by J. Wang, Z. Yi, J.M. Zurada, B. Lu, H. Yin (Springer, Berlin, 2006), pp. 224–230
F.T. Liu, K.M. Ting, Z.-H. Zhou, Isolation-based anomaly detection. ACM Trans. Knowl. Discov. Data 6(1), 3:1–3:39 (2012)
S.C. Tan, K.M. Ting, T.F. Liu, Fast anomaly detection for streaming data, in Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence, vol. 2, 2011
H.S. Vaccaro, G.E. Liepins, Detection of anomalous computer session activity, in Proceedings of 1989 IEEE Symposium on Security and Privacy, 1989
D.E. Denning, An intrusion-detection model. IEEE Trans. Softw. Eng. 13(2), 222–232 (1987)
M. Mahoney, P. Chan, An analysis of the 1999 DARPA/Lincoln laboratory evaluation data for network anomaly detection, in Recent Advances in Intrusion Detection, 2003
J. McHugh, Testing intrusion detection systems: a critique of the 1998 and 1999 DARPA intrusion detection system evaluations as performed by Lincoln Laboratory. ACM Trans. Inf. Syst. Secur. 3(4), 262–294 (2000)
T. Lunt, A. Tamaru, F. Gilham, R. Jagannathan, C. Jalali, P. Neumann, H. Javitz, A. Valdes, T. Garvey, A real-time intrusion-detection expert system (IDES), SRI International, Computer Science Laboratory, 1992
M. Molina, I. Paredes-Oliva, W. Routly, P. Barlet-Ros, Operational experiences with anomaly detection in backbone networks. Comput. Secur. 31(3), 273–285 (2012)
K.M. Tan, R.A. Maxion, “Why 6?” Defining the operational limits of stide, an anomaly-based intrusion detector, in Proceedings of the IEEE Symposium on Security and Privacy, 2001
L. Sassaman, M.L. Patterson, S. Bratus, A. Shubina, The Halting problems of network stack insecurity, in USENIX, 2011
Z. Zhou, Ensemble Methods: Foundations and Algorithms (Chapman & Hall, 2012)
Appendix A
Random decision tree classification of KDD’99 data was performed using the Scikit-learn [25] package under Python 2.7.2 on a commodity desktop workstation. Training was performed using the file kddcup.data_10_percent_corrected, and testing was done on the file kddcup.data.corrected: 494,021 training records and 4,898,431 test records. The three fields “Count”, “diff_srv_rate”, and “dst_bytes” were extracted along with the label field in both data sets; all other data was discarded. The random decision forest was trained with the following parameters:
- Classification threshold: simple majority
- No bootstrapping used
- Features per node: 2
- Node splitting by information gain
- Minimum leaf samples: 1
- Minimum samples to split: 2
- Max tree depth: 9
- Number of trees: 11
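These parameters map directly onto scikit-learn’s RandomForestClassifier. The following is a minimal sketch of the configuration, ours rather than the chapter’s original code: it uses the modern scikit-learn API rather than the 2011-era release cited, and the synthetic data merely stands in for the three extracted KDD’99 fields.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
# Toy stand-in for the KDD'99 features: 200 "normal" and 200 "attack"
# records over 3 fields, drawn from well-separated Gaussians.
X = np.vstack([rng.normal(0, 1, (200, 3)), rng.normal(3, 1, (200, 3))])
y = np.array([0] * 200 + [1] * 200)

clf = RandomForestClassifier(
    n_estimators=11,       # number of trees: 11
    max_depth=9,           # max tree depth: 9
    max_features=2,        # features considered per node: 2
    criterion="entropy",   # node splitting by information gain
    bootstrap=False,       # no bootstrapping used
    min_samples_leaf=1,    # minimum leaf samples: 1
    min_samples_split=2,   # minimum samples to split: 2
    random_state=0,
)
clf.fit(X, y)
# Simple-majority voting across the 11 trees is the default behavior
# of predict()/score() in scikit-learn's forest implementation.
print(clf.score(X, y))
```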
Training the classifier required 4.4 s using a single processor; testing required approximately 122.8 s (see Note 5). The following confusion matrix was produced (note that we have omitted correct classifications on the diagonal for compactness, and have also omitted rows corresponding to predictions that the classifier never produced).
Rows give the predicted class; columns give the actual class.

| Predicted \ Actual | Normal | Guess_passwd | Nmap | Loadmodule | Rootkit | Warezclient | Smurf | Pod | Neptune | Spy | ftp_write | Phf | Portsweep | Teardrop | Buffer_overflow | Land | Imap | Warezmaster | Perl | Multihop | Back | Ipsweep | Satan |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Normal | 0 | 53 | 2315 | 8 | 10 | 1020 | 971 | 264 | 5537 | 2 | 8 | 4 | 9558 | 752 | 28 | 20 | 12 | 5 | 3 | 7 | 2197 | 12480 | 950 |
| Loadmodule | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Smurf | 2210 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 9672 | 0 | 0 | 0 | 1 | 119 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 95 |
| Neptune | 402 | 0 | 0 | 0 | 0 | 0 | 40 | 0 | 0 | 0 | 0 | 0 | 42 | 108 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 970 |
| Portsweep | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 3 |
| Warezmaster | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Satan | 6 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
Total false negatives: \( 36204/4898431 \approx 0.007 \)
Total false positives: \( 2622/4898431 \approx 0.0005 \)
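The aggregate rates quoted above follow directly from the error totals and the test set size; a quick arithmetic check (illustrative only):

```python
# Recompute the quoted false-negative and false-positive rates
# from the totals given in the appendix.
test_records = 4_898_431
false_negatives = 36_204
false_positives = 2_622

fn_rate = false_negatives / test_records
fp_rate = false_positives / test_records
print(round(fn_rate, 3), round(fp_rate, 4))  # → 0.007 0.0005
```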
The most common errors were misclassification of the IPsweep attack as normal traffic, and classification of flows corresponding to the Neptune attack as the Smurf attack. Random inspection of the IPsweep misclassifications suggests that each “attack” comprised several records; while many individual records were not correctly labeled, every instance that was examined by hand had at least one record in the total attack correctly classified. As the Smurf and Neptune attacks are both denial-of-service attacks, some confusion between the two is to be expected.
While these results certainly demonstrate that random decision forests are accurate and efficient classifiers, the alternative that the KDD’99 data is simply not a terribly representative data set for IDS research should not be excluded.
© 2014 Springer Science+Business Media New York
Harang, R. (2014). Bridging the Semantic Gap: Human Factors in Anomaly-Based Intrusion Detection Systems. In: Pino, R. (ed.) Network Science and Cybersecurity. Advances in Information Security, vol 55. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-7597-2_2
Print ISBN: 978-1-4614-7596-5
Online ISBN: 978-1-4614-7597-2