Abstract
Dimensionality reduction is crucial when data mining techniques are applied to intrusion detection. The host-based intrusion detection problem is usually formulated as a classification problem, and different classification algorithms are applied to high-dimensional vectors that represent system call sequences. Any such classification algorithm requires repeated computation of the similarity between pairs of vectors, and the computational overhead grows with the dimensionality of the vectors. We therefore expect that reducing the dimensionality of these vectors will help classification. However, the choice of dimensionality reduction method depends critically on how well it preserves similarity, which efficient classification requires. We show that Locally Linear Embedding (LLE) preserves similarity in this context. In this paper, we examine its applicability in two different approaches to system call data on a benchmark dataset.
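To make the cost argument concrete, the following minimal sketch turns system call traces into vectors and compares them pairwise. The abstract does not say how the vectors are built or which similarity measure is used, so the frequency-count encoding and cosine similarity here are assumptions, and the traces and all names (`traces`, `to_vector`, `cosine_similarity`) are hypothetical.

```python
import math
from collections import Counter

# Hypothetical system call traces, one per process (illustrative data only).
traces = {
    "proc_a": ["open", "read", "read", "write", "close"],
    "proc_b": ["open", "read", "write", "write", "close"],
    "proc_c": ["fork", "execve", "mmap", "mmap", "exit"],
}

# One coordinate per distinct system call: the vocabulary size is the
# dimensionality of every vector.
vocabulary = sorted({call for trace in traces.values() for call in trace})


def to_vector(trace):
    """Encode a trace as system call frequencies (assumed representation)."""
    counts = Counter(trace)
    return [counts[call] for call in vocabulary]


def cosine_similarity(u, v):
    """Cosine similarity (assumed measure); touches every coordinate once."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


vectors = {name: to_vector(trace) for name, trace in traces.items()}
sim_ab = cosine_similarity(vectors["proc_a"], vectors["proc_b"])
sim_ac = cosine_similarity(vectors["proc_a"], vectors["proc_c"])
print(len(vocabulary), round(sim_ab, 3), sim_ac)
```

Each comparison visits every coordinate, so its cost is linear in the vocabulary size. A classifier that makes many such comparisons therefore pays less per comparison after dimensionality reduction, provided the reduced vectors keep similar traces close together.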
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Dash, S.K., Rawat, S., Pujari, A.K. (2007). Use of Dimensionality Reduction for Intrusion Detection. In: McDaniel, P., Gupta, S.K. (eds) Information Systems Security. ICISS 2007. Lecture Notes in Computer Science, vol 4812. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-77086-2_27
DOI: https://doi.org/10.1007/978-3-540-77086-2_27
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-77085-5
Online ISBN: 978-3-540-77086-2
eBook Packages: Computer Science, Computer Science (R0)