Abstract
With the development of intrusion detection systems (IDSs), a number of machine learning approaches have been applied to intrusion detection. A traditional supervised learning algorithm requires training examples with ground-truth labels to be given in advance. In real applications, however, labeled examples are scarce whereas unlabeled data is widely available, because labeling data requires a large amount of human effort and is thus very expensive. To mitigate this issue, several semi-supervised learning algorithms, which aim to label data automatically without human intervention, have been proposed to exploit unlabeled data in improving the performance of IDSs. In this paper, we apply a disagreement-based semi-supervised learning algorithm to anomaly detection. Building on our previous work, we further apply this approach to constructing a false alarm filter and investigate its alarm-reduction performance in a network environment. The experimental results show that the disagreement-based scheme is very effective in detecting intrusions and reducing false alarms by automatically labeling unlabeled data, and that its performance can be further improved by combining it with active learning.
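To make the idea of disagreement-based labeling concrete, the following is a minimal co-training-style sketch in Python. It is not the authors' implementation: the abstract does not specify the learners or features used, so the nearest-centroid learners, the two one-dimensional feature views, and all function names here are assumptions chosen for illustration. Two learners, each seeing a different view of an event, take turns labeling the unlabeled example they are most confident about and handing it to the other learner as a new training example.

```python
from statistics import mean

def fit_centroids(samples):
    """samples: list of (value, label). Returns {label: centroid} for one 1-D view."""
    by_label = {}
    for value, label in samples:
        by_label.setdefault(label, []).append(value)
    return {label: mean(values) for label, values in by_label.items()}

def predict(centroids, value):
    """Return (label, confidence) for one view.

    Confidence is a scale-free margin in [0, 1] between the nearest and
    second-nearest centroid. Assumes at least two classes are present.
    """
    distances = sorted((abs(value - c), label) for label, c in centroids.items())
    (best_dist, best_label), (second_dist, _) = distances[0], distances[1]
    total = best_dist + second_dist
    confidence = (second_dist - best_dist) / total if total > 0 else 0.0
    return best_label, confidence

def co_train(labeled, unlabeled, rounds=10):
    """labeled: list of ((view_a, view_b), label); unlabeled: list of (view_a, view_b).

    Each learner keeps its own training set. In every round each learner
    (the teacher) labels its most confident unlabeled example and adds it
    to the other learner's (the student's) training set.
    """
    train = [
        [(x[0], y) for x, y in labeled],  # training set of the view-0 learner
        [(x[1], y) for x, y in labeled],  # training set of the view-1 learner
    ]
    pool = list(unlabeled)
    for _ in range(rounds):
        for teacher in (0, 1):
            if not pool:
                break
            student = 1 - teacher
            centroids = fit_centroids(train[teacher])
            scored = [(predict(centroids, x[teacher]), x) for x in pool]
            (label, _), chosen = max(scored, key=lambda item: item[0][1])
            pool.remove(chosen)
            train[student].append((chosen[student], label))
    return fit_centroids(train[0]), fit_centroids(train[1])

def classify(models, x):
    """Return the label from whichever view is more confident about x."""
    predictions = [predict(models[view], x[view]) for view in (0, 1)]
    return max(predictions, key=lambda p: p[1])[0]

# Hypothetical features: (connections per second, fraction of failed logins).
labeled = [((1.0, 0.1), "normal"), ((9.0, 0.9), "attack")]
unlabeled = [(2.0, 0.2), (8.0, 0.8), (1.5, 0.15), (8.5, 0.7)]
models = co_train(labeled, unlabeled)
```

Calling `classify(models, event)` on a new two-view event then returns the label from the more confident view. An active learning step, as mentioned in the abstract, could be approximated in this sketch by sending the examples with the lowest confidence to a human analyst instead of labeling them automatically; that extension is likewise an assumption and is not shown here.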
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Meng, Y., Kwok, Lf. (2012). Intrusion Detection Using Disagreement-Based Semi-supervised Learning: Detection Enhancement and False Alarm Reduction. In: Xiang, Y., Lopez, J., Kuo, CC.J., Zhou, W. (eds) Cyberspace Safety and Security. CSS 2012. Lecture Notes in Computer Science, vol 7672. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35362-8_36
DOI: https://doi.org/10.1007/978-3-642-35362-8_36
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35361-1
Online ISBN: 978-3-642-35362-8
eBook Packages: Computer Science, Computer Science (R0)