Intrusion Detection Using Disagreement-Based Semi-supervised Learning: Detection Enhancement and False Alarm Reduction

Meng, Yuxin; Kwok, Lam-for

doi:10.1007/978-3-642-35362-8_36

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 7672)

Included in the following conference series:

International Symposium on Cyberspace Safety and Security

Abstract

With the development of intrusion detection systems (IDSs), a number of machine learning approaches have been applied to intrusion detection. A traditional supervised learning algorithm requires training examples with ground-truth labels to be given in advance. In real applications, however, labeled examples are scarce while unlabeled data is plentiful, because labeling requires considerable human effort and is therefore expensive. To mitigate this issue, several semi-supervised learning algorithms, which aim to label data automatically without human intervention, have been proposed to exploit unlabeled data in improving the performance of IDSs. In this paper, we apply a disagreement-based semi-supervised learning algorithm to anomaly detection. Building on our previous work, we further use this approach to construct a false alarm filter and investigate its alarm-reduction performance in a network environment. The experimental results show that the disagreement-based scheme is effective in detecting intrusions and reducing false alarms by automatically labeling unlabeled data, and that its performance can be improved further by combining it with active learning.
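To make the idea of disagreement-based labeling concrete, the following is a minimal sketch of one tri-training-style round. It is not the authors' implementation, whose details are not available in this preview. The nearest-centroid classifiers, the three feature "views", the toy records, and all function names are assumptions chosen for illustration, and the error-rate checks that a full tri-training procedure would apply before accepting pseudo-labels are omitted. Each classifier is retrained on the unlabeled records for which the other two classifiers agree; records on which the classifiers disagree are returned as candidates that an active-learning step could pass to a human analyst.

```python
from statistics import mean

def fit_centroids(rows, labels, feats):
    """Return {label: centroid}, using only the feature indices in `feats`."""
    cents = {}
    for lab in set(labels):
        pts = [r for r, l in zip(rows, labels) if l == lab]
        cents[lab] = [mean(p[f] for p in pts) for f in feats]
    return cents

def predict(cents, row, feats):
    """Label of the nearest centroid (squared Euclidean distance)."""
    def dist(c):
        return sum((row[f] - v) ** 2 for f, v in zip(feats, c))
    return min(sorted(cents), key=lambda lab: dist(cents[lab]))

def tri_train_round(labeled, unlabeled, views):
    """One simplified round: each classifier is retrained with the records
    that the other two classifiers label identically."""
    rows = [r for r, _ in labeled]
    labs = [l for _, l in labeled]
    models = [fit_centroids(rows, labs, v) for v in views]
    votes = [[predict(m, u, v) for m, v in zip(models, views)]
             for u in unlabeled]
    retrained = []
    for i, v in enumerate(views):
        j, k = [x for x in range(3) if x != i]
        extra = [(u, vt[j]) for u, vt in zip(unlabeled, votes)
                 if vt[j] == vt[k]]
        retrained.append(fit_centroids(rows + [u for u, _ in extra],
                                       labs + [l for _, l in extra], v))
    # Records without a unanimous vote: candidates for a human query.
    disputed = [u for u, vt in zip(unlabeled, votes) if len(set(vt)) > 1]
    return retrained, disputed

def majority(models, views, row):
    """Final decision by majority vote of the three classifiers."""
    votes = [predict(m, row, v) for m, v in zip(models, views)]
    return max(sorted(set(votes)), key=votes.count)

# Hypothetical toy records with three numeric features each.
labeled = [((1.0, 1.0, 1.0), "normal"), ((2.0, 1.0, 2.0), "normal"),
           ((8.0, 9.0, 8.0), "attack"), ((9.0, 8.0, 9.0), "attack")]
unlabeled = [(1.5, 2.0, 1.0), (8.0, 8.0, 9.0), (9.0, 9.0, 0.0)]
views = [(0, 1), (1, 2), (0, 2)]  # feature pairs, one per classifier

models, disputed = tri_train_round(labeled, unlabeled, views)
print(disputed)
print(majority(models, views, (7.0, 8.0, 9.0)))
```

In this toy data the third unlabeled record splits the vote, so it is both pseudo-labeled for one classifier and flagged as disputed. This shows why disagreement-based schemes can introduce label noise and why routing disputed records to an analyst can help.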

Author information

Authors and Affiliations

Department of Computer Science, City University of Hong Kong, Hong Kong, China
Yuxin Meng & Lam-for Kwok

Editor information

Editors and Affiliations

School of Information Technology, Deakin University, 221 Burwood Highway, 3125, Burwood, VIC, Australia
Yang Xiang & Wanlei Zhou
Computer Science Department, ETSI Informatica, University of Malaga, Campus de Teatinos, 29170, Malaga, Spain
Javier Lopez
Ming Hsieh Department of Electrical Engineering, University of Southern California, 3740 McClintock Ave., 90089-2564, Los Angeles, CA, USA
C.-C. Jay Kuo

About this paper

Cite this paper

Meng, Y., Kwok, Lf. (2012). Intrusion Detection Using Disagreement-Based Semi-supervised Learning: Detection Enhancement and False Alarm Reduction. In: Xiang, Y., Lopez, J., Kuo, CC.J., Zhou, W. (eds) Cyberspace Safety and Security. CSS 2012. Lecture Notes in Computer Science, vol 7672. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35362-8_36

DOI: https://doi.org/10.1007/978-3-642-35362-8_36
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35361-1
Online ISBN: 978-3-642-35362-8
eBook Packages: Computer Science, Computer Science (R0)