Intrusion Detection Using Disagreement-Based Semi-supervised Learning: Detection Enhancement and False Alarm Reduction

  • Conference paper
Cyberspace Safety and Security (CSS 2012)

Part of the book series: Lecture Notes in Computer Science (LNSC, volume 7672)

Abstract

With the development of intrusion detection systems (IDSs), a number of machine learning approaches have been applied to intrusion detection. A traditional supervised learning algorithm requires training examples with ground-truth labels to be given in advance. In real applications, however, labeled examples are scarce whereas unlabeled data is widely available, because labeling data requires a large amount of human effort and is thus very expensive. To mitigate this issue, several semi-supervised learning algorithms, which aim to label data automatically without human intervention, have been proposed to exploit unlabeled data in improving the performance of IDSs. In this paper, we apply a disagreement-based semi-supervised learning algorithm to anomaly detection. Building on our previous work, we further apply this approach to constructing a false alarm filter and investigate its alarm-reduction performance in a network environment. The experimental results show that the disagreement-based scheme is very effective in detecting intrusions and reducing false alarms by automatically labeling unlabeled data, and that its performance can be further improved by combining it with active learning.
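The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea behind disagreement-based learning in the tri-training style: three learners are trained on the labeled data, and an unlabeled example is pseudo-labeled for one learner whenever the other two agree on it. Everything concrete here is an assumption made for illustration and is not taken from the paper: the nearest-centroid learners, the feature-dropping `view` helper, the toy "normal"/"attack" data, and the rule that examples without a unanimous vote are queued for a human analyst as a simple stand-in for active learning.

```python
from collections import Counter

def fit_centroids(samples):
    """Nearest-centroid model: maps each label to the mean of its feature vectors."""
    sums, counts = {}, Counter()
    for x, y in samples:
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] += 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def predict(model, x):
    """Return the label of the closest centroid (squared Euclidean distance)."""
    return min(model, key=lambda y: sum((c - v) ** 2 for c, v in zip(model[y], x)))

def view(x, i):
    """Hypothetical 'view' i: the feature tuple with feature i left out."""
    return x[:i] + x[i + 1:]

def disagreement_round(labeled, unlabeled):
    """One round with three learners; returns (models, added, contested)."""
    learners = range(3)
    train = [[(view(x, i), y) for x, y in labeled] for i in learners]
    models = [fit_centroids(t) for t in train]
    added = [[] for _ in learners]   # pseudo-labeled examples per learner
    contested = []                   # examples to hand to a human analyst
    for x in unlabeled:
        votes = [predict(models[i], view(x, i)) for i in learners]
        for i in learners:
            j, k = [t for t in learners if t != i]
            if votes[j] == votes[k]:          # the two "teachers" agree
                added[i].append((x, votes[j]))
        if len(set(votes)) > 1:               # no unanimous vote
            contested.append(x)
    # Retrain each learner on its labeled data plus its pseudo-labeled examples.
    models = [
        fit_centroids(train[i] + [(view(x, i), y) for x, y in added[i]])
        for i in learners
    ]
    return models, added, contested

# Toy data (invented): three numeric features per connection record.
labeled = [
    ((0.0, 0.1, 0.0), "normal"), ((0.2, 0.0, 0.1), "normal"),
    ((5.0, 5.1, 4.9), "attack"), ((4.8, 5.0, 5.2), "attack"),
]
unlabeled = [(0.1, 0.1, 0.2), (5.1, 4.9, 5.0), (0.0, 0.0, 9.0)]
models, added, contested = disagreement_round(labeled, unlabeled)
```

In this toy run the third unlabeled record splits the learners, so it lands in `contested`; in a combined scheme an analyst's label for such a record would be added to `labeled` before the next round. Note that a pseudo-label agreed on by two learners can still be wrong, which is why practical variants add confidence or error-rate checks before accepting it.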




Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Meng, Y., Kwok, Lf. (2012). Intrusion Detection Using Disagreement-Based Semi-supervised Learning: Detection Enhancement and False Alarm Reduction. In: Xiang, Y., Lopez, J., Kuo, CC.J., Zhou, W. (eds) Cyberspace Safety and Security. CSS 2012. Lecture Notes in Computer Science, vol 7672. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35362-8_36

  • DOI: https://doi.org/10.1007/978-3-642-35362-8_36

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-35361-1

  • Online ISBN: 978-3-642-35362-8

  • eBook Packages: Computer Science, Computer Science (R0)
