Reconstructing Classification to Enhance Machine-Learning Based Network Intrusion Detection by Embracing Ambiguity

  • Conference paper
Silicon Valley Cybersecurity Conference (SVCC 2020)

Abstract

Network intrusion detection systems (IDS) have efficiently identified the profiles of normal network activities, extracted intrusion patterns, and constructed generalized models to evaluate known and unknown attacks using a wide range of machine learning approaches. Despite the effectiveness of machine learning-based IDS, reducing the high rate of false alarms caused by data misclassification remains challenging. In this paper, we propose a new classification method that uses multiple decision mechanisms to identify misclassified data and then classify the data into three classes: malicious, benign, and ambiguous. The ambiguous dataset contains the majority of the misclassified data and is therefore the most informative for improving the model and anomaly detection, because it holds the samples that the model classifies with low confidence. We evaluate our approach on recent real-world network traffic data, the Kyoto2006+ dataset, and show that the ambiguous dataset contains 77.2% of the previously misclassified data. Re-evaluating the ambiguous dataset effectively reduces the false prediction rate with minimal overhead and improves accuracy by 15%.
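
The abstract does not specify the decision mechanisms, so the following is only a minimal sketch of the three-way split under stated assumptions, not the authors' method. It assumes each sample has been scored by several independent models, that each score is the probability the sample is malicious, and that a sample is confidently labeled only when all models agree and their mean score is decisive. The function name `triage`, the thresholds `low` and `high`, and the sample scores are all hypothetical.

```python
from statistics import mean


def triage(scores, low=0.3, high=0.7):
    """Assign one sample to 'malicious', 'benign', or 'ambiguous'.

    scores: per-model probabilities that the sample is malicious
            (hypothetical inputs; any set of classifiers could supply them).
    low, high: assumed confidence thresholds on the mean score.
    """
    # Each model votes "malicious" when its score reaches 0.5.
    votes = [s >= 0.5 for s in scores]
    unanimous = all(votes) or not any(votes)
    avg = mean(scores)

    # Confident labels require both agreement and a decisive mean score.
    if unanimous and avg >= high:
        return "malicious"
    if unanimous and avg <= low:
        return "benign"
    # Disagreement or a lukewarm mean score sends the sample to re-evaluation.
    return "ambiguous"


# Made-up scores from three models for three flows.
samples = {
    "flow_a": [0.95, 0.90, 0.88],
    "flow_b": [0.05, 0.10, 0.02],
    "flow_c": [0.80, 0.35, 0.60],
}
labels = {name: triage(s) for name, s in samples.items()}
```

In this sketch only the samples labeled `ambiguous` would be passed to a second, more expensive evaluation step, which is one way to read the abstract's claim that re-evaluation adds minimal overhead.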

This work is supported by NSF Award #1723663.

Acknowledgment

This research was supported in part by Colorado State Bill 18-086.

Author information

Corresponding author

Correspondence to Younghee Park.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Song, C., Fan, W., Chang, SY., Park, Y. (2021). Reconstructing Classification to Enhance Machine-Learning Based Network Intrusion Detection by Embracing Ambiguity. In: Park, Y., Jadav, D., Austin, T. (eds) Silicon Valley Cybersecurity Conference. SVCC 2020. Communications in Computer and Information Science, vol 1383. Springer, Cham. https://doi.org/10.1007/978-3-030-72725-3_13

  • DOI: https://doi.org/10.1007/978-3-030-72725-3_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-72724-6

  • Online ISBN: 978-3-030-72725-3

  • eBook Packages: Computer Science, Computer Science (R0)
