Abstract
Network intrusion detection systems (IDS) has efficiently identified the profiles of normal network activities, extracted intrusion patterns, and constructed generalized models to evaluate (un)known attacks using a wide range of machine learning approaches. In spite of the effectiveness of machine learning-based IDS, it has been still challenging to reduce high false alarms due to data misclassification. In this paper, by using multiple decision mechanisms, we propose a new classification method to identify misclassified data and then to classify them into three different classes, called a malicious, benign, and ambiguous dataset. In other words, the ambiguous dataset contains a majority of the misclassified dataset and is thus the most informative for improving the model and anomaly detection because of the lack of confidence for the data classification in the model. We evaluate our approach with the recent real-world network traffic data, Kyoto2006+ datasets, and show that the ambiguous dataset contains 77.2% of the previously misclassified data. Re-evaluating the ambiguous dataset effectively reduces the false prediction rate with minimal overhead and improves accuracy by 15%.
This work is supported by NSF Award #1723663.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
References
Lane, T., Brodley, C.E.: An application of machine learning to anomaly detection. In: Proceedings of the 20th National Information Systems Security Conference, vol. 377, pp. 366–380. Baltimore, USA (1997)
Ghosh, A.K., Wanken, J., Charron, F.: Detecting anomalous and unknown intrusions against programs. In: 14th Annual Computer Security Applications Conference: Proceedings, pp. 259–267. IEEE (1998)
Cannady, J.: Artificial neural networks for misuse detection. In: National Information Systems Security Conference, pp. 368–381 (1998)
Sinclair, C., Pierce, L., Matzner, S.: An application of machine learning to network intrusion detection. In: Computer Security Applications Conference (ACSAC 1999) Proceedings. 15th Annual, pp. 371–377. IEEE (1999)
Kumar, S., Spafford, E.H.: A software architecture to support misuse intrusion detection (1995)
Ilgun, K., Kemmerer, R.A., Porras, P.A.: State transition analysis: a rule-based intrusion detection approach. IEEE Trans. Softw. Eng. 21(3), 181–199 (1995)
Lunt, T.F., Tamaru, A., Gillham, F.: A Real-Time Intrusion-Detection Expert System (IDES). SRI International, Computer Science Laboratory (1992)
Paxson, V.: Bro: a system for detecting network intruders in real-time. Comput. Netw. 31(23), 2435–2463 (1999)
Roesch, M., et al.: Snort: lightweight intrusion detection for networks. Lisa 99(1), 229–238 (1999)
Mukkamala, S., Sung, A., Abraham, A.: Cyber security challenges: designing efficient intrusion detection systems and antivirus tools. Vemuri, V. Rao, Enhancing Computer Security with Smart Technology. (Auerbach, 2006), pp. 125–163 (2005)
Nguyen, T.T., Armitage, G.: A survey of techniques for internet traffic classification using machine learning. IEEE Commun. Surv. Tutorials 10(4), 56–76 (2008)
Garcia-Teodoro, P., Diaz-Verdejo, J., Maciá-Fernández, G., Vázquez, E.: Anomaly-based network intrusion detection: techniques, systems and challenges. Comput. Secur. 28(1), 18–28 (2009)
Wu, S.X., Banzhaf, W.: The use of computational intelligence in intrusion detection systems: a review. Appl. Soft Comput. 10(1), 1–35 (2010)
Bhuyan, M.H., Bhattacharyya, D.K., Kalita, J.K.: Network anomaly detection: methods, systems and tools. IEEE Commun. Surv. Tutorials 16(1), 303–336 (2014)
Dua, S., Du, X.: Data Mining and Machine Learning in Cybersecurity. CRC Press (2016)
Buczak, A.L., Guven, E.: A survey of data mining and machine learning methods for cyber security intrusion detection. IEEE Commun. Surv. Tutorials 18(2), 1153–1176 (2016)
Denning, D.E.: An intrusion-detection model. IEEE Trans. Softw. Eng. 2, 222–232 (1987)
Mukherjee, B., Heberlein, L.T., Levitt, K.N.: Network intrusion detection. IEEE Netw. 8(3), 26–41 (1994)
Sommer, R., Paxson, V.: Outside the closed world: on using machine learning for network intrusion detection. In: IEEE Symposium on Security and Privacy (SP), pp. 305–316. IEEE (2010)
Widmer, G., Kubat, M.: Learning in the presence of concept drift and hidden contexts. Mach. Learn. 23(1), 69–101 (1996)
Lane, T., Brodley, C.E.: Approaches to online learning and concept drift for user identification in computer security. In: KDD, pp. 259–263 (1998)
Polikar, R.: Ensemble based systems in decision making. IEEE Circuits Syst. Mag. 6(3), 21–45 (2006)
Giacinto, G., Roli, F., Didaci, L.: Fusion of multiple classifiers for intrusion detection in computer networks. Pattern Recogn. Lett. 24(12), 1795–1803 (2003)
Quinlan, J.R.: Induction of decision trees. Mach. Learn. 1(1), 81–106 (1986)
Quinlan, J.R.: C4. 5: programs for machine learning. Elsevier (2014)
Friedman, J.H.: Greedy function approximation: a gradient boosting machine. Ann. Stat. 1189–1232 (2001)
Friedman, J.H.: Stochastic gradient boosting. Comput. Stat. Data Anal. 38(4), 367–378 (2002)
Breiman, L.: Random forests. Mach. Learn. 45(1), 5–32 (2001)
Breiman, L.: Out-of-bag estimation (1996)
Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning. MIT Press, Cambridge (2016)
Bradley, A.P.: The use of the area under the roc curve in the evaluation of machine learning algorithms. Pattern Recogn. 30(7), 1145–1159 (1997)
Mitchell, T.M.: Machine learning and data mining. Commun. ACM 42(11), 30–36 (1999)
Kruegel, C., Toth, T.: Using decision trees to improve signature-based intrusion detection. In: Vigna, G., Kruegel, C., Jonsson, E. (eds.) RAID 2003. LNCS, vol. 2820, pp. 173–191. Springer, Heidelberg (2003). https://doi.org/10.1007/978-3-540-45248-5_10
Panda, M., Patra, M.R.: Network intrusion detection using Naive Bayes. Int. J. Comput. Sci. Netw. Secur. 7(12), 258–263 (2007)
Zhang, J., Zulkernine, M., Haque, A.: Random-forests-based network intrusion detection systems. IEEE Trans. Syst. Man Cybern. Part C (Appl. Rev.) 38(5), 649–659 (2008)
Song, J., Takakura, H., Okabe, Y., Eto, M., Inoue, D., Nakao, K.: Statistical analysis of honeypot data and building of kyoto 2006+ dataset for nids evaluation. In: Proceedings of the First Workshop on Building Analysis Datasets and Gathering Experience Returns for Security, pp. 29–36. ACM (2011)
Sperotto, A., Schaffrath, G., Sadre, R., Morariu, C., Pras, A., Stiller, B.: An overview of ip flow-based intrusion detection. IEEE Commun. Surv. Tutorials 12(3), 343–356 (2010)
Lee, J.-H., Lee, J.-H., Sohn, S.-G., Ryu, J.-H., Chung, T.-M.: Effective value of decision tree with kdd 99 intrusion detection datasets for intrusion detection system. In: 10th International Conference on Advanced Communication Technology, ICACT 2008, vol. 2, pp. 1170–1175. IEEE (2008)
Sahu, S., Mehtre, B.M.: Network intrusion detection system using j48 decision tree. In: 2015 International Conference on Advances in Computing, Communications and Informatics (ICACCI), pp. 2023–2026. IEEE (2015)
Amor, N.B., Benferhat, S., Elouedi, Z.: Naive bayes vs decision trees in intrusion detection systems. In: Proceedings of the 2004 ACM Symposium on Applied Computing, pp. 420–424. ACM (2004)
Sato, M., Yamaki, H., Takakura, H.: Unknown attacks detection using feature extraction from anomaly-based ids alerts. In: IEEE/IPSJ 12th International Symposium on Applications and the Internet (SAINT), pp. 273–277. IEEE (2012)
Rokach, L.: Ensemble-based classifiers. Artif. Intell. Rev. 33(1), 1–39 (2010)
Acknowledgment
This research was supported in part by Colorado State Bill 18-086.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Song, C., Fan, W., Chang, SY., Park, Y. (2021). Reconstructing Classification to Enhance Machine-Learning Based Network Intrusion Detection by Embracing Ambiguity. In: Park, Y., Jadav, D., Austin, T. (eds) Silicon Valley Cybersecurity Conference. SVCC 2020. Communications in Computer and Information Science, vol 1383. Springer, Cham. https://doi.org/10.1007/978-3-030-72725-3_13
Download citation
DOI: https://doi.org/10.1007/978-3-030-72725-3_13
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-72724-6
Online ISBN: 978-3-030-72725-3
eBook Packages: Computer ScienceComputer Science (R0)