Reconstructing Classification to Enhance Machine-Learning Based Network Intrusion Detection by Embracing Ambiguity

Song, Chungsik; Fan, Wenjun; Chang, Sang-Yoon; Park, Younghee

doi:10.1007/978-3-030-72725-3_13

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1383)

Included in the following conference series:

Silicon Valley Cybersecurity Conference

Abstract

Network intrusion detection systems (IDS) use a wide range of machine learning approaches to profile normal network activity, extract intrusion patterns, and build generalized models for evaluating known and unknown attacks. Despite the effectiveness of machine learning-based IDS, reducing the high rate of false alarms caused by data misclassification remains challenging. In this paper, we propose a new classification method that uses multiple decision mechanisms to identify misclassified data and then sorts the data into three classes: malicious, benign, and ambiguous. The ambiguous dataset contains the majority of the misclassified data and, because the model lacks confidence in classifying these samples, it is the most informative for improving the model and anomaly detection. We evaluate our approach on recent real-world network traffic data, the Kyoto2006+ dataset, and show that the ambiguous dataset contains 77.2% of the previously misclassified data. Re-evaluating the ambiguous dataset effectively reduces the false prediction rate with minimal overhead and improves accuracy by 15%.
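The three-way split described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the decision mechanisms, so everything here is an assumption: each model is taken to output a probability that a sample is malicious, a sample is labeled malicious or benign only when all models agree with high confidence, and anything else is labeled ambiguous. The function names, the thresholds `low` and `high`, the majority-vote baseline, and the toy data are all hypothetical and are not taken from the paper.

```python
from typing import Sequence, Tuple, List


def triage(scores: Sequence[float], low: float = 0.3, high: float = 0.7) -> str:
    """Label one sample from several models' malicious-probabilities.

    Assumed rule (not the paper's): unanimous confident agreement decides
    the label; any disagreement or low confidence yields "ambiguous".
    """
    if not scores:
        raise ValueError("need at least one model score")
    if all(s >= high for s in scores):
        return "malicious"
    if all(s <= low for s in scores):
        return "benign"
    return "ambiguous"


def ambiguous_share_of_errors(
    samples: List[Tuple[Sequence[float], bool]],
    low: float = 0.3,
    high: float = 0.7,
) -> float:
    """Fraction of baseline errors that fall into the ambiguous class.

    The baseline is a plain majority vote at a 0.5 cutoff (an assumption).
    """
    errors = 0
    errors_in_ambiguous = 0
    for scores, is_malicious in samples:
        votes = sum(s >= 0.5 for s in scores)
        baseline_pred = votes * 2 > len(scores)
        if baseline_pred != is_malicious:
            errors += 1
            if triage(scores, low, high) == "ambiguous":
                errors_in_ambiguous += 1
    return errors_in_ambiguous / errors if errors else 0.0


# Toy data: (per-model malicious probabilities, true label is malicious?)
toy_samples = [
    ([0.90, 0.80, 0.95], True),   # confident and correct
    ([0.10, 0.20, 0.05], False),  # confident and correct
    ([0.60, 0.40, 0.55], False),  # baseline error, models disagree
    ([0.20, 0.10, 0.25], True),   # baseline error, models confidently wrong
]

if __name__ == "__main__":
    for scores, _ in toy_samples:
        print(scores, "->", triage(scores))
    print("share of errors in ambiguous set:", ambiguous_share_of_errors(toy_samples))
```

In this sketch, only the samples labeled ambiguous would be passed on for re-evaluation, which is how the added cost could stay small relative to reclassifying everything. The fourth toy sample shows the limit of such a rule: an error on which all models agree confidently is not captured by the ambiguous class.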

This work is supported by NSF Award #1723663.

Acknowledgment

This research was supported in part by Colorado State Bill 18-086.

Author information

Authors and Affiliations

Computer Science Department, University of Colorado Colorado Springs, Colorado Springs, CO, 80918, USA
Wenjun Fan & Sang-Yoon Chang
Computer Engineering Department, San Jose State University, San Jose, CA, 95192, USA
Chungsik Song & Younghee Park

Corresponding author

Correspondence to Younghee Park.

Editor information

Editors and Affiliations

San Jose State University, San Jose, CA, USA
Younghee Park
IBM Almaden Research Center, San Jose, CA, USA
Divyesh Jadav
San Jose State University, San Jose, CA, USA
Thomas Austin

About this paper

Cite this paper

Song, C., Fan, W., Chang, SY., Park, Y. (2021). Reconstructing Classification to Enhance Machine-Learning Based Network Intrusion Detection by Embracing Ambiguity. In: Park, Y., Jadav, D., Austin, T. (eds) Silicon Valley Cybersecurity Conference. SVCC 2020. Communications in Computer and Information Science, vol 1383. Springer, Cham. https://doi.org/10.1007/978-3-030-72725-3_13

DOI: https://doi.org/10.1007/978-3-030-72725-3_13
Published: 02 April 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-72724-6
Online ISBN: 978-3-030-72725-3
eBook Packages: Computer Science; Computer Science (R0)