Abstract
Network anomaly detection is one of the most challenging fields in cyber security. Most of the proposed techniques have high computation complexity or based on heuristic approaches. This paper proposes a novel two-tier classification models based on machine learning approaches Naïve Bayes, certainty factor voting version of KNN classifiers and also Linear Discriminant Analysis for dimension reduction. Experimental results show a desirable and promising gain in detection rate and false alarm compared with other existing models. The model also trained by two generated balance training sets using SMOTE method to evaluate the chosen similarity measure for dealing with imbalanced network anomaly data sets. The two-tier model provides low computation time due to optimal dimension reduction and feature selection, as well as good detection rate against rare and complex attack types which are so dangerous because of their close similarity to normal behaviors like User to Root and Remote to Local. All evaluation processes experimented by NSL-KDD data set.
Similar content being viewed by others
References
Bouzida, Y., & Cuppens, F. (2006). Neural networks vs. decision trees for intrusion detection. In IEEE/IST Workshop on Monitoring, Attack Detection and Mitigation (MonAM), Tuebingen (pp. 28–29).
Chan, P.K., Mahoney, M.V., & Arshad, M.H. (2005). Learning Rules and Clusters for Anomaly Detection in Network Traffic. Managing Cyber Threats: Issues, Approaches and Challenges, 5, 81–99.
Chawla, N.V., Bowyer, K.W., Hall, L.O., & Kegelmeyer, W.P. (2011). SMOTE: synthetic minority over-sampling technique, arXiv:11061813.
Dua, S., & Du, X. (2011). Data Mining and Machine Learning in Cybersecurity. USA: CRC Press.
Friedman, J.H., Bentley, J.L., & Finkel, R.A. (1977). An algorithm for finding best matches in logarithmic expected time. ACM Transactions on Mathematical Software TOMS, 3(3), 209–226.
Gu, G., Fogla, P., Dagon, D., Lee, W., & Skori, B. (2006). Measuring intrusion detection capability: An information-theoretic approach. In Proceedings of the ACM Symposium on Information, computer and communications security (pp. 90–101).
Han, J., & Kamber, M. (2006). Data mining concepts and techniques. Amsterdam; Boston; San Francisco: Elsevier; Morgan Kaufmann.
Horng, S.J., Su, M.Y., Chen, Y.H., Kao, T.W., Chen, R.J., Lai, J.L., & Perkasa, C.D. (2011). A novel intrusion detection system based on hierarchical clustering and support vector machines. Expert Systems with Applications, 38(1), 306–313.
Ibrahim, L.M., Basheer, D.T., & Mahmod, M.S. (2013). A Comparison Study for Intrusion Database (KDD99, NSL-KDD) Based on Self Organization Map (SOM) Artificial Neural Network. Journal of Engineering, Science and Technology, 8(1), 107–119.
Izenman, A.J. (2008). Modern Multivariate Statistical Techniques, (pp. 237–280). New York: Springer.
KDD Cup (1999). Data, http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html, Accessed 17 September 2014.
Kent, K., & Mell, P. (2006). Guide to Intrusion Detection and Prevention (IDP) Systems, Natl. Inst. Stand. Technol., USA.
Kim, E., & Kim, S. (2014). A Novel Anomaly Detection System Based on HFR-MLR Method. Mobile, Ubiquitous and Intelligent Computing, 274, 279–286.
Kromer, P., Platos, J., Snasel, V., & Abraham, A. (2011). Fuzzy classification by evolutionary algorithms. In IEEE International Conference on Systems, Man and Cybernetics (SMC) (pp. 313–318).
Leung, K., & Leckie, C. (2005). Unsupervised anomaly detection in network intrusion detection using clusters. In Proceedings of the Twenty-eighth Australasian conference on Computer Science, (Vol. 38 pp. 333–342).
Li, Y., Xia, J., Zhang, S., Yan, J., Ai, X., & Dai, K. (2012). An efficient intrusion detection system based on support vector machines and gradually feature removal method. Expert Systems with Applications, 39(1), 424–430.
Li, T., Zhu, S., & Ogihara, M. (2006). Using discriminant analysis for multi-class classification: an experimental investigation. Knowledge and Information Systems, 10 (4), 453–472.
Lu, H., & Xu, J. (2009). Three-Level Hybrid Intrusion Detection System. In International Conference on Information Engineering and Computer Science, ICIECS09 (pp. 1–4).
McHugh, J. (2000). Testing intrusion detection systems: a critique of the 1998 and 1999 DARPA intrusion detection system evaluations as performed by Lincoln Laboratory. ACM Transactions on Information and System Security, 3(4), 262–294.
Panda, M., Abraham, A., & Patra, M.R. (2010). Discriminative multinomial naive bayes for network intrusion detection. In Sixth International Conference on Information Assurance and Security (IAS) (pp. 5–10).
Pervez, M.S., & Md Farid, D. (2014). Feature selection and intrusion classification in NSL-KDD cup 99 dataset employing SVMs. In 8th International Conference on Software, Knowledge, Information Management and Applications (SKIMA) (pp. 1–6).
Tan, Z., Jamdagni, A., He, X., & Nanda, P. (2010). Network Intrusion Detection based on LDA for payload feature selection. In GLOBECOM Workshops (GC Wkshps) (pp. 1545–1549). Miami: IEEE.
Tavallaee, M., Bagheri, E., Lu, W., & Ghorbani, A.-A. (2009). A detailed analysis of the KDD CUP 99 data set. In Proceedings of the Second IEEE Symposium on Computational Intelligence for Security and Defense Applications.
Toosi, A.N., & Kahani, M. (2007). A new approach to intrusion detection based on an evolutionary soft computing model using neuro-fuzzy classifiers. Computer and Communications, 30(10), 2201–2212.
Xuren, W., Famei, H., & Rongsheng, X. (2006). Modeling Intrusion Detection System by Discovering Association Rule in Rough Set Theory Framework. In International Conference on Computational Intelligence for Modeling, Control and Automation, and International Conference on Intelligent Agents, Web Technologies and Internet Commerce (p. 2424).
Zhang, S. (2010). KNN-CF Approach: Incorporating Certainty Factor to kNN Classification. IEEE Intell. Inform. Bull., 11(1), 24–33.
Zhang, T., Ramakrishnan, R., & Livny, M. (1996). BIRCH: an efficient data clustering method for very large databases. In ACM SIGMOD Record, (Vol. 25 pp. 103–114).
Zhang, J., & Zulkernine, M. (2006). Anomaly based network intrusion detection with unsupervised outlier detection. In IEEE International Conference on Communications, ICC06, (Vol. 5 pp. 2388–2393).
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Pajouh, H.H., Dastghaibyfard, G. & Hashemi, S. Two-tier network anomaly detection model: a machine learning approach. J Intell Inf Syst 48, 61–74 (2017). https://doi.org/10.1007/s10844-015-0388-x
Received:
Revised:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10844-015-0388-x