Bi-LSTM: Finding Network Anomaly Based on Feature Grouping Clustering

Authors:
Mengbo Xiong, Capital Normal University, China
Huizhen Ma, Zhengzhou Jiean Hi-Tech Co. Ltd, China
Zhou Fang, State Grid Zhejiang Electric Power Co. Ltd. Information and Communication Branch, China
Dong Wang, State Grid Electronic Commerce Co. LTD. (State Grid Xiong'an Financial Technology Group Co. LTD.), China
Qiuyun Wang, Institute of Information Engineering Chinese Academy of Sciences, China
Xuren Wang, Information Engineering College Capital Normal University, China

MLMI '20: Proceedings of the 2020 3rd International Conference on Machine Learning and Machine Intelligence, September 2020, Pages 88–94
https://doi.org/10.1145/3426826.3426843

Published: 17 December 2020

ABSTRACT

Intrusion detection is one of the key technologies for securing cyberspace. This paper proposes a detection model that combines Bi-LSTM, whose sequence-modeling capability can uncover time-series characteristics in network data, with the K-means clustering algorithm. Data collected by network sensors or audit logs has many attributes. To classify successfully at low computational cost, it is important to employ the most relevant and discriminating features. Extracting useful information from those attributes to improve the detection rate and reduce false detections is challenging. First, we group attributes according to the conditions under which they are collected or, more generally, evenly. Then we cluster the attributes of each group with K-means, which yields as many hyper-features as there are groups. On the one hand, the data reduction is significant: data volume falls by up to 85%. On the other hand, the extracted features, also called hyper-features, are more concentrated and informative than the low-level attributes. The detection rate on the high-level features is better than on the original attributes, both with the traditional C4.5 machine learning classifier and with our hybrid model. The Bi-LSTM model based on K-means reaches an intrusion detection rate as high as 99.93% and an accuracy as high as 98.84%, with a false detection rate of 0. Moreover, experiments show that the Bi-LSTM plus K-means model also works well on new attacks that appear only in the test data, which is meaningful for intrusion detection.
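
The grouping and clustering step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the "even" grouping option, assumes that each group's hyper-feature is the K-means cluster index of a record's attributes within that group, and uses a deterministic farthest-first initialization chosen only to make the example reproducible. The function names (`even_groups`, `kmeans_labels`, `hyper_features`), the toy records, the number of groups, and the value of `k` are all hypothetical; the abstract does not specify them.

```python
from typing import List, Sequence


def even_groups(n_features: int, n_groups: int) -> List[List[int]]:
    """Split feature indices into contiguous groups of near-equal size."""
    base, extra = divmod(n_features, n_groups)
    groups, start = [], 0
    for g in range(n_groups):
        size = base + (1 if g < extra else 0)
        groups.append(list(range(start, start + size)))
        start += size
    return groups


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def init_centroids(points, k):
    """Farthest-first initialization (an assumption, used for determinism)."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(far))
    return centroids


def kmeans_labels(points, k: int, iters: int = 20) -> List[int]:
    """Plain Lloyd's K-means; returns one cluster index per point."""
    centroids = init_centroids(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def hyper_features(records, n_groups: int, k: int) -> List[List[int]]:
    """Replace each attribute group with a single cluster-index feature."""
    groups = even_groups(len(records[0]), n_groups)
    columns = []
    for idx in groups:
        sub = [[row[i] for i in idx] for row in records]
        columns.append(kmeans_labels(sub, k))
    return [list(row) for row in zip(*columns)]


# Toy records with six attributes each (made-up values, not KDD Cup data).
records = [
    [0.0, 0.1, 5.0, 5.1, 9.0, 9.2],
    [0.1, 0.0, 5.2, 5.0, 9.1, 9.0],
    [8.0, 8.1, 0.2, 0.1, 1.0, 1.1],
    [8.2, 8.0, 0.1, 0.3, 1.2, 1.0],
]
reduced = hyper_features(records, n_groups=3, k=2)
reduction = 1 - len(reduced[0]) / len(records[0])
print(reduced)    # one hyper-feature per group for each record
print(reduction)  # fraction of attribute columns removed
```

In this toy run, six attributes collapse into three hyper-features, so half of the columns are removed. The resulting hyper-feature vectors are what a downstream classifier such as Bi-LSTM or C4.5 would consume under these assumptions.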

Published in

MLMI '20: Proceedings of the 2020 3rd International Conference on Machine Learning and Machine Intelligence
September 2020
138 pages
ISBN:9781450388344
DOI:10.1145/3426826

Copyright © 2020 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Bi-LSTM
Clustering
Feature Extracting
Intrusion Detection
K-means
Qualifiers
- research-article
- Research
- Refereed limited