Toward an efficient fuzziness based instance selection methodology for intrusion detection system

Ashfaq, Rana Aamir Raza; He, Yu-lin; Chen, De-gang

doi:10.1007/s13042-016-0557-4

Original Article
Published: 27 June 2016

Volume 8, pages 1767–1776, (2017)

International Journal of Machine Learning and Cybernetics

Rana Aamir Raza Ashfaq¹,²,
Yu-lin He¹ &
De-gang Chen³

Abstract

Building a high-quality classifier is one of the key problems in machine learning (ML) and pattern recognition. Many ML algorithms incur a high computational cost on large-scale data sets. This paper proposes a fuzziness-based instance selection technique for large data sets that increases the efficiency of supervised learning algorithms and addresses shortcomings in designing an effective intrusion detection system (IDS). The proposed methodology relies on a new kind of single-layer feed-forward neural network (SLFN) called the random weight neural network (RWNN). In the first stage, RWNN produces a membership vector for every training instance, from which the fuzziness of that instance is computed. In the second stage, the training instances (along with their fuzziness values) are grouped separately according to their actual class labels. The instances with low fuzziness values in each group are then extracted and used to build a reduced data set. The instances output by the proposed method serve as input for ML classifiers, which reduces the learning time and also increases the learning capability. The results indicate that the boundaries between class labels can be learned easily from the reduced data set. The most notable finding of this study is a considerable increase in accuracy on unseen examples compared with another instance selection method, IB2. The proposed method provides better generalization and fast learning capability. The rationale of the proposed methodology is explained theoretically, and experiments on well-known ID data sets support its usefulness.
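The three stages described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not state the fuzziness formula or how network outputs become membership values, so the sketch assumes an entropy-style fuzziness measure and a clip-and-normalize step. All function and parameter names (train_rwnn, membership, fuzziness, select_low_fuzziness, keep_fraction, n_hidden) are hypothetical.

```python
import numpy as np


def train_rwnn(X, Y, n_hidden=20, seed=0):
    """Fit a random weight network: random hidden layer, least-squares output layer."""
    rng = np.random.default_rng(seed)
    W = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))  # fixed random input weights
    b = rng.uniform(-1.0, 1.0, size=n_hidden)                # fixed random biases
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))                   # sigmoid hidden activations
    beta = np.linalg.pinv(H) @ Y                             # output weights via pseudo-inverse
    return W, b, beta


def membership(X, W, b, beta):
    """Turn raw outputs into membership vectors (assumption: clip, then row-normalize)."""
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    out = np.clip(H @ beta, 1e-12, 1.0)
    return out / out.sum(axis=1, keepdims=True)


def fuzziness(mu):
    """Assumed entropy-style fuzziness, averaged over classes; near 0 for crisp vectors."""
    mu = np.clip(mu, 1e-12, 1.0 - 1e-12)
    return -np.mean(mu * np.log(mu) + (1.0 - mu) * np.log(1.0 - mu), axis=1)


def select_low_fuzziness(X, y, keep_fraction=0.5, n_hidden=20, seed=0):
    """Return sorted indices of the lowest-fuzziness instances within each class."""
    classes = np.unique(y)
    Y = (y[:, None] == classes[None, :]).astype(float)  # one-hot targets
    W, b, beta = train_rwnn(X, Y, n_hidden, seed)
    f = fuzziness(membership(X, W, b, beta))            # stage 1: fuzziness per instance
    kept = []
    for c in classes:                                   # stage 2: group by actual label
        idx = np.flatnonzero(y == c)
        n_keep = max(1, int(np.ceil(keep_fraction * idx.size)))
        kept.append(idx[np.argsort(f[idx])[:n_keep]])   # stage 3: keep low fuzziness
    return np.sort(np.concatenate(kept))
```

The returned indices define the reduced data set, for example `X[idx], y[idx]`, which would then be passed to a downstream classifier. Selecting within each class group, rather than globally, keeps every class represented in the reduced set; the fraction retained per class is a free choice in this sketch.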

Acknowledgments

This research is supported by China Postdoctoral Science Foundations (2015M572361 and 2016T90799), Basic Research Project of Knowledge Innovation Program in Shenzhen (JCYJ20150324140036825), and National Natural Science Foundations of China (61503252 and 71371063).

Author information

Authors and Affiliations

College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, 518060, Guangdong, China
Rana Aamir Raza Ashfaq & Yu-lin He
Department of Computer Science, Bahauddin Zakariya University, Multan, Pakistan
Rana Aamir Raza Ashfaq
Department of Mathematics and Physics, North China Electric Power University, Beijing, 102206, China
De-gang Chen

Corresponding author

Correspondence to Yu-lin He.

About this article

Cite this article

Ashfaq, R.A.R., He, Yl. & Chen, Dg. Toward an efficient fuzziness based instance selection methodology for intrusion detection system. Int. J. Mach. Learn. & Cyber. 8, 1767–1776 (2017). https://doi.org/10.1007/s13042-016-0557-4

Received: 03 May 2016
Accepted: 02 June 2016
Published: 27 June 2016
Issue Date: December 2017
DOI: https://doi.org/10.1007/s13042-016-0557-4
