Abstract
Network traffic text classification plays an important role in network security. Traditional machine-learning classification methods, such as supervised and semi-supervised learning algorithms, fall short in several ways: the classification mode is too simple to adapt to diverse classification requirements; the text feature selection method is simple, so classification lacks diversity and accuracy is low; and classification is slow, which makes these methods unsuitable for high-traffic, real-time environments. Multi-instance learning can describe the characteristics of a sample more accurately and comprehensively, and can thereby improve classification performance. In this paper, we combine multi-instance learning classification with principal component analysis (PCA) to select text features from the data sets. Removing the redundant and uncorrelated features from the original data yields better classification accuracy.
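The abstract does not spell out the pipeline, so the following is only a minimal sketch of the general idea of pairing PCA-based feature reduction with a multi-instance (bag-level) classifier. Everything in it is an assumption made for illustration: the synthetic numeric features, the helper names (`fit_pca`, `embed_bags`, `make_bags`), the choice of representing a bag by the mean of its projected instances, and the nearest-centroid classifier. It is not the authors' method.

```python
import numpy as np

def fit_pca(X, n_components):
    """Return (mean, components) for the top principal directions of X."""
    mean = X.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(X, mean, components):
    """Project instances onto the retained principal directions."""
    return (X - mean) @ components.T

def embed_bags(bags, mean, components):
    """Represent each bag by the mean of its PCA-projected instances
    (a simplifying assumption, not the standard 'any positive instance' rule)."""
    return np.array([project(b, mean, components).mean(axis=0) for b in bags])

def make_bags(rng, n_bags, label, n_features=20):
    """Hypothetical synthetic data: 2 informative features, 18 noise features."""
    bags = []
    for _ in range(n_bags):
        n_inst = int(rng.integers(3, 8))  # each bag holds 3 to 7 instances
        X = rng.normal(0.0, 0.1, size=(n_inst, n_features))
        X[:, :2] += 3.0 if label == 1 else -3.0
        bags.append(X)
    return bags

rng = np.random.default_rng(0)
train_bags = make_bags(rng, 30, 0) + make_bags(rng, 30, 1)
train_y = np.array([0] * 30 + [1] * 30)
test_bags = make_bags(rng, 10, 0) + make_bags(rng, 10, 1)
test_y = np.array([0] * 10 + [1] * 10)

# Fit PCA on all training instances, keeping 2 of the 20 feature dimensions.
mean, comps = fit_pca(np.vstack(train_bags), n_components=2)
Z_train = embed_bags(train_bags, mean, comps)
Z_test = embed_bags(test_bags, mean, comps)

# Nearest-centroid classification on the bag embeddings.
centroids = np.array([Z_train[train_y == c].mean(axis=0) for c in (0, 1)])
dists = np.linalg.norm(Z_test[:, None, :] - centroids[None, :, :], axis=2)
pred = dists.argmin(axis=1)
accuracy = float((pred == test_y).mean())
print(f"bag-level accuracy: {accuracy:.2f}")
```

On this toy data the two classes are well separated, so the reduced 2-dimensional representation is enough to classify the bags; real traffic text would first need a feature extraction step that the sketch leaves out.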
Copyright information
© 2020 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Wang, H., Luo, Q., Shang, Z., Li, G., Shi, X. (2020). Network Traffic Text Classification Based on Multi-instance Learning and Principal Component Analysis. In: Liang, Q., Wang, W., Liu, X., Na, Z., Jia, M., Zhang, B. (eds) Communications, Signal Processing, and Systems. CSPS 2019. Lecture Notes in Electrical Engineering, vol 571. Springer, Singapore. https://doi.org/10.1007/978-981-13-9409-6_307
DOI: https://doi.org/10.1007/978-981-13-9409-6_307
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-9408-9
Online ISBN: 978-981-13-9409-6
eBook Packages: Engineering, Engineering (R0)