Feature Evaluation for Early Stage Internet Traffic Identification

Peng, Lizhi; Zhang, Hongli; Yang, Bo; Chen, Yuehui

doi:10.1007/978-3-319-11197-1_39

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8630)

Included in the following conference series:

International Conference on Algorithms and Architectures for Parallel Processing

Abstract

Accurately identifying network traffic at its early stage is very important for practical traffic identification, and the problem has attracted considerable interest in recent years. Packet sizes and statistical features are effective features that are widely used in early stage traffic identification. However, an important question remains unaddressed: whether there are essential differences between using packet sizes and using derived features, such as statistics, in early stage traffic identification. In this paper, we evaluate the effectiveness of different kinds of early stage traffic features. We first extract the packet sizes of the first 10 packets, and the features derived from them, on 3 traffic data sets. We then compute the mutual information between each feature and the corresponding traffic type label to measure the effectiveness of the feature. Next, we run a set of crossover identification experiments with different feature sets using 7 well-known classifiers. Our experimental results show that most classifiers achieve almost the same performance with packet sizes as with derived features for early stage traffic identification, and that the combined feature set selected by mutual information achieves high identification performance.
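
The feature scoring step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the choice of derived statistics (min, max, mean, standard deviation), the fixed-width binning, the toy flows and their labels are all assumptions made for illustration. It computes the mutual information, in bits, between each discretized feature and the traffic type label.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def derived_features(packet_sizes, n=10):
    """Summary statistics over the first n packet sizes of one flow.

    The set of statistics is an assumption; the abstract only says
    "derived features such as statistics".
    """
    first = packet_sizes[:n]
    return {
        "min": min(first),
        "max": max(first),
        "mean": mean(first),
        "std": pstdev(first),
    }


def mutual_information(xs, ys):
    """I(X;Y) in bits for two equal-length sequences of discrete values."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


def binned(value, width=100):
    """Discretize a numeric feature into fixed-width bins (assumed width)."""
    return int(value // width)


# Toy flows: (packet sizes in bytes, application label). Invented data.
flows = [
    ([74, 66, 583, 66, 1514, 1514], "web"),
    ([74, 66, 590, 66, 1514, 1200], "web"),
    ([74, 66, 120, 130, 110, 125], "chat"),
    ([74, 66, 118, 140, 105, 131], "chat"),
]
labels = [label for _, label in flows]

scores = {}
# Raw features: the size of the packet at each position in the flow.
for i in range(6):
    column = [binned(sizes[i]) for sizes, _ in flows]
    scores[f"size_{i + 1}"] = mutual_information(column, labels)
# Derived features: statistics over the same packets.
for name in ("min", "max", "mean", "std"):
    column = [binned(derived_features(sizes)[name]) for sizes, _ in flows]
    scores[name] = mutual_information(column, labels)

# Rank features from most to least informative about the label.
for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.3f}")
```

A feature that takes the same value in every flow, such as the first packet size in this toy data, scores zero, while a feature whose bins separate the labels perfectly scores the full label entropy. A combined feature set could then be formed by keeping the top-ranked raw and derived features.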

Author information

Authors and Affiliations

School of Computer Science and Technology, Harbin Institute of Technology, Harbin, 150002, P.R. China
Lizhi Peng & Hongli Zhang
Provincial Key Laboratory for Network Based Intelligent Computing, University of Jinan, Jinan, 250022, P.R. China
Lizhi Peng, Bo Yang & Yuehui Chen

Editor information

Editors and Affiliations

Department of Computer Science, Illinois Institute of Technology, 60616-3793, Chicago, IL, USA
Xian-he Sun
School of Computer Science and Technology, Dalian Maritime University, 1 Linghai Road, 116026, Dalian, China
Wenyu Qu
University of Ottawa, SEECS, 8, King Edward Ave, K1N 6N5, Ottawa, ON, Canada
Ivan Stojmenovic
Deakin University, 221 Burwood Highway, 3125, Burwood, VIC, Australia
Wanlei Zhou
Dalian Maritime University, No. 1 Linghai Road, 116026, Dalian, China
Zhiyang Li & Tingting Yang
BeiHang University, XueYuan Road No. 37, HaiDian District, Beijing, China
Hua Guo
University of Bradford, BD7 1DP, Bradford, West Yorkshire, United Kingdom
Geyong Min
Computer Network Information Center, Chinese Academy of Sciences, 100190, Beijing, China
Yulei Wu
27 Shanda Nanlu, 250100, Jinan City, Shandong Province, China
Lei Liu

About this paper

Cite this paper

Peng, L., Zhang, H., Yang, B., Chen, Y. (2014). Feature Evaluation for Early Stage Internet Traffic Identification. In: Sun, Xh., et al. Algorithms and Architectures for Parallel Processing. ICA3PP 2014. Lecture Notes in Computer Science, vol 8630. Springer, Cham. https://doi.org/10.1007/978-3-319-11197-1_39

DOI: https://doi.org/10.1007/978-3-319-11197-1_39
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11196-4
Online ISBN: 978-3-319-11197-1
eBook Packages: Computer Science, Computer Science (R0)
