Abstract
Accurately identifying network traffic at its early stage is very important for practical traffic identification, and it has attracted considerable interest in recent years. Packet sizes and statistical features are effective and widely used in early-stage traffic identification. However, one important question has not yet been addressed: whether there are essential differences between using raw packet sizes and using derived features, such as statistics, in early-stage traffic identification. In this paper, we evaluate the effectiveness of different kinds of early-stage traffic features. We first extract the packet sizes of the first 10 packets, together with their derived features, from 3 traffic data sets. We then compute the mutual information between each feature and the corresponding traffic type label to measure the effectiveness of that feature. Finally, we run a set of crossover identification experiments with different feature sets using 7 well-known classifiers. Our experimental results show that most classifiers achieve almost the same performance with packet sizes as with derived features for early-stage traffic identification, and that the combined feature set selected by mutual information achieves high identification performance.
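The mutual-information step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `mutual_information` and `rank_features`, the toy flows, and the labels are all hypothetical, and the estimator is the plain plug-in estimate for discrete values, I(X;Y) = sum over (x, y) of p(x,y) log2( p(x,y) / (p(x) p(y)) ).

```python
from collections import Counter
from math import log2


def mutual_information(feature, labels):
    """Plug-in estimate of I(X; Y) in bits from paired discrete samples."""
    n = len(feature)
    if n == 0 or n != len(labels):
        raise ValueError("feature and labels must be non-empty and equally long")
    joint = Counter(zip(feature, labels))  # counts of each (x, y) pair
    px = Counter(feature)                  # counts of each feature value x
    py = Counter(labels)                   # counts of each label y
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi


def rank_features(flows, labels):
    """Score every feature column and sort by mutual information, highest first.

    flows is a list of equal-length feature vectors, one per flow
    (for example, the sizes of the first few packets).
    """
    n_features = len(flows[0])
    scores = [
        (j, mutual_information([flow[j] for flow in flows], labels))
        for j in range(n_features)
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical toy data: two features per flow, two traffic types.
toy_flows = [[100, 5], [100, 7], [200, 5], [200, 7]]
toy_labels = ["web", "web", "p2p", "p2p"]
ranking = rank_features(toy_flows, toy_labels)
```

In this toy data the first feature determines the label completely and the second is independent of it, so the first feature ranks higher. Continuous derived features (such as a mean packet size) would need to be discretized into bins before being passed to this estimator; the choice of binning is an assumption left to the reader.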
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Peng, L., Zhang, H., Yang, B., Chen, Y. (2014). Feature Evaluation for Early Stage Internet Traffic Identification. In: Sun, Xh., et al. Algorithms and Architectures for Parallel Processing. ICA3PP 2014. Lecture Notes in Computer Science, vol 8630. Springer, Cham. https://doi.org/10.1007/978-3-319-11197-1_39
DOI: https://doi.org/10.1007/978-3-319-11197-1_39
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11196-4
Online ISBN: 978-3-319-11197-1
eBook Packages: Computer Science, Computer Science (R0)