Abstract
Traffic monitoring and traffic characterization are essential for network planning and operation. Machine learning has been widely applied to high-speed network traffic classification, so it is critical to evaluate the viability and performance of different classifiers across application scenarios, particularly for Internet traffic. Based on our research, commonly used metrics such as accuracy cannot accurately reflect classifier performance on imbalanced data sets. Other methods, such as the receiver operating characteristic (ROC) curve or the area under the ROC curve (AUC), do not perform well in multi-objective classification. Network traffic is an imbalanced multi-class data set. To the best of our knowledge, no research has quantitatively compared classifier performance in network traffic classification. To address these issues, we propose an evaluation method aimed at imbalanced multi-class network traffic classification. The proposal is based on a multi-objective metric, the area under the ROC curve (AUC). We conduct our experiments on a traffic trace captured from a real network. The experimental results show that our method can evaluate the cases in which classes are misclassified. In particular, it is more sensitive to classes that make up a small proportion of the traffic (Minority classes) being misclassified as classes that make up a large proportion (Majority classes) than to Majority classes being misclassified as Minority classes. Hence, we recommend using the measure proposed in this paper to evaluate classifier performance with respect to the misclassification of Minority applications as Majority applications.
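The abstract's claim that accuracy can mislead on imbalanced multi-class data can be illustrated with a minimal sketch. This is not the evaluation method proposed in the paper; the class names (`WEB`, `P2P`, `VOIP`), the flow counts, and the helper functions are hypothetical and chosen only for illustration. The example assumes a degenerate classifier that labels every flow as the Majority class and compares overall accuracy with per-class recall.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of flows whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_recall(y_true, y_pred):
    """Recall of each class: correctly labeled flows / flows of that class."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


# Hypothetical imbalanced trace: one Majority class, two Minority classes.
y_true = ["WEB"] * 900 + ["P2P"] * 80 + ["VOIP"] * 20

# Assumed degenerate classifier: every flow is labeled as the Majority class.
y_pred = ["WEB"] * len(y_true)

acc = accuracy(y_true, y_pred)
recall = per_class_recall(y_true, y_pred)
macro_recall = sum(recall.values()) / len(recall)

print(f"accuracy     = {acc:.3f}")
print(f"recall       = {recall}")
print(f"macro recall = {macro_recall:.3f}")
```

In this toy setting the accuracy is 0.9 even though no Minority flow is classified correctly, while the per-class view exposes the failure. A multi-class AUC-based measure is one way to capture this kind of error in a single score.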
Acknowledgments
This work was supported in part by the National Natural Science Foundation of China (61072061), the EU FP7 IRSES MobileCloud Project (Grant No. 612212), and the 111 Project of China (B08004).
Cite this article
Yang, J., Wang, YX., Qiao, YY. et al. On Evaluating Multi-class Network Traffic Classifiers Based on AUC. Wireless Pers Commun 83, 1731–1750 (2015). https://doi.org/10.1007/s11277-015-2473-4
DOI: https://doi.org/10.1007/s11277-015-2473-4