ABSTRACT
Machine learning (ML) algorithms have been shown to be effective in classifying today's dynamic Internet traffic. Using additional features and more sophisticated ML techniques can improve accuracy and extend classification to a broad range of application classes, but realizing such classifiers at high data rates is challenging. In this paper, we propose two architectures that realize a complete online traffic classifier using flow-level features. First, we develop a traffic classifier based on the C4.5 decision tree algorithm and the Entropy-MDL discretization algorithm. It achieves an accuracy of 97.92% when classifying a traffic trace consisting of eight application classes. Next, we accelerate our classifier using two architectures on FPGA. One architecture stores the classifier in on-chip distributed RAM and is designed to sustain high throughput. The other stores the classifier in block RAM and is designed for a small hardware footprint, and thus low hardware cost. Experimental results show that our high-throughput architecture can sustain a throughput of 550 Gbps, assuming a 40-byte packet size. Our low-cost architecture demonstrates 22% better resource efficiency than the high-throughput design. It can be easily replicated to achieve 449 Gbps while supporting 160 input traffic streams concurrently. Both architectures are parameterizable and programmable to support any binary-tree-based traffic classifier. We also develop a tool that allows users to easily map a binary-tree-based classifier to hardware: it takes a classifier as input and automatically generates the Verilog code for the corresponding hardware architecture.
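To make the idea of a binary-tree classifier held in memory more concrete, the following Python sketch shows one way such a tree could be laid out as a flat node table and traversed using discretized flow features. It is only an illustration under stated assumptions, not the architecture described in the paper: the `Node` layout, the `TREE` contents, the feature meanings, and the class names are all hypothetical. The helper `packets_per_second` simply converts a line rate into a packet rate for a fixed packet size, using the 40-byte assumption stated in the abstract.

```python
from typing import List, NamedTuple, Sequence


class Node(NamedTuple):
    """One entry of a hypothetical node table (one memory word per tree node)."""
    feature: int    # index of the discretized flow feature to test (-1 for a leaf)
    threshold: int  # cut point: take the left child if value <= threshold
    left: int       # table index of the left child, or -1 for a leaf
    right: int      # table index of the right child, or -1 for a leaf
    label: str      # application class, meaningful only for a leaf


# Hypothetical tree with three leaves. The features and class names are
# assumptions for illustration; they are not taken from the paper.
# Feature 0: binned mean packet size, feature 1: binned port number.
TREE: List[Node] = [
    Node(0, 3, 1, 2, ""),         # root: test feature 0
    Node(-1, 0, -1, -1, "chat"),  # leaf
    Node(1, 5, 3, 4, ""),         # internal node: test feature 1
    Node(-1, 0, -1, -1, "web"),   # leaf
    Node(-1, 0, -1, -1, "p2p"),   # leaf
]


def classify(tree: Sequence[Node], features: Sequence[int]) -> str:
    """Walk the node table from the root until a leaf is reached."""
    idx = 0
    while tree[idx].left != -1:
        node = tree[idx]
        if features[node.feature] <= node.threshold:
            idx = node.left
        else:
            idx = node.right
    return tree[idx].label


def packets_per_second(gbps: float, packet_bytes: int = 40) -> float:
    """Convert a throughput in Gbps to packets per second for a fixed packet size."""
    return gbps * 1e9 / (packet_bytes * 8)


if __name__ == "__main__":
    print(classify(TREE, [4, 6]))
    print(packets_per_second(550))
```

Each loop iteration corresponds to one table lookup, so the number of lookups per classification is bounded by the tree depth. With 40-byte packets, a rate of 550 Gbps corresponds to 550e9 / 320 packets per second, which is about 1.72 billion packets per second.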
Index Terms
- High throughput and programmable online traffic classifier on FPGA