Abstract
Distinguishing among the growing number of network applications is a challenging network management task that has been studied extensively for many years. Unfortunately, previous work on HTTP traffic classification relies heavily on coarse-grained prior knowledge and is therefore limited in detecting newly emerging applications and evolving network behaviors. In this paper, we propose HSLF, a hierarchical system that employs application fingerprints to classify HTTP traffic. Specifically, we employ a locality-sensitive hashing algorithm to obtain the importance of each field in the HTTP header, from which a rational weight allocation scheme and a fingerprint of each HTTP session are generated. Then, similarities between the fingerprints of each application are calculated to classify unknown HTTP traffic. On a real-world dataset, HSLF achieves an accuracy of 96.6%, which outperforms classic machine learning methods and state-of-the-art models.
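To make the general idea concrete, the following is a minimal sketch of a weighted SimHash-style fingerprint over HTTP header fields, followed by nearest-fingerprint classification. It is an illustration under stated assumptions, not the HSLF implementation: the abstract does not specify the hash function, the fingerprint width, or the weight allocation scheme, so `token_hash`, the 64-bit width, the `weights` mapping, and the application names used here are all hypothetical.

```python
import hashlib
from typing import Dict, List

BITS = 64  # assumed fingerprint width; the abstract does not state one


def token_hash(token: str) -> int:
    """Map a token to a 64-bit integer (MD5 is an arbitrary, illustrative choice)."""
    digest = hashlib.md5(token.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def simhash(headers: Dict[str, str], weights: Dict[str, float]) -> int:
    """Weighted SimHash of one HTTP session's header fields.

    Each "field:value" token votes +w or -w on every bit position,
    where w is the (hypothetical) importance weight of that field.
    Fields missing from `weights` default to a weight of 1.0.
    """
    acc = [0.0] * BITS
    for field, value in headers.items():
        name = field.lower()
        w = weights.get(name, 1.0)
        h = token_hash(f"{name}:{value}")
        for bit in range(BITS):
            acc[bit] += w if (h >> bit) & 1 else -w
    fingerprint = 0
    for bit in range(BITS):
        if acc[bit] > 0:
            fingerprint |= 1 << bit
    return fingerprint


def similarity(a: int, b: int) -> float:
    """Fraction of matching bits, i.e. 1 minus the normalized Hamming distance."""
    return 1.0 - bin(a ^ b).count("1") / BITS


def classify(fingerprint: int, known: Dict[str, List[int]]) -> str:
    """Return the application whose stored fingerprints are closest to the query."""
    return max(
        known,
        key=lambda app: max(similarity(fingerprint, k) for k in known[app]),
    )
```

In use, one would build `known` from labeled sessions, for example `known = {"app_a": [simhash(h, weights) for h in sessions_a]}`, and then call `classify(simhash(new_headers, weights), known)`. Taking the best match per application is one simple design choice; a similarity threshold for rejecting unknown traffic would be a natural extension, but is likewise an assumption rather than something the abstract describes.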
Acknowledgment
This work was supported by the National Key R&D Program of China (No. 2018YFC0806900 and No. 2018YFB0805004), the Beijing Municipal Science & Technology Commission (Project No. Z191100007119009), and NSFC (No. 61902397, No. U2003111, and No. 61871378).
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Tang, Z., Wang, Q., Li, W., Bao, H., Liu, F., Wang, W. (2021). HSLF: HTTP Header Sequence Based LSH Fingerprints for Application Traffic Classification. In: Paszynski, M., Kranzlmüller, D., Krzhizhanovskaya, V.V., Dongarra, J.J., Sloot, P.M.A. (eds) Computational Science – ICCS 2021. ICCS 2021. Lecture Notes in Computer Science, vol 12742. Springer, Cham. https://doi.org/10.1007/978-3-030-77961-0_5
DOI: https://doi.org/10.1007/978-3-030-77961-0_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-77960-3
Online ISBN: 978-3-030-77961-0
eBook Packages: Computer Science, Computer Science (R0)