Abstract
Tor is open-source communication software that enables anonymity on the Internet. Because Tor hides its users’ identities, it is popular with criminals, who use it to keep their online activities secret from law enforcement authorities. Tor uses layers of encryption to hide its users’ data on the Web. However, most encryption techniques implemented to date do not provide full anonymity: classification algorithms based on machine learning and deep learning can extract information about users from network traffic. In this paper, we show that a temporal analysis of Tor network traffic flowing between the user node and the guard node makes it possible to classify that traffic into application types such as browsing, chat, email, P2P, FTP, audio, video, VoIP, and file transfer. We apply several standard machine learning and deep learning algorithms to categorize traffic by application and achieve an accuracy of 95.75% with Random Forest, which outperforms the best previously reported result on the ISCXTor2016 dataset.
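As a rough illustration of what a temporal analysis of a flow can involve, the sketch below summarises packet arrival times as a handful of time-based features that a classifier such as Random Forest could consume. The function name, the feature names, and the sample timestamps are assumptions made for this example; they are not the paper's feature set or the ISCXTor2016 schema.

```python
from statistics import mean, pstdev


def flow_time_features(timestamps):
    """Summarise one flow's packet arrival times (seconds) as time-based features.

    Hypothetical helper: the feature names are illustrative only.
    """
    ts = sorted(timestamps)
    if len(ts) < 2:
        raise ValueError("need at least two packets to compute inter-arrival times")

    # Inter-arrival times: gaps between consecutive packets in the flow.
    iats = [later - earlier for earlier, later in zip(ts, ts[1:])]
    duration = ts[-1] - ts[0]

    return {
        "duration": duration,
        "iat_mean": mean(iats),
        "iat_std": pstdev(iats),  # population standard deviation of the gaps
        "iat_min": min(iats),
        "iat_max": max(iats),
        "packets_per_second": len(ts) / duration if duration > 0 else 0.0,
    }


# Made-up flows: sparse, evenly spaced packets versus a denser stream.
chat_like = flow_time_features([0.0, 2.0, 4.0, 6.0])
video_like = flow_time_features([0.0, 0.25, 0.5, 0.75, 1.0])
```

Feature dictionaries of this kind, one per flow and labelled by application type, would form the rows of a training table. Because the features rely only on packet timing, they can be computed without decrypting any payload.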
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Kumari, M., Ghosh, M., Baliyan, N. (2023). Temporal Analysis of Privacy Enhancing Technology Traffic Using Deep Learning. In: Arief, B., Monreale, A., Sirivianos, M., Li, S. (eds) Security and Privacy in Social Networks and Big Data. SocialSec 2023. Lecture Notes in Computer Science, vol 14097. Springer, Singapore. https://doi.org/10.1007/978-981-99-5177-2_14
DOI: https://doi.org/10.1007/978-981-99-5177-2_14
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-5176-5
Online ISBN: 978-981-99-5177-2
eBook Packages: Computer Science, Computer Science (R0)