DOI: 10.1145/3548606.3560604
research-article

Exposing the Rat in the Tunnel: Using Traffic Analysis for Tor-based Malware Detection

Published: 07 November 2022

Abstract

Tor [31] is the most widely used anonymous communication network, with millions of daily users [6]. Because Tor provides both server and client anonymity, hundreds of malware binaries found in the wild rely on it to hide their presence and hinder Command & Control (C&C) takedown operations. We believe Tor is a paramount tool enabling online freedom and privacy, and blocking it to defend against such malware is infeasible for both users and organizations. In this work, we present effective traffic analysis approaches that can accurately identify Tor-based malware communication. We collect hundreds of Tor-based malware binaries, execute them, examine more than 47,000 active encrypted malware connections, and compare them with benign browsing traffic. In addition to traditional traffic analysis features (which work at the connection level), we propose global host-level network features to capture peculiar malware communication fingerprints across host logs. Our experiments confirm that our models are able to detect "zero-day" malware connections with 0.7% FPR even when malware connections constitute less than 5% of Tor traces in the test set. Using multi-labeling approaches, we are able to accurately detect the malware behavior-based classes (grayware, ransomware, etc.). Finally, we evaluate the robustness of our models on real-world enterprise logs and show that the classifiers can identify infected hosts even with missing features.
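
The distinction the abstract draws between connection-level and host-level features, and the FPR metric it reports, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the record fields, the sample values, and the helper names `host_level_features` and `false_positive_rate` are hypothetical and are not the feature set or code used in the paper.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical connection records: (host, duration_s, bytes_sent, bytes_received).
# Field names and values are illustrative only, not taken from the paper.
connections = [
    ("host-a", 12.0, 1500, 9000),
    ("host-a", 11.5, 1400, 8800),
    ("host-b", 300.0, 52000, 4100000),
]


def host_level_features(records):
    """Aggregate per-connection records into one feature dict per host."""
    by_host = defaultdict(list)
    for host, duration, sent, received in records:
        by_host[host].append((duration, sent, received))

    features = {}
    for host, conns in by_host.items():
        features[host] = {
            "n_connections": len(conns),
            "mean_duration_s": mean(c[0] for c in conns),
            "total_bytes_sent": sum(c[1] for c in conns),
            "total_bytes_received": sum(c[2] for c in conns),
        }
    return features


def false_positive_rate(y_true, y_pred):
    """FPR = FP / (FP + TN), with 1 = malware and 0 = benign."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return fp / (fp + tn) if (fp + tn) else 0.0


features = host_level_features(connections)
# Four benign connections, one of them flagged as malware: FPR = 1 / 4.
fpr = false_positive_rate([0, 0, 0, 0, 1], [0, 1, 0, 0, 1])
```

A connection-level classifier would see each tuple in `connections` on its own, whereas the per-host dictionaries summarize behavior across all of a host's connections.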

References

[1]
2013. How to Handle Millions of New Tor Clients. https://blog.torproject.org/how-handle-millions-new-tor-clients/#comment-34624.
[2]
2016. Tor-nonTor Dataset (ISCXTor2016). https://www.unb.ca/cic/datasets/tor.html.
[3]
2017. WannaCry Ransomware Campaign: Threat Details and Risk Management. https://www.fireeye.com/blog/products-and-services/2017/05/wannacry-ransomware-campaign.html.
[4]
2021. Hybrid Analysis. https://www.hybrid-analysis.com/.
[5]
2021. Tor Hidden Services Deprecation Timeline. https://blog.torproject.org/v2-deprecation-timeline/.
[6]
2021. Tor Metrics. https://metrics.torproject.org.
[7]
2022. Ahmia - Search Tor Hidden Services. https://ahmia.fi/.
[8]
2022. Autogluon Predictors. https://auto.gluon.ai/stable/api/autogluon.predictor.html?highlight=p_value.
[9]
2022. Autogluon Tabular Models (Documentation). https://auto.gluon.ai/stable/api/autogluon.tabular.models.html?highlight=weighted%20ensemble%20l2.
[10]
2022. AutoGluon Tasks. https://auto.gluon.ai/stable/api/autogluon.task.html.
[11]
2022. BinaryRelevance: scikit-multilearn. http://scikit.ml/api/skmultilearn.problem_transform.br.html.
[12]
2022. ClassifierChains: scikit-multilearn. https://scikit-learn.org/stable/auto_examples/multioutput/plot_classifier_chain_yeast.html.
[13]
2022. EternalRocks - The Malware Wiki. https://malwiki.org/index.php?title=EternalRocks.
[14]
2022. Grayware - The Malware Wiki. https://malwiki.org/index.php?title=Grayware.
[15]
2022. LabelPowerset: scikit-multilearn. http://scikit.ml/api/skmultilearn.problem_transform.lp.html#skmultilearn.problem_transform.LabelPowerset.
[16]
2022. Python dpkt. https://dpkt.readthedocs.io/en/latest/.
[17]
2022. Spyware - The Malware Wiki. https://malwiki.org/index.php?title=Spyware.
[18]
2022. SystemBC -- a RAT in the Pipeline. https://blogs.blackberry.com/en/2021/06/threat-thursday-systembc-a-rat-in-the-pipeline.
[19]
2022. Top 1M Alexa. http://s3.amazonaws.com/alexa-static/top-1m.csv.zip.
[20]
2022. Trojan - The Malware Wiki. https://malwiki.org/index.php?title=Adware.
[21]
2022. Zeek, An Open Source Network Security Monitoring Tool. https://zeek.org/.
[22]
Bushra A Alahmadi, Enrico Mariconti, Riccardo Spolaor, Gianluca Stringhini, and Ivan Martinovic. 2020. BOTection: Bot Detection by Building Markov Chain Models of Bots Network Behavior. In Proceedings of the 15th ACM Asia Conference on Computer and Communications Security. 652--664.
[23]
Omar Alrawi, Moses Ike, Matthew Pruett, Ranjita Pai Kasturi, Srimanta Barua, Taleb Hirani, Brennan Hill, and Brendan Saltaformaggio. 2021. Forecasting Malware Capabilities From Cyber Attack Memory Images. In 30th USENIX Security Symposium (USENIX Security 21). 3523--3540.
[24]
Omar Alrawi, Charles Lever, Kevin Valakuzhy, Kevin Snow, Fabian Monrose, Manos Antonakakis, et al. 2021. The Circle Of Life: A Large-Scale Study of The IoT Malware Lifecycle. In 30th USENIX Security Symposium (USENIX Security 21). 3505--3522.
[25]
Athanasios Avgetidis, Omar Alrawi, Kevin Valakuzhy, Charles Lever, Paul Burbage, Angelos Keromytis, Fabian Monrose, and Manos Antonakakis. 2023. Beyond The Gates: An Empirical Analysis of HTTP-Managed Password Stealers and Operators. In 32nd USENIX Security Symposium (USENIX Security 23).
[26]
Sanjit Bhat, David Lu, Albert Hyukjae Kwon, and Srinivas Devadas. 2019. Varcnn: A Data-efficient Website Fingerprinting Attack based on Deep Learning. (2019).
[27]
Xiang Cai, Xin Cheng Zhang, Brijesh Joshi, and Rob Johnson. 2012. Touching from a Distance: Website Fingerprinting Attacks and Defenses. In Proceedings of the 2012 ACM conference on Computer and communications security. 605--616.
[28]
Pitpimon Choorod and George Weir. 2021. Tor Traffic Classification Based on Encrypted Payload Characteristics. In 2021 National Computing Colleges Conference (NCCC). IEEE, 1--6.
[29]
Lucian Constantin. 2012. Tor Network Used to Command Skynet Botnet. https://www.computerworld.com/article/2493980/tor-network-used-to-command-skynet-botnet.html.
[30]
Alfredo Cuzzocrea, Fabio Martinelli, Francesco Mercaldo, and Gianni Vercelli. 2017. Tor Traffic Analysis and Detection via Machine Learning Techniques. In 2017 IEEE International Conference on Big Data (Big Data). IEEE, 4474--4480.
[31]
Roger Dingledine, Nick Mathewson, and Paul Syverson. 2004. Tor: The Second-Generation Onion Router. In 13th USENIX Security Symposium (USENIX Security 04). USENIX Association, San Diego, CA. https://www.usenix.org/conference/13th-usenix-security-symposium/tor-second-generation-onion-router
[32]
Nick Erickson, Jonas Mueller, Alexander Shirkov, Hang Zhang, Pedro Larroy, Mu Li, and Alexander Smola. 2020. AutoGluon-Tabular: Robust and Accurate AutoML for Structured Data. arXiv preprint arXiv:2003.06505 (2020).
[33]
Oluwatobi Fajana, Gareth Owenson, and Mihaela Cocea. 2018. Torbot Stalker: Detecting Tor Botnets through Intelligent Circuit Data Analysis. In 2018 IEEE 17th International Symposium on Network Computing and Applications (NCA). IEEE, 1--8.
[34]
Ibrahim Ghafir, Vaclav Prenosil, Mohammad Hammoudeh, Thar Baker, Sohail Jabbar, Shehzad Khalid, and Sardar Jaf. 2018. BotDet: A System for Real Time Botnet Command and Control Traffic Detection. IEEE Access 6 (2018), 38947--38958.
[35]
Guofei Gu, Roberto Perdisci, Junjie Zhang, and Wenke Lee. 2008. Botminer: Clustering Analysis of Network Traffic for Protocol-and Structure-independent Botnet Detection. (2008).
[36]
Jamie Hayes and George Danezis. 2016. k-fingerprinting: A Robust Scalable Website Fingerprinting Technique. In 25th USENIX Security Symposium (USENIX Security 16). 1187--1203.
[37]
Jamie Hayes and George Danezis. 2016. k-fingerprinting: A Robust Scalable Website Fingerprinting Technique. In 25th USENIX Security Symposium (USENIX Security 16). 1187--1203.
[38]
Dominik Herrmann, Rolf Wendolsky, and Hannes Federrath. 2009. Website Fingerprinting: Attacking Popular Privacy Enhancing Technologies with the Multinomial Naïve-Bayes Classifier. In Proceedings of the 2009 ACM workshop on Cloud computing security. 31--42.
[39]
Lazaros Alexios Iliadis and Theodoros Kaifas. 2021. Darknet Traffic Classification Using Machine Learning Techniques. In 2021 10th International Conference on Modern Circuits and Systems Technologies (MOCAST). IEEE, 1--4.
[40]
Gregoire Jacob, Ralf Hund, Christopher Kruegel, and Thorsten Holz. 2011. JACKSTRAWS: Picking Command and Control Connections from Bot Traffic. In USENIX Security Symposium, Vol. 2011. San Francisco, CA, USA.
[41]
Arash Habibi Lashkari, Gerard Draper-Gil, Mohammad Saiful Islam Mamun, and Ali A Ghorbani. 2017. Characterization of Tor Traffic using Time Based Features. In ICISSP. 253--262.
[42]
Zhen Ling, Junzhou Luo, Kui Wu, Wei Yu, and Xinwen Fu. 2015. Torward: Discovery, Blocking, and Traceback of Malicious Traffic over Tor. IEEE Transactions on Information Forensics and Security 10, 12 (2015), 2515--2530.
[43]
Liming Lu, Ee-Chien Chang, and Mun Choon Chan. 2010. Website Fingerprinting and Identification using Ordered Feature Sequences. In European Symposium on Research in Computer Security. Springer, 199--214.
[44]
Haoyu Ma, Jianqiu Cao, Bo Mi, Darong Huang, Yang Liu, and Zhenyuan Zhang. 2021. Dark Web Traffic Detection Method Based on Deep Learning. In 2021 IEEE 10th Data Driven Control and Learning Systems Conference (DDCLS). IEEE, 842--847.
[45]
Brad Miller, Ling Huang, Anthony D. Joseph, and J. Doug Tygar. 2014. I Know Why You Went to the Clinic: Risks and Realization of HTTPS Traffic Analysis. ArXiv abs/1403.0297 (2014).
[46]
Abedelaziz Mohaisen and Omar Alrawi. 2013. Unveiling Zeus: Automated Classification of Malware Samples. In Proceedings of the 22nd International Conference on World Wide Web. 829--832.
[47]
Aziz Mohaisen and Omar Alrawi. 2014. Av-meter: An Evaluation of Antivirus Scans and Labels. In International conference on detection of intrusions and malware, and vulnerability assessment. Springer, 112--131.
[48]
Aziz Mohaisen, Omar Alrawi, Matt Larson, and Danny McPherson. 2013. Towards a Methodical Evaluation of Antivirus Scans and Labels. In International Workshop on Information Security Applications. Springer, 231--241.
[49]
Aziz Mohaisen, Omar Alrawi, and Manar Mohaisen. 2015. AMAL: High-fidelity, Behavior-based Automated Malware Analysis and Classification. Computers & Security 52 (2015), 251--266.
[50]
Aziz Mohaisen, Omar Alrawi, Andrew G West, and Allison Mankin. 2013. Babble: Identifying Malware by its Dialects. In 2013 IEEE Conference on Communications and Network Security (CNS). IEEE, 407--408.
[51]
Aziz Mohaisen, Andrew G West, Allison Mankin, and Omar Alrawi. 2014. Chatter: Classifying Malware Families using System Event Ordering. In 2014 IEEE Conference on Communications and Network Security. IEEE, 283--291.
[52]
Palo Alto Networks. 2012. Threat Assessment: Egregor Ransomware. https://unit42.paloaltonetworks.com/egregor-ransomware-courses-of-action/.
[53]
Andriy Panchenko, Lukas Niessen, Andreas Zinnen, and Thomas Engel. 2011. Website Fingerprinting in Onion Routing based Anonymization Networks. In Proceedings of the 10th annual ACM workshop on Privacy in the electronic society. 103--114.
[54]
Eva Papadogiannaki and Sotiris Ioannidis. 2021. A Survey on Encrypted Network Traffic Analysis Applications, Techniques, and Countermeasures. ACM Computing Surveys 54, 6 (2021). https://doi.org/10.1145/3457904
[55]
Michal Piskozub, Riccardo Spolaor, and Ivan Martinovic. 2019. Malalert: Detecting Malware in Large-scale Network Traffic using Statistical Features. ACM SIGMETRICS Performance Evaluation Review 46, 3 (2019), 151--154.
[56]
Ferry Astika Saputra, Isbat Uzzin Nadhori, and Balighani Fathul Barry. 2016. Detecting and Blocking Onion Router Traffic Using Deep Packet Inspection. In 2016 International Electronics Symposium (IES). IEEE, 283--288.
[57]
Debmalya Sarkar, P Vinod, and Suleiman Y Yerima. 2020. Detection of Tor traffic using Deep Learning. In 2020 IEEE/ACS 17th International Conference on Computer Systems and Applications (AICCSA). IEEE, 1--8.
[58]
Roei Schuster, Vitaly Shmatikov, and Eran Tromer. 2017. Beauty and the Burst: Remote Identification of Encrypted Video Streams. In 26th USENIX Security Symposium (USENIX Security 17). USENIX Association, Vancouver, BC, 1357--1374. https://www.usenix.org/conference/usenixsecurity17/technical-sessions/presentation/schuster
[59]
Silvia Sebastián and Juan Caballero. 2020. AVclass2: Massive Malware Tag Extraction from AV Labels. In Annual Computer Security Applications Conference (Austin, USA) (ACSAC '20). Association for Computing Machinery, New York, NY, USA, 42--53. https://doi.org/10.1145/3427228.3427261
[60]
Marcos Sebastián, Richard Rivera, Platon Kotzias, and Juan Caballero. 2016. AVclass: A Tool for Massive Malware Labeling, Vol. 9854. 230--253. https://doi.org/10.1007/978-3-319-45719-2_11
[61]
Payap Sirinam, Mohsen Imani, Marc Juarez, and Matthew Wright. 2018. Deep Fingerprinting: Undermining Website Fingerprinting Defenses with Deep Learning. In Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security. 1928--1943.
[62]
Payap Sirinam, Nate Mathews, Mohammad Saidur Rahman, and Matthew Wright. 2019. Triplet Fingerprinting: More Practical and Portable Website Fingerprinting with n-shot Learning. In Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security. 1131--1148.
[63]
Qixiang Sun, Daniel R Simon, Yi-Min Wang, Wilf Russell, Venkata N Padmanabhan, and Lili Qiu. 2002. Statistical Identification of Encrypted Web Browsing Traffic. In Proceedings 2002 IEEE Symposium on Security and Privacy. IEEE, 19--30.
[64]
Tao Wang, Xiang Cai, Rishab Nithyanand, Rob Johnson, and Ian Goldberg. 2014. Effective Attacks and Provable Defenses for Website Fingerprinting. In 23rd USENIX Security Symposium (USENIX Security 14). 143--157.
[65]
Tao Wang and Ian Goldberg. 2013. Improved Website Fingerprinting on Tor. In Proceedings of the 12th ACM workshop on Workshop on privacy in the electronic society. 201--212.
[66]
Tao Wang and Ian Goldberg. 2013. Improved Website Fingerprinting on Tor. In Proceedings of the 12th ACM workshop on Workshop on privacy in the electronic society. 201--212.
[67]
Tao Wang and Ian Goldberg. 2016. On Realistically Attacking Tor with Website Fingerprinting. Proc. Priv. Enhancing Technol. 2016, 4 (2016), 21--36.
[68]
Charles V. Wright, Lucas Ballard, Fabian Monrose, and Gerald M. Masson. 2007. Language Identification of Encrypted VoIP Traffic: Alejandra y Roberto or Alice and Bob?. In Proceedings of 16th USENIX Security Symposium on USENIX Security Symposium (Boston, MA) (SS'07). USENIX Association, USA, Article 4, 12 pages.
[69]
Yixiao Xu, Tao Wang, Qi Li, Qingyuan Gong, Yang Chen, and Yong Jiang. 2018. A Multi-tab Website Fingerprinting Attack. In Proceedings of the 34th Annual Computer Security Applications Conference. 327--341.

      Published In

      CCS '22: Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security
      November 2022
      3598 pages
      ISBN:9781450394505
      DOI:10.1145/3548606

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. malware
      2. tor
      3. traffic analysis

      Acceptance Rates

      Overall Acceptance Rate 1,261 of 6,999 submissions, 18%
