Exploring and Analyzing the Tor Hidden Services Graph

Published: 24 July 2017

Abstract

The exploration and analysis of Web graphs has flourished in the recent past, producing a large number of relevant and interesting research results. However, the unique characteristics of the Tor network limit the applicability of standard techniques and demand specific algorithms to explore and analyze it. The attention of the research community has focused on assessing the security of the Tor infrastructure (i.e., its ability to actually provide the intended level of anonymity) and on discussing what Tor is currently being used for. Since there are no foolproof techniques for automatically discovering Tor hidden services, little or no information is available about the topology of the Tor Web graph. Even less is known about the relationship between content similarity and topological structure. The present article aims to address this lack of information. Its contributions include: a study of automatic Tor Web exploration and data collection approaches; the adoption of novel representative metrics for evaluating Tor data; a novel in-depth analysis of the hidden services graph; and a rich correlation analysis of hidden services’ semantics and topology. Finally, a broad set of novel insights and considerations on the organization and content of the Tor Web is provided.

Published In

ACM Transactions on the Web  Volume 11, Issue 4
November 2017
257 pages
ISSN:1559-1131
EISSN:1559-114X
DOI:10.1145/3127338
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 24 July 2017
Accepted: 01 April 2017
Revised: 01 December 2016
Received: 01 July 2016
Published in TWEB Volume 11, Issue 4

Author Tags

  1. Web graphs
  2. automatic web exploration
  3. correlation analysis
  4. network topology

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • IANCIS, an EU ISEC project that involves IAC-CNR, Expert System
  • Italian “Arma dei Carabinieri”

Article Metrics

  • Downloads (Last 12 months): 19
  • Downloads (Last 6 weeks): 0
Reflects downloads up to 03 Mar 2025

Cited By

  • (2024) Security, information, and structure characterization of Tor: a survey. Telecommunications Systems 87:1 (239-255). DOI: 10.1007/s11235-024-01149-y. Online publication date: 1-Sep-2024
  • (2023) Cutting Onions With Others' Hands: A First Measurement of Tor Proxies in the Wild. 2023 IFIP Networking Conference (IFIP Networking) (1-9). DOI: 10.23919/IFIPNetworking57963.2023.10186440. Online publication date: 12-Jun-2023
  • (2023) Darkweb research: Past, present, and future trends and mapping to sustainable development goals. Heliyon 9:11 (e22269). DOI: 10.1016/j.heliyon.2023.e22269. Online publication date: Nov-2023
  • (2023) On the gathering of Tor onion addresses. Future Generation Computer Systems 145:C (12-26). DOI: 10.1016/j.future.2023.02.024. Online publication date: 1-Aug-2023
  • (2023) A Comprehensive Survey of Recent Internet Measurement Techniques for Cyber Security. Computers & Security 128 (103123). DOI: 10.1016/j.cose.2023.103123. Online publication date: May-2023
  • (2022) Process-Based Knowledge Organization. Journal of Database Management 33:1 (1-18). DOI: 10.4018/JDM.299558. Online publication date: 13-May-2022
  • (2022) SoK: An Evaluation of the Secure End User Experience on the Dark Net through Systematic Literature Review. Journal of Cybersecurity and Privacy 2:2 (329-357). DOI: 10.3390/jcp2020018. Online publication date: 27-May-2022
  • (2022) A Synopsis of Critical Aspects for Darknet Research. Proceedings of the 17th International Conference on Availability, Reliability and Security (1-8). DOI: 10.1145/3538969.3544444. Online publication date: 23-Aug-2022
  • (2022) Graph Contraction on Attribute-Based Coloring. Procedia Computer Science 201:C (429-436). DOI: 10.1016/j.procs.2022.03.056. Online publication date: 1-Jan-2022
  • (2022) Onion under Microscope: An in-depth analysis of the Tor Web. World Wide Web 25:3 (1287-1313). DOI: 10.1007/s11280-022-01044-z. Online publication date: 1-Apr-2022
