research-article

Exploring and Analyzing the Tor Hidden Services Graph

Published: 24 July 2017

Abstract

The exploration and analysis of Web graphs has flourished in the recent past, producing a large number of relevant and interesting research results. However, the unique characteristics of the Tor network limit the applicability of standard techniques and demand specific algorithms to explore and analyze it. The attention of the research community has focused on assessing the security of the Tor infrastructure (i.e., its ability to actually provide the intended level of anonymity) and on discussing what Tor is currently being used for. Since there are no foolproof techniques for automatically discovering Tor hidden services, little or no information is available about the topology of the Tor Web graph. Even less is known about the relationship between content similarity and topological structure. The present article aims to address this lack of information. Its contributions include a study of automatic Tor Web exploration and data collection approaches; the adoption of novel representative metrics for evaluating Tor data; a novel in-depth analysis of the hidden services graph; and a rich correlation analysis of hidden services’ semantics and topology. Finally, a broad set of novel insights and considerations about the organization and content of the Tor Web is provided.

          • Published in

            ACM Transactions on the Web  Volume 11, Issue 4
            November 2017
            257 pages
            ISSN:1559-1131
            EISSN:1559-114X
            DOI:10.1145/3127338

            Copyright © 2017 ACM

            Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

            Publisher

            Association for Computing Machinery

            New York, NY, United States

            Publication History

            • Published: 24 July 2017
            • Accepted: 1 April 2017
            • Revised: 1 December 2016
            • Received: 1 July 2016

            Qualifiers

            • research-article
            • Research
            • Refereed
