Statistical and combinatorial analysis of the TOR routing protocol: structural weaknesses identified in the TOR network

  • Original Paper
  • Journal of Computer Virology and Hacking Techniques

Abstract

In this paper, we present the results of an in-depth analysis of the TOR routing protocol from a statistical and combinatorial point of view. We have exhaustively modelled all possible routes of this well-known anonymity network, taking different parameters into account and using only the data provided by the TOR foundation. We have then compared our theoretical model against the live network: we generated thousands of routes on the actual TOR network and compared the results obtained with those predicted by the theory. A final step of combinatorial analysis has enabled us to identify critical subsets of Onion routers (ORs) on which 33%, 50%, 66% and 75% of the TOR traffic, respectively, depends. We have also managed to extract most of the TOR relay bridges, which are non-public nodes managed by the TOR foundation. The same results as for the ORs have been observed.

Notes

  1. The file can be downloaded from https://metrics.torproject.org/collector.html.

  2. https://metrics.torproject.org/onionoo.html.

  3. By default, the configuration freezes the entry router during connection initialization for a period ranging from 2 to 9 months.

References

  1. AlSabah, M., Goldberg, I.: Performance and security improvements for Tor: a survey. ACM Comput. Surv. 49(2), 32:1–32:36 (2016)

  2. Alstott, J., Bullmore, E., Plenz, D.: Powerlaw: a Python package for analysis of heavy-tailed distributions. PLoS ONE 9, e85777 (2014)

  3. Backes, M., Kate, A., Meiser, S., Mohammadi, E.: (nothing else) mator(s): monitoring the anonymity of tor’s path selection. Cryptology ePrint Archive, Report 2014/621. https://eprint.iacr.org/2014/621 (2014)

  4. Clauset, A., Shalizi, C.R., Newman, M.E.J.: Power-law distributions in empirical data. SIAM Rev. 51(4), 661–703 (2009)

  5. Defense Science Board: Study on 21st century military operations in a complex electromagnetic environment. Office of the Under Secretary of Defense for Acquisition, Technology, and Logistics. http://www.acq.osd.mil/dsb/reports/2010s/DSB_SS13--EW_Study.pdf (2015)

  6. Delong, M., Filiol, E., Coddet, C., Fatou, O., Suhard, C.: Technical and OSINT analysis of the tor foundation. In: Proceedings of the 13th International Conference on Cyber Warfare and Security, ICCWS 2018, pp. 164–173. ACPI (2018)

  7. Delong, M., Fatou, O., Filiol, E., Coddet, C., Suhard, C.: Technical and OSINT analysis of the tor project. In: ICCWS’2018 (2018)

  8. Filiol, E., Delong, M., Nicolas, J.: Results of the tor routing protocol statistical and combinatorial analyses. http://cvo-lab.blogspot.fr/2017/09/preliminary-results-on-tor-routing.html (2017). Accessed 12 Sept 2017

  9. Galteland, H., Gjøsteen, K.: Adversaries monitoring tor traffic crossing their jurisdictional border and reconstructing tor circuits. CoRR (2018). arXiv:abs/1808.09237

  10. Goldschlag, D.M., Reed, M.G., Syverson, P.F.: Hiding routing information. In: Anderson, R. (ed.) Proceedings of Information Hiding: 1st International Workshop, LNCS, vol. 1174, pp. 137–150. Springer (1996)

  11. Jansen, R., Juarez, M., Gálvez, R., Elahi, T., Diaz, C.: Inside job: applying traffic analysis to measure tor from within. In: Proceedings of the 25th Symposium on Network and Distributed System Security (NDSS ’18). Internet Society (2018)

  12. Johnson, A., Wacek, C., Jansen, R., Sherr, M., Syverson, P.: Users get routed: traffic correlation on tor by realistic adversaries, vol. 11, pp. 337–348 (2013)

  13. Johnson, A., Wacek, C., Jansen, R., Sherr, M., Syverson, P.: Users get routed: traffic correlation on tor by realistic adversaries. In: Proceedings of the 2013 ACM SIGSAC Conference on Computer and Communications Security, CCS ’13, New York, NY, USA, pp. 337–348. ACM (2013)

  14. Kim, S., Han, J., Ha, J., Kim, T., Han, D.: Enhancing security and privacy of tor’s ecosystem by using trusted execution environments. In: 14th USENIX Symposium on Networked Systems Design and Implementation (NSDI 17), Boston, MA, pp. 145–161. USENIX Association (2017)

  15. Nasr, M., Bahramali, A., Houmansadr, A.: Deepcorr: strong flow correlation attacks on tor using deep learning. CoRR (2018). arXiv:abs/1808.07285

  16. Nithyanand, R., Starov, O., Zair, A., Gill, P., Schapira, M.: Measuring and mitigating as-level adversaries against tor. CoRR (2015). arXiv:abs/1505.05173

  17. Sun, Y., Edmundson, A., Vanbever, L., Li, O., Rexford, J., Chiang, M., Mittal, P.: RAPTOR: routing attacks on privacy in tor. In: 24th USENIX Security Symposium (USENIX Security 15), Washington, D.C., pp. 271–286. USENIX Association (2015)

  18. Syverson, P.F., Goldschlag, D.M., Reed, M.G.: Anonymous connections and onion routing. In: Proceedings of the 1997 IEEE Symposium on Security and Privacy, SP ’97, Washington, DC, USA, p. 44. IEEE Computer Society (1997)

  19. TOR Foundation: Tor documentation. https://www.torproject.org/docs/tor-manual.html.en (2014). Accessed 12 Sept 2017

  20. TOR Foundation: Tor project. https://gitweb.torproject.org (2014). Accessed 12 Sept 2017

  21. TOR Foundation: The tor project. https://www.torproject.org/docs/tor.git (2014). Accessed 12 Sept 2017

  22. TOR Foundation: Tor specifications. https://gitweb.torproject.org/torspec.git (2014). Accessed 12 Sept 2017

  23. TOR Foundation: Did the FBI pay a university to attack tor users? https://blog.torproject.org/did-fbi-pay-university-attack-tor-users (2015). Accessed 12 Sept 2017

Author information

Corresponding author

Correspondence to Eric Filiol.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

1.1 Statistical model for the TOR routing protocol

The results obtained seem to indicate that the distribution of TOR routes follows a power law (a general family that includes the Pareto, Zipf and Mandelbrot laws). We will limit ourselves to the discrete case; however, when the amount of data is large enough, which is our case, it is possible to work with the continuous version of this law [4]. For more detailed information on these laws, the reader can refer to [2, 4], which we also used for the statistical analysis.

A discrete variable X follows a power law if its probability distribution is given by

$$\begin{aligned} p(x) = P[X = x] = C\, x^{-\alpha } \end{aligned}$$

where \(\alpha \) is a constant parameter called the power parameter (or exponent) of the law and C is the proportionality factor. In practice, most phenomena obey a power law only for values \(x \ge x_{min}\). The constant C is determined by the value \(x_{min}\) through the normalization condition (written here in the continuous approximation)

$$\begin{aligned} \int _{x_{min}}^{\infty } p(x)dx = 1 \end{aligned}$$

This results in \(C = \frac{\alpha - 1}{x_{min}^{-\alpha + 1}}\), provided that \(\alpha > 1\).
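The intermediate integration step is not spelled out in the text; as a worked sketch, and using the fact that the integral only converges for \(\alpha > 1\) so that the contribution of the upper limit vanishes, it reads

$$\begin{aligned} \int _{x_{min}}^{\infty } C\, x^{-\alpha }dx = C \left[ \frac{x^{1-\alpha }}{1-\alpha } \right] _{x_{min}}^{\infty } = \frac{C\, x_{min}^{1-\alpha }}{\alpha - 1} = 1 \quad \Longrightarrow \quad C = \frac{\alpha - 1}{x_{min}^{1-\alpha }} = (\alpha - 1)\, x_{min}^{\alpha - 1} \end{aligned}$$

The last form is the same constant written without a negative exponent in the denominator.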

On a graph with logarithmic scales (log–log representation), the graph of a power law is a straight line of slope \(-\alpha \), since, writing \(y = P[X = x]\), we have

$$\begin{aligned} \log (y) = -\alpha .\log (x) + \log (C) \end{aligned}$$

Another useful representation is the complementary cumulative distribution function, \(P[X > x] = 1 - F (x)\), sometimes called the inverse cumulative distribution function. This is the one we will use here to compare the fitted theoretical law with the empirical distribution of the data. In the following, we will limit ourselves to the estimation of the parameters \(\alpha \) (maximum likelihood method, validated by the Kolmogorov–Smirnov goodness-of-fit test) and \(x_{min}\) (exhaustive search over all candidate values for the one minimizing the distance D of the Kolmogorov–Smirnov test).
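The estimation procedure described above can be illustrated with a minimal sketch. It is not the authors' code: it assumes the continuous approximation of the power law (the closed-form estimator given in [4]) and uses only the Python standard library rather than the package of [2]. The function names, the synthetic sample and the `min_tail` cut-off (the smallest tail size considered) are hypothetical choices made for illustration.

```python
import math
import random


def sample_power_law(n, alpha, x_min, rng):
    """Draw n continuous power-law samples by inverse transform sampling."""
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


def fit_alpha(tail, x_min):
    """Maximum likelihood estimate of alpha (continuous approximation)."""
    return 1.0 + len(tail) / sum(math.log(x / x_min) for x in tail)


def ks_distance(sorted_tail, x_min, alpha):
    """Kolmogorov-Smirnov distance D between empirical and fitted CDFs."""
    n = len(sorted_tail)
    d = 0.0
    for i, x in enumerate(sorted_tail):
        model_cdf = 1.0 - (x / x_min) ** (1.0 - alpha)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, abs(model_cdf - i / n), abs(model_cdf - (i + 1) / n))
    return d


def fit_power_law(data, min_tail=50):
    """Try every observed value as x_min and keep the fit with the smallest D.

    min_tail is a hypothetical safeguard: tails with fewer points are ignored.
    Returns (x_min, alpha, D), or None if the sample is too small.
    """
    values = sorted(data)
    best = None
    for start, x_min in enumerate(values):
        tail = values[start:]
        if len(tail) < min_tail:
            break
        if start > 0 and values[start - 1] == x_min:
            continue  # duplicate candidate, already evaluated
        if tail[-1] == x_min:
            continue  # degenerate tail: the estimator is undefined
        alpha = fit_alpha(tail, x_min)
        d = ks_distance(tail, x_min, alpha)
        if best is None or d < best[2]:
            best = (x_min, alpha, d)
    return best


if __name__ == "__main__":
    # Synthetic data only; the parameters below are illustrative assumptions.
    rng = random.Random(0)
    data = sample_power_law(1000, alpha=2.5, x_min=1.0, rng=rng)
    x_min, alpha, d = fit_power_law(data)
    print(f"x_min={x_min:.3f} alpha={alpha:.3f} D={d:.4f}")
```

The exhaustive scan is quadratic in the number of distinct values, which is acceptable for a sketch but would need a more careful implementation for data sets of the size discussed in this paper.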

Table 7 Results for all possible TOR routes (D represents the Kolmogorov–Smirnov distance between data and model)
Fig. 7 Inverse cumulative distribution functions of the data compared to the fitted laws. The red dotted line describes the power law; the green and blue lines describe the log–normal and exponential laws, respectively (color figure online)

Table 8 Results for 1-billion top TOR routes (D represents the Kolmogorov–Smirnov distance between data and model)
Fig. 8 Inverse cumulative distribution functions of the data compared to the fitted laws (1-billion top TOR routes). The red dotted line describes the power law; the green and blue lines describe the log–normal and exponential laws, respectively (color figure online)

It should be noted that, in a few cases, the theory suggests that empirical data can be described by two laws, without any clear distinction between the two. In our case, the log–normal law has been identified in three instances as a possible alternative, although it remains relatively close to the power law. Let us recall that a random variable X follows a log–normal law if \(Y = \ln (X)\) follows a normal law with mean \(\mu \) and standard deviation \(\sigma \) (denoted \(LOG_\mathcal {N} (\mu , \sigma )\)).

For all possible routes (around 10 billion), the results are given in Table 7 and in Fig. 7. For the Guard node occurrences, let us mention that the \(LOG_\mathcal {N}(7.38, 0.956)\) law is a possible alternative law.

Table 8 and Fig. 8 provide the results for the top one billion routes. For the Guard node occurrences, we have \(LOG_\mathcal {N}(5.99, 0.661)\) as a possible alternative law, while for the Exit node occurrences, \(LOG_\mathcal {N}(8.47, 0.827)\) is also a possible alternative law.
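As an illustration of the log–normal definition recalled above, the following minimal sketch draws synthetic values from \(LOG_\mathcal {N}(7.38, 0.956)\), the alternative law quoted for the Guard nodes over all routes, and checks that their logarithms have the expected mean and standard deviation. It also evaluates the corresponding inverse cumulative distribution function \(P[X > x]\). The function name `lognormal_survival`, the sample size and the evaluation points are hypothetical; no TOR data is involved.

```python
import math
import random
import statistics


def lognormal_survival(x, mu, sigma):
    """P[X > x] when ln(X) follows a normal law of mean mu and std dev sigma."""
    z = (math.log(x) - mu) / (sigma * math.sqrt(2.0))
    return 0.5 * math.erfc(z)


if __name__ == "__main__":
    mu, sigma = 7.38, 0.956  # parameters quoted in the text for Guard nodes
    rng = random.Random(0)   # fixed seed; the samples are synthetic
    logs = [math.log(rng.lognormvariate(mu, sigma)) for _ in range(20000)]
    print("mean of ln(X):", statistics.fmean(logs))
    print("std dev of ln(X):", statistics.stdev(logs))
    # Survival function at a few arbitrary points, e.g. for a log-log plot.
    for x in (100.0, 1000.0, 10000.0, 100000.0):
        print(x, lognormal_survival(x, mu, sigma))
```

Plotted on logarithmic scales, this survival function bends downwards, whereas a power law stays on a straight line; over a limited range of x the two can nevertheless be hard to tell apart.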

Cite this article

Filiol, E., Delong, M. & Nicolas, J. Statistical and combinatorial analysis of the TOR routing protocol: structural weaknesses identified in the TOR network. J Comput Virol Hack Tech 16, 3–18 (2020). https://doi.org/10.1007/s11416-019-00334-x

  • DOI: https://doi.org/10.1007/s11416-019-00334-x
