Review
An overview of anonymity technology usage
Introduction
Anonymity technologies enable Internet users to maintain a level of privacy that prevents the collection of identifying information such as the IP address. Due to privacy concerns and other factors, usage of anonymity technologies on the Internet is growing [23]. Most anonymity technologies are descendants of mix networks, which use a chain of proxy servers to create hard-to-trace communications [14]. Anonymity services are provided either by commercial companies driven by subscription fees or advertising, or by non-commercial services through open source anonymity tools. Community-contributed systems include the Java Anon Proxy (JAP) [11], Tor [20], and the Invisible Internet Project (I2P) [4].
Anonymity systems send data packets over relays so that no single system has information about both the sender and the receiver. Since many people utilize these intermediaries at the same time, the Internet connection of any single person is hidden among the connections of all other users. Hence, no individual system, internal or external, can determine which connection belongs to which user. The degree of anonymity varies and depends on the utilized mechanisms, adversary capabilities, and operating environment. It can be measured via different models based on entropy, evidence theory, probability and similarity [18], [27], [37].
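As one illustration of the entropy-based models mentioned above, the sketch below computes a normalized Shannon entropy over the probabilities that an adversary assigns to each candidate sender. This is a common formulation in the anonymity literature, but it is an assumption here that it corresponds to the models in the cited works; the function name and the example distributions are hypothetical.

```python
import math


def degree_of_anonymity(probabilities):
    """Normalized Shannon entropy of an adversary's sender distribution.

    probabilities[i] is the (assumed) probability that candidate i is the
    real sender. The result lies in [0, 1]: 0 means the sender is fully
    identified, 1 means all candidates are equally likely.
    """
    n = len(probabilities)
    if n == 0:
        raise ValueError("at least one candidate sender is required")
    if any(p < 0 for p in probabilities):
        raise ValueError("probabilities must be non-negative")
    if not math.isclose(sum(probabilities), 1.0, abs_tol=1e-9):
        raise ValueError("probabilities must sum to 1")
    if n == 1:
        # A single candidate offers no anonymity at all.
        return 0.0
    # Terms with p == 0 contribute nothing to the entropy.
    entropy = sum(-p * math.log2(p) for p in probabilities if p > 0)
    return entropy / math.log2(n)


# Hypothetical adversary views over four candidate senders.
uniform = degree_of_anonymity([0.25, 0.25, 0.25, 0.25])
skewed = degree_of_anonymity([0.7, 0.1, 0.1, 0.1])
identified = degree_of_anonymity([1.0, 0.0, 0.0, 0.0])
```

Under this model, the uniform view yields the maximum degree, the skewed view a lower one, and the fully identified view yields zero, which mirrors the statement that the degree of anonymity depends on adversary capabilities.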
Anonymity has always been a dichotomous issue in both social life and cyberspace. On the one hand, anonymity technologies serve legitimate uses such as privacy, freedom of speech, anti-censorship, anonymous tips for law enforcement, and surveys such as evaluation and feedback. On the other hand, anonymity technologies protect criminals and facilitate on-line crimes such as piracy, information and identity theft, spam, cyber-stalking and even organizing terrorism. Additionally, they may be utilized for network abuse by bypassing the Internet use policy of an organization, exposing an organization to malicious activities, and preventing web filters from monitoring. Assessing anonymity technology usage on the Internet is valuable for understanding the current state and for helping to enhance future versions. However, it is difficult to measure anonymity networks due to the nature of these networks and the ethical and legal issues related to user privacy.
Anonymity research remains a very active area where investigators have focused on anonymous communication [16], traffic analysis [32], [42], [36], anonymous emails [7], [17], provable shuffles, anonymous publications [3], private information retrieval [15], [19], anonymous voting [12], [26], [40], taxonomy [27], security and communication censorship [43], [44]. Various anonymity systems have been deployed worldwide [16], [21], [39]. Moreover, a few studies have measured anonymity technologies, in particular the Tor network [13], [25], [29], [30], [31], [33], [34]. Given the fast evolution of the Internet, it is important to understand the current deployment and usage of anonymity networks.
In this study, we conduct a survey of anonymity technology usage on the Internet and report measurement results from multiple perspectives and platforms. First, we collected server lists for each technology and identified the geo-location of the servers. During our exploration, we identified 7246 proxy servers, 52 remailers (12 Cypherpunk, 20 Mixmaster, and 20 Mixminion servers), 44 JAP mixers, 61,798 Tor relays, and 2267 I2P relays. We observed that the United States and Germany were among the top five server providers for proxy, Tor and I2P systems. Additionally, France and Russia were among the top five server providers for Tor and I2P systems. We then performed a detailed analysis of the Tor system, the most popular anonymity system based on the number of users, by setting up a Tor server and collecting historical Tor measurement data from the Tor project website https://metrics.torproject.org/. The website provides summary measurement reports and graphs about Tor network usage. We reported Tor usage of the top 50 countries in terms of users and relay servers, application usage, the cumulative distribution of average traffic size, and the count of conversations between two end points. Moreover, we analyzed historical changes in Tor deployment. We observed that relays from Germany and the United States contributed the most bandwidth resources to the Tor system and that these countries had the highest number of Tor users. Finally, we analyzed anonymity systems from application perspectives by analyzing sFlow [10] traffic data from a large campus network, spam email from departmental email servers and publicly available data from the Internet, and P2P data collected by setting up a Shareaza client. In the sFlow data, we observed that proxies and Tor are the frequently utilized anonymity technologies and that HTTP and SSL are the commonly utilized application protocols over anonymity networks. Moreover, we found that Tor and proxy servers are used more than other anonymity techniques by spammers and peer-to-peer users to hide their IP addresses. In the spam data, we also observed emails sent through commercial anonymity web services such as GoTrusted.com.
The rest of the paper is organized as follows. In Section 2, we introduce proxy servers, remailers, JAP, Tor, and I2P, and provide the geo-location distribution of their servers. In Section 3, we analyze the usage of the Tor anonymity system in depth. In Section 4, we analyze anonymity system usage from application perspectives. In Section 5, we survey related work. Finally, we present conclusions and future work in Section 6.
Anonymity technology usage
Anonymity systems can be categorized by their latency, trust level, network type, anonymity properties, or adversary capabilities. From a usability point of view, anonymity systems are classified into two categories: high latency systems, mostly used by non-interactive applications that need strong anonymity, and low latency systems, mostly used for anonymous web browsing, where better performance is needed.
In this section, we investigate well-known deployed anonymity systems and provide the
Tor usage analysis
In this section, we analyze Tor usage, as it has the largest user base. We utilize two data sets for the Tor usage analysis: data collected by setting up a Tor relay as an exit node, and historical Tor measurements reported on the Tor project website https://metrics.torproject.org. These data sets complement each other as they provide different perspectives on Tor.
Application perspective
In this section, we investigate the usage of anonymity technology from application perspectives and study campus network traffic, spam emails and peer-to-peer networks. For this, we compared the observed IP addresses of applications to the collected IP addresses of the anonymity servers discussed in Section 2. Table 2 provides an overview of the anonymity systems we considered. We identified the originating countries of these IP addresses from the IP address geo-location database provided by //ipinfodb.com
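The comparison described above can be sketched as a set intersection between observed addresses and per-system server lists, followed by a per-country count. This is a minimal illustration under stated assumptions, not the procedure used in the study: the function names, the input layout, the addresses (taken from documentation ranges) and the country mapping are all hypothetical, and a real geo-location lookup is replaced by a plain dictionary.

```python
from collections import Counter


def match_observed_to_servers(observed_ips, servers_by_system):
    """Return, per anonymity system, the observed IPs that are known servers.

    observed_ips: iterable of IP address strings seen in application data.
    servers_by_system: dict mapping a system name to its collected server IPs.
    """
    observed = {ip.strip() for ip in observed_ips}
    return {
        system: observed & {ip.strip() for ip in server_ips}
        for system, server_ips in servers_by_system.items()
    }


def top_countries(ips, ip_to_country, n=5):
    """Count IPs per country and return the n most common (country, count) pairs.

    ip_to_country stands in for a geo-location database lookup; addresses
    missing from it are counted under "unknown".
    """
    counts = Counter(ip_to_country.get(ip, "unknown") for ip in ips)
    return counts.most_common(n)


# Hypothetical inputs using documentation-only address ranges.
servers = {
    "Tor": ["192.0.2.1", "192.0.2.2"],
    "proxy": ["198.51.100.7"],
    "I2P": [],
}
observed = ["192.0.2.1", "203.0.113.9", "198.51.100.7", "192.0.2.1"]
geo = {"192.0.2.1": "DE", "192.0.2.2": "DE", "198.51.100.7": "US"}

hits = match_observed_to_servers(observed, servers)
ranking = top_countries(["192.0.2.1", "192.0.2.2", "198.51.100.7"], geo)
```

Using sets keeps the matching cost roughly linear in the number of addresses, which matters when server lists contain tens of thousands of entries.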
Related work
There have been many studies on anonymity services and anonymity systems [3], [16], [21], [27], while some studies have analyzed anonymity technology usage [13], [25], [28], [29], [31], [33], [34]. Most of the anonymity usage studies focused on Tor due to its popularity.
Studying anonymity network usage is difficult due to the nature of anonymity networks and the ethical and legal issues related to privacy. An important step in such a measurement study is ensuring privacy of users
Conclusion and future work
In this paper, we provided a tutorial survey and a measurement study to understand the usage of anonymity technologies on the Internet from multiple perspectives and platforms. We surveyed previous studies and collected measurement data from multiple sources.
We overviewed contemporary anonymity technologies including proxies, remailers, JAP, Tor and I2P and reported their capabilities. We observed that as newer systems (i.e., I2P and Tor) gain popularity, earlier systems become extinct (e.g.,
References (44)
- et al. Survey on anonymous communications in computer networks. Computer Communications (Mar. 2010)
- Anonymity 4 proxy, 2012. <http://inetprivacy.com/a4proxy/review.htm> (retrieved...
- Cisco visual networking index: forecast and methodology, 2010–2015, 2012....
- Free haven selected papers in anonymity, 2012. <http://www.freehaven.net/anonbib/date.html> (retrieved...
- I2P anonymous network, 2012. <www.i2p2.de> (retrieved...
- InMotion sFlow toolkit, 2012. <http://www.inmon.com/technology/sflowTools.php> (retrieved...
- John doe, 2012. <http://www.compulink.co.uk/net-services/jd.htm> (retrieved...
- Omnimix – pathway to privacy, 2012. <http://www.danner-net.de/om.htm> (retrieved...
- Open source deep package inspection, 2012. <http://www.opendpi.org> (retrieved...
- Service name and transport protocol port number registry, 2012....
- Web mixes: a system for anonymous and unobservable Internet access
- Almost entirely correct mixing with applications to voting
- Untraceable electronic mail, return addresses, and digital pseudonyms. Communications of the ACM
- Freenet: a distributed anonymous information storage and retrieval system
- Tor: the second-generation onion router
- On anonymity in an electronic society: a survey of anonymous communication systems. ACM Computing Surveys