Abstract
The lack of privacy in DNS and DNSSEC is a problem that has only recently begun to receive widespread attention from the Internet and research communities, and the solutions proposed so far cover only a narrow slice of the design space. In this paper we investigate a new approach for a privacy-preserving DNS mechanism that hides query information from root name servers and TLD registries. Our architecture lets TLD registries group the DNS records in their zones into pages. Resolvers cache all pages locally and retrieve only small incremental updates to optimize performance. We show that this strategy is particularly effective given the relatively static nature of TLD zone records. We analyze the privacy guarantees to assess the potential and limitations of our approach; we also evaluate the memory overhead for a resolver, and demonstrate feasibility through a prototype implementation of the new functionalities for resolvers and registries.
Notes
1. We consider SHA-256 a reasonable choice for the hash function. Although the digest size could be reduced to 16 bytes while still retaining a negligible collision probability in a non-adversarial setting, a larger size is necessary to keep that probability negligible when an adversary actively tries to find a domain name that results in a collision.
2. In practice, popularity will vary on a regional basis. We envision that replication may be made region-specific (the non-replicated part of each page would remain the same). We leave a more detailed analysis of these aspects to future work.
3. PRPs of small domain size can be implemented using format-preserving encryption (FPE) schemes: there exist suitable encryption modes that use standard AES block ciphers as a primitive and achieve FPEs of arbitrary domain size.
4. The pages in T are chosen uniformly at random; T depends on k only through its size. For instance, |T| may be chosen such that the total number of page requests is higher than or equal to a given minimum.
5. We are slightly approximating the exact value in Eq. 6, ignoring the fact that the pages in T are chosen from the set of all m pages excluding those that are already part of the fingerprint.
6. To formally show this step, one needs to average the probability over all possible replica sets that could be assumed by all domains, i.e., over all possible hash functions (or all possible sets of domain names of size N).
7. The number of accessed domains is almost half of the 20,000 we consider: many of them did not have a www host, and we also imposed some restrictions on the loading time.
8. Almost 25% of domains were not resolved because we kept timeouts low for our monitoring and excluded the domains that frequently resulted in timeouts.
9. With a growth rate of \({\sim }5\%\) per year, the number of domains, and thus the number of pages, doubles approximately every 14 years. We expect that the available bandwidth and computing power can easily keep up with the growth of PageDNS.
10. It appears that Verisign, Inc. was able to obtain a patent [28] on this technology, and it is unclear what this will mean for its adoption.
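The figures in notes 1 and 9 can be checked with a short back-of-the-envelope calculation. The sketch below is illustrative only: the function names are hypothetical, the number of domains is assumed to be \(10^9\) (the value used in Appendix B), and the collision estimate is the standard birthday upper bound for a hash that behaves like a random function.

```python
import math

def birthday_collision_bound(n_items: int, hash_bits: int) -> float:
    """Upper bound n(n-1)/2 / 2^bits on the probability of any collision
    among n_items values of an ideal hash with hash_bits output bits."""
    return n_items * (n_items - 1) / 2 / 2 ** hash_bits

def doubling_time_years(growth_rate: float) -> float:
    """Years needed to double at a constant yearly growth rate."""
    return math.log(2) / math.log(1 + growth_rate)

n_domains = 10**9  # assumption: same N as in Appendix B
p_16_bytes = birthday_collision_bound(n_domains, 128)  # truncated digest (note 1)
p_32_bytes = birthday_collision_bound(n_domains, 256)  # full SHA-256 digest
years = doubling_time_years(0.05)                      # note 9

print(f"16-byte digest, non-adversarial collision bound: {p_16_bytes:.2e}")
print(f"32-byte digest, non-adversarial collision bound: {p_32_bytes:.2e}")
print(f"doubling time at 5% yearly growth: {years:.1f} years")
```

The birthday bound only covers accidental collisions; it says nothing about an adversary searching for colliding names, which is the reason note 1 gives for keeping the full digest.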
References
Google Analytics Solutions. https://www.google.com/analytics. Accessed 22 Sept 2017
NSA Spying on Americans. https://www.eff.org/nsa-spying. Accessed 22 Sept 2017
OpenDNS. https://www.opendns.com/. Accessed 22 Sept 2017
Aguilar-Melchor, C., Barrier, J., Fousse, L., Killijian, M.-O.: XPIR: private information retrieval for everyone. In: PETS (2016)
Arends, R., Austein, R., Larson, M., Massey, D., Rose, S.: DNS security introduction and requirements. RFC 4033 (2005)
Barnes, R., et al.: Confidentiality in the face of pervasive surveillance: a threat model and problem statement. RFC 7624 (2015)
Bernstein, D.J.: DNSCurve: usable security for DNS. https://dnscurve.org/. Accessed 22 Sept 2017
Bortzmeyer, S.: DNS privacy considerations. RFC 7626 (2015)
Bortzmeyer, S.: DNS query name minimisation to improve privacy. RFC 7816 (2016)
Chor, B., Goldreich, O., Kushilevitz, E., Sudan, M.: Private information retrieval. In: IEEE FOCS (1995)
Chor, B., Goldreich, O., Kushilevitz, E., Sudan, M.: Private information retrieval. J. ACM 45(6) (1998)
Cohen, E., Kaplan, H.: Proactive caching of DNS records: addressing a performance bottleneck. In: IEEE/IPSJ International Symposium on Applications and the Internet (SAINT) (2001)
Denis, F., Fu, Y.: DNSCrypt (2011). https://dnscrypt.org/. Accessed 22 Sept 2017
Devet, C., Goldberg, I., Heninger, N.: Optimally robust private information retrieval. In: USENIX Security (2012)
Dickinson, J., Dickinson, S., Bellis, R., Mankin, A., Wessels, D.: DNS transport over TCP - implementation requirements. RFC 7766 (2016)
Dingledine, R., Mathewson, N., Syverson, P.: Tor: the second-generation onion router. In: USENIX Security (2004)
Farrell, S., Tschofenig, H.: Pervasive monitoring is an attack. RFC 7258 (2014)
Federrath, H., Fuchs, K.-P., Herrmann, D., Piosecny, C.: Privacy-preserving DNS: analysis of broadcast, range queries and mix-based protection methods. In: ESORICS (2011)
Grothoff, C., Wachs, M., Emert, M., Appelbaum, J.: NSA’s MORECOWBELL: knell for DNS. Technical report, GNUnet e.V. (2015)
Handley, M., Greenhalgh, A.: The case for pushing DNS. In: HotNets (2005)
Hu, S., Zhu, L., Heidemann, J., Mankin, A., Wessels, D., Hoffman, P.: Specification for DNS over Transport Layer Security (TLS). RFC 7858 (2016)
ICANN: .com Monthly Registry Reports. https://www.icann.org/resources/pages/com-2014-03-04-en. Accessed 22 Sept 2017
Jung, J., Sit, E., Balakrishnan, H., Morris, R.: DNS performance and the effectiveness of caching. IEEE/ACM TON 10(5), 589–603 (2002)
Kangasharju, J., Ross, K.W.: A replicated architecture for the domain name system. In: IEEE INFOCOM (2000)
Kushilevitz, E., Ostrovsky, R.: Replication is not needed: single database, computationally-private information retrieval. In: IEEE FOCS (1997)
Laurie, B., Sisson, G., Arends, R., Blacka, D.: DNS security (DNSSEC) hashed authenticated denial of existence. RFC 5155 (2008)
Lu, Y., Tsudik, G.: Towards plugging privacy leaks in the domain name system. In: IEEE P2P (2010)
McPherson, D., Osterweil, E.: Providing privacy enhanced resolution system in the domain name system. US Patent 8,880,686 B2 (2014)
Mockapetris, P.: Domain names - concepts and facilities. RFC 1034 (1987)
Mockapetris, P.: Domain names - implementation and specification. RFC 1035 (1987)
Ostrovsky, R., Skeith III, W.E.: A survey of single-database PIR: techniques and applications. In: PKC (2007)
Pappas, V., Massey, D., Terzis, A., Zhang, L.: A comparative study of the DNS design with DHT-based alternatives. In: IEEE INFOCOM (2006)
Ramasubramanian, V., Sirer, E.G.: The design and implementation of a next generation name service for the Internet. In: ACM SIGCOMM (2004)
Rossow, C.: Amplification hell: revisiting network protocols for DDoS abuse. In: NDSS (2014)
Alexa the Web Information Company. Alexa Top 500 Global Sites (2016). http://www.alexa.com/topsites
Toledo, R.R., Danezis, G., Goldberg, I.: Lower-cost \(\epsilon \)-private information retrieval. PoPETS 2016(4), 184–201 (2016)
Verisign, Inc.: The domain name industry brief, vol. 14, no. 2 (2017). https://www.verisign.com/assets/domain-name-report-Q12017.pdf. Accessed 22 Sept 2017
Wachs, M., Schanzenbach, M., Grothoff, C.: A censorship-resistant, privacy-enhancing and fully decentralized name system. In: International Conference on Cryptology and Network Security (CANS) (2014)
Wijngaards, W., Wiley, G.: Confidential DNS. Internet Draft draft-wijngaards-dnsop-confidentialdns-03 (2015)
Zhao, F., Hori, Y., Sakurai, K.: Analysis of privacy disclosure in DNS query. In: International Conference on Multimedia and Ubiquitous Engineering (MUE) (2007)
Zhu, L., Hu, Z., Heidemann, J., Wessels, D., Mankin, A., Somaiya, N.: Connection-oriented DNS to improve privacy and security. In: IEEE Symposium on Security and Privacy (2015)
Acknowledgments
We thank Jinank Jain for his help with the prototype implementation.
Appendices
A Replication Function
Let \(\mathcal {P}\) be a page of PageDNS containing n records, and let \(k'\) be the rank of a record in \(\mathcal {P}\). For ease of notation, we write \(k' \in \mathcal {P}\), and in general we will often use a domain’s rank to refer to the domain. We assume a total of N domains, thus \(k' \in \{1,\dots ,N\}\), where \(k'=1\) is the highest rank. We consider a random variable K indicating the rank of a domain chosen at random (by a generic client) according to a Zipf distribution with parameter \(s=0.91\), i.e., such that:
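The displayed formula can be reconstructed from the standard definition of the Zipf distribution, with \(H_{N,s}\) denoting the generalized harmonic number (a reconstruction from the surrounding notation, not a verbatim copy of the original equation):

```latex
\Pr(K = k) = f(k; s, N) = \frac{1/k^{s}}{H_{N,s}},
\qquad
H_{N,s} = \sum_{i=1}^{N} \frac{1}{i^{s}}
```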
To simplify the notation, we will write f(k) to mean f(k; s, N) and H for \(H_{N,s}\).
We now express analytically the probability that an adversary would assign to \(k'\) being the target domain after observing a request to \(\mathcal {P}\), which is equal to the probability that \(K = k'\) given that K is restricted to \(\mathcal {P}\). By applying Bayes' theorem, we obtain the following equation:
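A form of this equation (referred to in the text as Eq. 11) consistent with the definitions above is the following, where the denominator expands \(\Pr (K\in \mathcal {P})\) by the law of total probability over the domains stored on \(\mathcal {P}\); this is a reconstruction and may differ in layout from the original:

```latex
\Pr(K = k' \mid K \in \mathcal{P})
  = \frac{\Pr(K \in \mathcal{P} \mid K = k')\, f(k')}
         {\sum_{k \in \mathcal{P}} \Pr(K \in \mathcal{P} \mid K = k)\, f(k)}
```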
Probability \(\Pr (K\in \mathcal {P}\mid K=k)\) is equal to 1 if k is not replicated, since we are assuming that \(k\in \mathcal {P}\). More generally, if k has a replication degree of r(k) (i.e., the record for domain k exists on r(k) pages), then the probability of choosing the replica in \(\mathcal {P}\) is 1 / r(k). We can therefore rewrite Eq. 11 as follows:
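Substituting \(\Pr (K\in \mathcal {P}\mid K=k) = 1/r(k)\) into the Bayes expression gives the following form (Eq. 12 in the text), whose denominator \(\sum _{k\in \mathcal {P}} f(k)/r(k)\) matches the one quoted later in this appendix; the layout is a reconstruction:

```latex
\Pr(K = k' \mid K \in \mathcal{P})
  = \frac{f(k') / r(k')}{\sum_{k \in \mathcal{P}} f(k) / r(k)}
```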
Now let \(k''\) be another domain on the same page, i.e., \(k''\in \mathcal {P}\). Ideally, we would like the replication function to be such that \(\Pr (K=k' \mid K\in \mathcal {P}) = \Pr (K=k'' \mid K\in \mathcal {P})\) for all possible choices of \(k'\) and \(k''\). Unfortunately, it is possible to see that the only scenario where this could theoretically be achieved is one where the number of pages is equal to the number of domains, and the cost of replication would be excessive (the total size of all PageDNS pages would increase by almost a hundredfold). Instead, we try to get the ratio of those probabilities as close to 1 as possible. Since we also want to minimize the cost of replication, we do not replicate the least popular domain (i.e., \(r(N)=1\)): replication should only help to reduce the probability in Eq. 11 for high-rank domains, to get it closer to the probability of the more unpopular domains. It is reasonable therefore for the ratio to be at its maximum when \(k'=1\) and \(k''=N\).
Denoting with R the replication degree of the most popular domain (\(r(1) = R\)), and since \(r(N)=1\), Eq. 13 becomes the following:
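For two domains on the same page the denominator of the identification probability cancels, so the ratio depends only on \(f(k)/r(k)\). Evaluating it at \(k'=1\), \(k''=N\) with the Zipf form \(f(k) \propto 1/k^{s}\) gives the following expression for the maximum ratio \(\rho _{ MAX }\) (a reconstruction from the surrounding definitions):

```latex
\rho_{MAX}
  = \frac{f(1) / r(1)}{f(N) / r(N)}
  = \frac{f(1)}{R \, f(N)}
  = \frac{N^{s}}{R}
```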
All other domains should be replicated in order not to increase this ratio further. From this requirement, we obtain the following bound \(\forall k\).
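Requiring that no domain exceeds the ratio \(\rho _{ MAX } = f(1)/(R\,f(N)) = N^{s}/R\) with respect to the least popular domain leads to the following bound (Eq. 17 in the text), reconstructed here under the Zipf form \(f(k) \propto 1/k^{s}\):

```latex
\frac{f(k) / r(k)}{f(N)} \le \rho_{MAX}
\;\Longrightarrow\;
r(k) \ge \frac{f(k)}{\rho_{MAX}\, f(N)} = \frac{R}{k^{s}}
\qquad \forall k \in \{1, \dots, N\}
```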
We derived the bound in Eq. 17 for the worst case of the page containing both the most and the least popular domains, so by applying it generally to the replication for all k-s we ensure that on no page there will be two domains for which the ratio of their identification probabilities (Eq. 11) exceeds \(\rho _{ MAX }\). Furthermore, with the approximation that the denominator in Eq. 12, \(\sum _{k\in \mathcal {P}} f(k)/r(k)\), has the same value for all pages, the bound in Eq. 17 actually guarantees the following for any two pages \(\mathcal {P}\), \(\mathcal {P}'\):
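A plausible form of this cross-page guarantee, inferred from the sentence above rather than taken from the original display, is that the same ratio bound holds for domains on different pages:

```latex
\frac{\Pr(K = k' \mid K \in \mathcal{P})}{\Pr(K = k'' \mid K \in \mathcal{P}')}
  \le \rho_{MAX}
\qquad \forall k' \in \mathcal{P},\; k'' \in \mathcal{P}'
```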
Since we desire to minimize the cost of replication, we try to match the bound of Eq. 17 as closely as possible (rounding it to the nearest integer), with the additional constraint that replication of any domain be at least 1. Thus the replication function we use is the following:
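Assuming the bound takes the form \(r(k) \ge R/k^{s}\), which follows from the Zipf distribution and \(r(1)=R\), \(r(N)=1\), the replication function described here can be written as follows, where \(\lfloor \cdot \rceil \) denotes rounding to the nearest integer (a reconstruction, not the original display):

```latex
r(k) = \max\left( 1,\; \left\lfloor \frac{R}{k^{s}} \right\rceil \right)
```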
In Fig. 5 we plot the function. Note how only the most popular domains, with rank from 1 to \(k^{*}\), are replicated: these will all have approximately (because of rounding) the same identification probability, while for less popular domains the probability will be lower. We also point out that in our scenario \(R\le m\), where m is the number of pages, and that the best (lowest) probabilities are obtained for the equality: in this case, the most popular domain is replicated on all pages. For realistic values (\(m = 100,000\) and \(s=0.91\)), we obtain \(k^{*} \simeq 200,000\).
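The value of \(k^{*}\) can be reproduced numerically. The sketch below assumes a replication function of the form \(r(k) = \max (1, \mathrm{round}(R/k^{s}))\), derived from the Zipf distribution with \(r(1)=R\) and \(r(N)=1\), and takes \(R = m = 10^5\), the page count used in Appendix B. The function names are hypothetical and not part of the prototype.

```python
import math

def replication_degree(k: int, R: int, s: float = 0.91) -> int:
    """Assumed replication function: R / k**s rounded to the nearest
    integer (half rounds up), and never less than 1."""
    return max(1, math.floor(R / k ** s + 0.5))

def last_replicated_rank(R: int, s: float = 0.91) -> int:
    """Largest rank k* whose replication degree is at least 2."""
    # Closed-form starting guess: R / k**s >= 1.5  <=>  k <= (R / 1.5)**(1/s)
    k = max(1, int((R / 1.5) ** (1 / s)))
    # Correct the guess for floating-point error in either direction.
    while replication_degree(k + 1, R, s) >= 2:
        k += 1
    while k > 1 and replication_degree(k, R, s) < 2:
        k -= 1
    return k

R = 10**5  # assumption: most popular domain replicated on all m = 10^5 pages
k_star = last_replicated_rank(R)
print(f"k* = {k_star}")
```

Under these assumptions the threshold lands close to 200,000, in line with the value quoted above.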
B Page-Size Variance
We can think of the size of a page P as the sum of N random variables \(X_{k}\), each assuming value 1 if the k-th domain is assigned by the hash function to page P, and value 0 otherwise. The size of page P is thus \(X = \sum _{k=1}^{N} X_{k}\). Assuming that the hash function behaves as a random function, and considering a set of m pages, we can easily compute the expected value of the size of the generic page P as follows:
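Since a random hash function assigns each domain to page P with probability \(1/m\), linearity of expectation gives the expected page size (Eq. 20 in the text; the layout is a reconstruction):

```latex
\mu = \mathrm{E}[X] = \sum_{k=1}^{N} \mathrm{E}[X_{k}] = \sum_{k=1}^{N} \frac{1}{m} = \frac{N}{m}
```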
Since X is the sum of independent random variables with values in the set \(\{0,1\}\), we can apply the multiplicative Chernoff bound to estimate the probability that the size of a specific page will deviate from the expected value \(\mu \) by a certain factor \((1-\delta )\) (we aim to find a lower bound). The bound has the following form.
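The standard multiplicative Chernoff bound for the lower tail, valid for \(0< \delta < 1\), is shown below in both its tight and its simplified form. Which of the two the original display used is not recoverable from the text, so this is offered as a reference form only:

```latex
\Pr\bigl( X \le (1-\delta)\mu \bigr)
  \le \left( \frac{e^{-\delta}}{(1-\delta)^{1-\delta}} \right)^{\mu}
  \le e^{-\delta^{2}\mu / 2}
```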
Considering for the parameters the values \(N = 10^9\) and \(m = 10^5\), as we have done throughout the paper, we obtain from Eq. 20 that \(\mu = 10^4\). Setting \(\delta = 0.1\) for a deviation of at least 10% from the mean, Eq. 21 yields the following bound:
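As a numerical illustration, the sketch below evaluates the standard multiplicative Chernoff lower-tail bound for these parameters. It assumes the textbook forms \((e^{-\delta }/(1-\delta )^{1-\delta })^{\mu }\) and \(e^{-\delta ^{2}\mu /2}\), which may differ from the exact expression in Eq. 21, and adds a union bound over all m pages; the function name is hypothetical.

```python
import math

def chernoff_lower_tail(mu: float, delta: float) -> tuple:
    """Return (tight, simplified) multiplicative Chernoff bounds on
    Pr[X <= (1 - delta) * mu] for a sum of independent 0/1 variables."""
    log_tight = mu * (-delta - (1 - delta) * math.log(1 - delta))
    tight = math.exp(log_tight)
    simplified = math.exp(-delta ** 2 * mu / 2)
    return tight, simplified

N = 10**9   # number of domains
m = 10**5   # number of pages
mu = N / m  # expected page size
tight, simplified = chernoff_lower_tail(mu, 0.1)
# Union bound: probability that at least one of the m pages is 10% too small.
any_page_small = m * simplified

print(f"mu = {mu:.0f}")
print(f"tight bound      = {tight:.2e}")
print(f"simplified bound = {simplified:.2e}")
print(f"union bound over all pages = {any_page_small:.2e}")
```

Even after multiplying by the number of pages, the bound stays many orders of magnitude below any practically relevant probability.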
These numbers show that the probability of a page being significantly smaller than the average is negligible. An analogous Chernoff bound on the upper tail gives a similar limit on the probability that a page is more than 10% larger than the average.
Replicas Distribution. To determine the distribution of the number of replicas per page, we find that Chernoff bounds are not effective, as they do not allow us to rule out extreme cases such as having only 2 or 3 replicas on some page. Instead, we use a simulation over \(10^6\) pages, and find that the median is 25 records per page, though it can be as low as 7 in exceptional cases.
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Asoni, D.E., Hitz, S., Perrig, A. (2018). A Paged Domain Name System for Query Privacy. In: Capkun, S., Chow, S. (eds) Cryptology and Network Security. CANS 2017. Lecture Notes in Computer Science(), vol 11261. Springer, Cham. https://doi.org/10.1007/978-3-030-02641-7_12
DOI: https://doi.org/10.1007/978-3-030-02641-7_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-02640-0
Online ISBN: 978-3-030-02641-7
eBook Packages: Computer Science, Computer Science (R0)