A Paged Domain Name System for Query Privacy

  • Conference paper
Cryptology and Network Security (CANS 2017)

Part of the book series: Lecture Notes in Computer Science (LNSC, volume 11261)

Abstract

The lack of privacy in DNS and DNSSEC is a problem that has only recently begun to see widespread attention by the Internet and research communities, and the solutions proposed so far only look at a narrow slice of the design space. In this paper we investigate a new approach for a privacy-preserving DNS mechanism that hides query information from root name servers and TLD registries. Our architecture lets TLD registries group the DNS records in their zones together into pages. Resolvers cache all pages locally, and retrieve only small incremental updates to optimize performance. We show that this strategy is particularly effective given the relatively static nature of TLD zone records. We analyze the privacy guarantees to assess the potential and limitations of our approach; we also evaluate the memory overhead for a resolver, and obtain feasibility guarantees through a prototype implementation of the new functionalities for resolvers and registries.

Notes

  1. We consider SHA-256 a reasonable choice for the hash function. The output could be truncated to 16 bytes while still retaining a negligible collision probability in a non-adversarial setting, but a larger size is necessary if the probability is to remain negligible even when an adversary actively searches for a domain name that results in a collision.

  2. In practice, popularity will vary on a regional basis. We envision that replication may be made region-specific (the non-replicated part of each page would remain the same). We leave a more detailed analysis of these aspects to future work.

  3. PRPs of small domain size can be implemented using format-preserving encryption (FPE) schemes: there exist suitable encryption modes that use standard AES block ciphers as a primitive and achieve FPE over domains of arbitrary size.

  4. The pages in T are chosen uniformly at random; T depends on k only through its size. For instance, |T| may be chosen such that the total number of page requests is at least a given minimum.

  5. We are slightly approximating the exact value in Eq. 6, ignoring the fact that the pages in T are chosen from the set of all m pages excluding those that are already part of the fingerprint.

  6. To formally show this step, one needs to average out the probability over all possible replica sets that could be assumed by all domains, i.e., over all possible hash functions (or all possible sets of domain names of size N).

  7. The number of accessed domains is almost half of the 20,000 we consider, because many of them did not have a www host and because of restrictions we imposed on the loading time.

  8. The reason almost 25% of domains were not resolved is that we kept timeouts low for our monitoring and excluded the domains that frequently timed out.

  9. With a growth rate of \({\sim }5\%\) the number of domains and thus the number of pages doubles approximately every 14 years. We expect that the available bandwidth and computing power can easily keep up with the growth of PageDNS.

  10. It appears that Verisign, Inc. was able to obtain a patent [28] on this technology, and it is unclear what this will mean for its adoption.
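
The sizing argument in Note 1 can be checked with a short calculation. The sketch below is illustrative only: it assumes N = 10^9 domains (the value used in Appendix B) and applies the standard birthday (union) bound for a hash that behaves like a random function. The function names are hypothetical and not taken from the prototype.

```python
import hashlib

def truncated_sha256(name: str, n_bytes: int) -> bytes:
    """Return the first n_bytes of SHA-256(name)."""
    return hashlib.sha256(name.encode("utf-8")).digest()[:n_bytes]

def collision_bound(n_items: int, hash_bits: int) -> float:
    """Birthday (union) bound: Pr[any two of n_items collide] <= n(n-1)/2 / 2^bits.

    This covers accidental collisions only; it says nothing about an
    adversary who searches for colliding names on purpose.
    """
    return n_items * (n_items - 1) / 2 / 2 ** hash_bits

N_DOMAINS = 10 ** 9  # assumption: value of N used in Appendix B

p_128 = collision_bound(N_DOMAINS, 128)  # 16-byte truncated output
p_256 = collision_bound(N_DOMAINS, 256)  # full SHA-256 output
print(f"16 bytes: {p_128:.2e}, 32 bytes: {p_256:.2e}")
```

With a 16-byte output the accidental collision probability is already far below any practical concern. The adversarial case is different: a generic birthday search against a 128-bit output costs on the order of 2^64 hash evaluations, which is why the note prefers the larger size.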

References

  1. Google Analytics Solutions. https://www.google.com/analytics. Accessed 22 Sept 2017

  2. NSA Spying on Americans. https://www.eff.org/nsa-spying. Accessed 22 Sept 2017

  3. OpenDNS. https://www.opendns.com/. Accessed 22 Sept 2017

  4. Aguilar-Melchor, C., Barrier, J., Fousse, L., Killijian, M.-O.: XPIR: private information retrieval for everyone. In: PETS (2016)

  5. Arends, R., Austein, R., Larson, M., Massey, D., Rose, S.: DNS security introduction and requirements. RFC 4033 (2005)

  6. Barnes, R., et al.: Confidentiality in the face of pervasive surveillance: a threat model and problem statement. RFC 7624 (2015)

  7. Bernstein, D.J.: DNSCurve: usable security for DNS. https://dnscurve.org/. Accessed 22 Sept 2017

  8. Bortzmeyer, S.: DNS privacy considerations. RFC 7626 (2015)

  9. Bortzmeyer, S.: DNS query name minimisation to improve privacy. RFC 7816 (2016)

  10. Chor, B., Goldreich, O., Kushilevitz, E., Sudan, M.: Private information retrieval. In: IEEE FOCS (1995)

  11. Chor, B., Goldreich, O., Kushilevitz, E., Sudan, M.: Private information retrieval. J. ACM 45(6) (1998)

  12. Cohen, E., Kaplan, H.: Proactive caching of DNS records: addressing a performance bottleneck. In: IEEE/IPSJ International Symposium on Applications and the Internet (SAINT) (2001)

  13. Denis, F., Fu, Y.: DNSCrypt (2011). https://dnscrypt.org/. Accessed 22 Sept 2017

  14. Devet, C., Goldberg, I., Heninger, N.: Optimally robust private information retrieval. In: USENIX Security (2012)

  15. Dickinson, J., Dickinson, S., Bellis, R., Mankin, A., Wessels, D.: DNS transport over TCP - implementation requirements. RFC 7766 (2016)

  16. Dingledine, R., Mathewson, N., Syverson, P.: Tor: the second-generation onion router. In: USENIX Security (2004)

  17. Farrell, S., Tschofenig, H.: Pervasive monitoring is an attack. RFC 7258 (2014)

  18. Federrath, H., Fuchs, K.-P., Herrmann, D., Piosecny, C.: Privacy-preserving DNS: analysis of broadcast, range queries and mix-based protection methods. In: ESORICS (2011)

  19. Grothoff, C., Wachs, M., Emert, M., Appelbaum, J.: NSA’s MORECOWBELL: knell for DNS. Technical report, GNUnet e.V. (2015)

  20. Handley, M., Greenhalgh, A.: The case for pushing DNS. In: HotNets (2005)

  21. Hu, Z., Zhu, L., Heidemann, J., Mankin, A., Wessels, D., Hoffman, P.: Specification for DNS over Transport Layer Security (TLS). RFC 7858 (2016)

  22. ICANN: .com Monthly Registry Reports. https://www.icann.org/resources/pages/com-2014-03-04-en. Accessed 22 Sept 2017

  23. Jung, J., Sit, E., Balakrishnan, H., Morris, R.: DNS performance and the effectiveness of caching. IEEE/ACM TON 10(5), 589–603 (2002)

  24. Kangasharju, J., Ross, K.W.: A replicated architecture for the domain name system. In: IEEE INFOCOM (2000)

  25. Kushilevitz, E., Ostrovsky, R.: Replication is not needed: single database, computationally-private information retrieval. In: IEEE FOCS (1997)

  26. Laurie, B., Sisson, G., Arends, R., Blacka, D.: DNS security (DNSSEC) hashed authenticated denial of existence. RFC 5155 (2008)

  27. Lu, Y., Tsudik, G.: Towards plugging privacy leaks in the domain name system. In: IEEE P2P (2010)

  28. McPherson, D., Osterweil, E.: Providing privacy enhanced resolution system in the domain name system. US Patent 8,880,686 B2 (2014)

  29. Mockapetris, P.: Domain names - concepts and facilities. RFC 1034 (1987)

  30. Mockapetris, P.: Domain names - implementation and specification. RFC 1035 (1987)

  31. Ostrovsky, R., Skeith III, W.E.: A survey of single-database PIR: techniques and applications. In: PKC (2007)

  32. Pappas, V., Massey, D., Terzis, A., Zhang, L.: A comparative study of the DNS design with DHT-based alternatives. In: IEEE INFOCOM (2006)

  33. Ramasubramanian, V., Sirer, E.G.: The design and implementation of a next generation name service for the Internet. In: ACM SIGCOMM (2004)

  34. Rossow, C.: Amplification hell: revisiting network protocols for DDoS abuse. In: NDSS (2014)

  35. Alexa the Web Information Company. Alexa Top 500 Global Sites (2016). http://www.alexa.com/topsites

  36. Toledo, R.R., Danezis, G., Goldberg, I.: Lower-cost \(\epsilon \)-private information retrieval. PoPETS 2016(4), 184–201 (2016)

  37. Verisign, Inc.: The domain name industry brief, vol. 14, no. 2 (2017). https://www.verisign.com/assets/domain-name-report-Q12017.pdf. Accessed 22 Sept 2017

  38. Wachs, M., Schanzenbach, M., Grothoff, C.: A censorship-resistant, privacy-enhancing and fully decentralized name system. In: International Conference on Cryptology and Network Security (CANS) (2014)

  39. Wijngaards, W., Wiley, G.: Confidential DNS. Internet Draft draft-wijngaards-dnsop-confidentialdns-03 (2015)

  40. Zhao, F., Hori, Y., Sakurai, K.: Analysis of privacy disclosure in DNS query. In: International Conference on Multimedia and Ubiquitous Engineering (MUE) (2007)

  41. Zhu, L., Hu, Z., Heidemann, J., Wessels, D., Mankin, A., Somaiya, N.: Connection-oriented DNS to improve privacy and security. In: IEEE Symposium on Security and Privacy (2015)

Acknowledgments

We thank Jinank Jain for his help with the prototype implementation.

Author information

Corresponding author

Correspondence to Daniele E. Asoni.

Appendices

A Replication Function

Let \(\mathcal {P}\) be a page of PageDNS containing n records, and let \(k'\) be the rank of a record in \(\mathcal {P}\). For ease of notation, we write \(k' \in \mathcal {P}\), and in general we will often use a domain’s rank to refer to the domain. We assume a total of N domains, thus \(k' \in \{1,\dots ,N\}\), where \(k'=1\) is the highest rank. We consider a random variable K indicating the rank of a domain chosen at random (by a generic client) according to a Zipf distribution with parameter \(s=0.91\), i.e., such that:

$$\begin{aligned} \Pr (K = k) = f(k; s, N) = \frac{1}{k^{s} H_{N,s}} \quad \;\text {with}\ H_{N,s} = \sum _{k=1}^{N} \frac{1}{k^s} \end{aligned}$$
(10)

To simplify the notation, we will write f(k) to mean f(k; s, N) and H for \(H_{N,s}\).
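
To make Eq. 10 concrete, the following minimal sketch evaluates the Zipf probabilities for a small toy population. The size of 1,000 domains is an assumption chosen only to keep the example fast; the exponent s = 0.91 is the one stated above. Function names are hypothetical.

```python
from typing import Optional

def harmonic(n: int, s: float) -> float:
    """Generalized harmonic number H_{N,s} = sum_{k=1}^{N} k^(-s)."""
    return sum(k ** -s for k in range(1, n + 1))

def zipf_pmf(k: int, s: float, n: int, h: Optional[float] = None) -> float:
    """f(k; s, N) = 1 / (k^s * H_{N,s}), as in Eq. 10."""
    if h is None:
        h = harmonic(n, s)
    return 1.0 / (k ** s * h)

# Toy population size (assumption); s = 0.91 as in the text.
N_TOY, S = 1000, 0.91
H_TOY = harmonic(N_TOY, S)
probs = [zipf_pmf(k, S, N_TOY, H_TOY) for k in range(1, N_TOY + 1)]

print(f"sum of probabilities: {sum(probs):.6f}")
print(f"f(1) / f(N) = {probs[0] / probs[-1]:.1f}  (equals N^s)")
```

The ratio between the most and the least popular domain is N^s, which is the quantity that the replication function below is designed to reduce.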

We now express analytically the probability that an adversary, having observed a request to \(\mathcal {P}\), would assign to \(k'\) being the target domain. This is equal to the probability that \(K = k'\) given that K is restricted to \(\mathcal {P}\). By applying Bayes' theorem, we obtain the following equation:

$$\begin{aligned} \Pr (K=k' \mid K\in \mathcal {P}) = \frac{\Pr (K\in \mathcal {P}\mid K=k') \Pr (K=k')}{\sum _{k\in \mathcal {P}} \Pr (K\in \mathcal {P}\mid K=k) \Pr (K=k)} \end{aligned}$$
(11)

Probability \(\Pr (K\in \mathcal {P}\mid K=k)\) is equal to 1 if k is not replicated, since we are assuming that \(k\in \mathcal {P}\). More generally, if k has a replication degree of r(k) (i.e., the record for domain k exists on r(k) pages), then the probability of choosing the replica in \(\mathcal {P}\) is 1 / r(k). We can therefore rewrite Eq. 11 as follows:

$$\begin{aligned} \Pr (K=k' \mid K\in \mathcal {P}) = \frac{f(k')/r(k')}{\sum _{k\in \mathcal {P}} f(k)/r(k)} \end{aligned}$$
(12)
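
Eq. 12 can be evaluated directly for a single page. The sketch below is an illustration under assumptions: the page contents (ranks 1, 500 and 1000) and the replication degree of 100 for rank 1 are hypothetical values. Because H appears in both the numerator and the denominator of Eq. 12, it cancels, and the unnormalized weights k^(-s) are sufficient.

```python
S = 0.91  # Zipf exponent from the text

def posterior_on_page(page, r):
    """Eq. 12: Pr(K = k | K in P) for every rank k on the page.

    `page` is an iterable of ranks and `r` maps a rank to its replication
    degree. H cancels out, so the weight of rank k is k^(-s) / r(k).
    """
    weights = {k: k ** -S / r(k) for k in page}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}

page = [1, 500, 1000]  # hypothetical page contents

# No replication: the popular domain dominates the page.
no_repl = posterior_on_page(page, lambda k: 1)
# Hypothetical replication degree of 100 for the top-ranked domain.
with_repl = posterior_on_page(page, lambda k: 100 if k == 1 else 1)

for label, post in (("no replication", no_repl), ("r(1) = 100", with_repl)):
    print(label, {k: round(p, 4) for k, p in post.items()})
```

Replicating the popular domain spreads its requests over many pages, so a request to this particular page is much weaker evidence that rank 1 was the target.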

Now let \(k''\) be another domain on the same page, i.e., \(k''\in \mathcal {P}\). Ideally, we would like the replication function to be such that \(\Pr (K=k' \mid K\in \mathcal {P}) = \Pr (K=k'' \mid K\in \mathcal {P})\) for all possible choices of \(k'\) and \(k''\). Unfortunately, one can show that the only scenario where this could theoretically be achieved is one where the number of pages is equal to the number of domains, and the cost of replication would be excessive (the total size of all PageDNS pages would increase by almost a hundredfold). Instead, we try to get the ratio of those probabilities as close to 1 as possible. Since we also want to minimize the cost of replication, we do not replicate the least popular domain (i.e., \(r(N)=1\)): replication should only reduce the probability in Eq. 11 for high-rank domains, bringing it closer to the probability of the less popular domains. It is therefore reasonable for the ratio to be at its maximum when \(k'=1\) and \(k''=N\).

$$\begin{aligned} \rho _{ MAX }= \frac{\Pr (K=1 \mid K\in \mathcal {P})}{\Pr (K=N \mid K\in \mathcal {P})} = \frac{f(1)/r(1)}{f(N)/r(N)} \end{aligned}$$
(13)

Denoting with R the replication degree of the most popular domain (\(r(1) = R\)), and since \(r(N)=1\), Eq. 13 becomes the following:

$$\begin{aligned} \rho _{ MAX }= \frac{f(1)/R}{f(N)} = \frac{H^{-1}/R}{N^{-s} H^{-1}} = \frac{N^s}{R} \end{aligned}$$
(14)
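
As a rough numerical illustration of Eq. 14, assume the parameter values used in Appendix B (\(N = 10^9\), \(m = 10^5\)), \(s = 0.91\), and the most popular domain replicated on every page (\(R = m\)). These values are plugged in here only to give a sense of scale:

$$\begin{aligned} \rho _{ MAX }= \frac{N^s}{R} = \frac{10^{9 \cdot 0.91}}{10^{5}} = 10^{3.19} \approx 1.5\cdot 10^{3} \end{aligned}$$

Without any replication (\(R = 1\)) the same ratio would be \(N^s = 10^{8.19} \approx 1.5\cdot 10^{8}\), so under these assumptions replication lowers the worst-case ratio by a factor of \(10^5\).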

All other domains should be replicated in order not to increase this ratio further. From this requirement, we obtain the following bound \(\forall k\).

$$\begin{aligned}&\frac{\Pr (K=k \mid K\in \mathcal {P})}{\Pr (K=N \mid K\in \mathcal {P})}&\le \rho _{ MAX }\end{aligned}$$
(15)
$$\begin{aligned}&\implies \quad&\frac{f(k)/r(k)}{f(N)/r(N)} = \frac{k^{-s}H^{-1}/r(k)}{N^{-s} H^{-1}} = \frac{N^s}{k^s r(k)}&\le \frac{N^s}{R}\end{aligned}$$
(16)
$$\begin{aligned}&\implies \quad&r(k)&\ge \frac{R}{k^s} \end{aligned}$$
(17)

We derived the bound in Eq. 17 for the worst case of a page containing both the most and the least popular domains, so by applying it to the replication of every domain k we ensure that no page contains two domains whose identification probabilities (Eq. 11) have a ratio exceeding \(\rho _{ MAX }\). Furthermore, under the approximation that the denominator in Eq. 12, \(\sum _{k\in \mathcal {P}} f(k)/r(k)\), has the same value for all pages, the bound in Eq. 17 guarantees the following for any two pages \(\mathcal {P}\), \(\mathcal {P}'\):

$$\begin{aligned} \forall k\in \mathcal {P}, \forall k'\in \mathcal {P}'\quad \frac{\Pr (K=k \mid K\in \mathcal {P})}{\Pr (K=k' \mid K\in \mathcal {P}')} \le \rho _{ MAX }\end{aligned}$$
(18)
Fig. 5. Replication degree of domain names according to their rank.

Since we desire to minimize the cost of replication, we try to match the bound of Eq. 17 as closely as possible (rounding it to the nearest integer), with the additional constraint that replication of any domain be at least 1. Thus the replication function we use is the following:

$$\begin{aligned} r(k) = \max \{1, \mathrm {round}(R k^{-s})\} \end{aligned}$$
(19)

In Fig. 5 we plot the function. Note how only the most popular domains, with rank from 1 to \(k^{*}\), are replicated: these all have approximately (because of rounding) the same identification probability, while for less popular domains the probability is lower. We also point out that in our scenario \(R\le m\), where m is the number of pages, and that the best (lowest) probabilities are obtained for equality, in which case the most popular domain is replicated on all pages. For realistic values (\(m = 100,000\) and \(s=0.91\)), we obtain \(k^{*} \simeq 200,000\).
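
The replication function of Eq. 19 and the cut-off rank \(k^{*}\) can be reproduced with a few lines of code. This is a minimal sketch, assuming R = m = 10^5 (the page count used in Appendix B) and s = 0.91; the function names are hypothetical. Python's built-in `round` resolves exact .5 ties to the nearest even integer, which may differ from the authors' rounding but only matters on exact ties.

```python
def replication_degree(k: int, R: int, s: float) -> int:
    """r(k) = max{1, round(R * k^(-s))}, as in Eq. 19."""
    return max(1, round(R * k ** -s))

def largest_replicated_rank(R: int, s: float) -> int:
    """Largest rank k* with r(k*) > 1, i.e. with R * k^(-s) rounding to >= 2."""
    # Closed-form starting point: R * k^(-s) = 1.5  =>  k = (R / 1.5)^(1/s)
    k = int((R / 1.5) ** (1 / s))
    # Correct for floating-point error around the rounding threshold.
    while replication_degree(k + 1, R, s) > 1:
        k += 1
    while k > 1 and replication_degree(k, R, s) == 1:
        k -= 1
    return k

R, S = 100_000, 0.91  # assumption: R = m = 10^5 pages
k_star = largest_replicated_rank(R, S)

print("r(1)  =", replication_degree(1, R, S))
print("k*    =", k_star)
# Extra records created by replication (copies beyond the first one).
extra = sum(replication_degree(k, R, S) - 1 for k in range(1, k_star + 1))
print("extra records =", extra)
```

Under these assumptions the cut-off lands close to the \(k^{*} \simeq 200,000\) reported in the text.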

B Page-Size Variance

We can think of the size of a page P as the sum of N random variables \(X_{k}\), each assuming value 1 if the k-th domain is assigned by the hash function to page P, and value 0 otherwise. The size of page P is thus \(X = \sum _{k=1}^{N} X_{k}\). Assuming that the hash function behaves as a random function, and considering a set of m pages, we can easily compute the expected value of the size of the generic page P as follows:

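$$\begin{aligned} \mu = \mathrm {E}[X] = \sum _{k=1}^{N} \mathrm {E}[X_{k}] = \sum _{k=1}^{N} \frac{1}{m} = \frac{N}{m} \end{aligned}$$
(20)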

Since X is the sum of independent random variables with values in the set \(\{0,1\}\), we can apply the multiplicative Chernoff bound to bound the probability that the size of a specific page falls below the expected value \(\mu \) by a factor \((1-\delta )\) or more (here we bound the lower tail). The bound has the following form.

$$\begin{aligned} \Pr (X \le (1-\delta )\mu ) \le e^{-\frac{\delta ^2\mu }{2}} \end{aligned}$$
(21)

Considering for the parameters the values \(N = 10^9\) and \(m = 10^5\), as we have done throughout the paper, we obtain from Eq. 20 that \(\mu = 10^4\). Setting \(\delta = 0.1\) for a deviation of at least 10% from the mean, Eq. 21 yields the following bound:

$$\begin{aligned} \Pr (X \le 0.9 \mu ) \le e^{-\frac{10^{-2}10^4}{2}} = e^{-50} \simeq 2\cdot 10^{-22} \end{aligned}$$
(22)

These numbers show that the probability of a page being significantly smaller than the average is negligible. An analogous Chernoff bound on the upper tail gives a similar limit on the probability of a page being 10% larger than the average.
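
The numbers in Eq. 22 are easy to reproduce. The sketch below evaluates the lower-tail bound of Eq. 21 and, as an assumption, one standard form of the upper-tail multiplicative Chernoff bound, \(e^{-\delta ^2\mu /(2+\delta )}\); the text does not say which upper-tail variant the authors had in mind. It also applies a union bound over all m pages, which is an addition for illustration rather than a step taken in the paper.

```python
import math

def chernoff_lower_tail(mu: float, delta: float) -> float:
    """Upper bound on Pr(X <= (1 - delta) * mu), as in Eq. 21."""
    return math.exp(-delta ** 2 * mu / 2)

def chernoff_upper_tail(mu: float, delta: float) -> float:
    """Upper bound on Pr(X >= (1 + delta) * mu).

    One standard form of the multiplicative Chernoff bound (assumed here).
    """
    return math.exp(-delta ** 2 * mu / (2 + delta))

N, M = 10 ** 9, 10 ** 5   # number of domains and pages, as in the text
MU = N / M                # expected page size, Eq. 20
DELTA = 0.1               # 10% deviation from the mean

lower = chernoff_lower_tail(MU, DELTA)
upper = chernoff_upper_tail(MU, DELTA)
any_page_small = M * lower  # union bound over all M pages

print(f"mu = {MU:.0f}")
print(f"Pr(X <= 0.9 mu) <= {lower:.2e}")
print(f"Pr(X >= 1.1 mu) <= {upper:.2e}")
print(f"Pr(some page <= 0.9 mu) <= {any_page_small:.2e}")
```

Even after the union bound over all pages, the probability that any page is 10% smaller than average remains negligible.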

Replicas Distribution. To determine the distribution of the number of replicas per page, we find that Chernoff bounds are not effective, as they do not allow us to rule out extreme cases such as having only 2 or 3 replicas on some page. Instead, we use a simulation over \(10^6\) pages, and find that the median is 25 records per page, though it can be as low as 7 in exceptional cases.
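
A scaled-down version of such a simulation can be sketched as follows. Everything here is an assumption for illustration: the page count is reduced to m = 1,000 so that the example runs quickly, R is set equal to m, and each replicated domain is placed on r(k) distinct pages chosen uniformly at random, which may differ from the placement rule used in the paper. The resulting figures are therefore not comparable with those reported above.

```python
import random
import statistics

def replication_degree(k: int, R: int, s: float) -> int:
    """r(k) = max{1, round(R * k^(-s))}, as in Eq. 19."""
    return max(1, round(R * k ** -s))

def simulate_replicas_per_page(m: int, s: float, seed: int = 0) -> list:
    """Count, for each of m pages, the records of replicated domains on it.

    Assumption: a domain with r(k) > 1 is placed on r(k) distinct pages
    drawn uniformly at random; R is set to m.
    """
    rng = random.Random(seed)
    counts = [0] * m
    k = 1
    while True:
        r = replication_degree(k, m, s)
        if r == 1:  # r(k) is non-increasing, so no later rank is replicated
            break
        for page in rng.sample(range(m), r):
            counts[page] += 1
        k += 1
    return counts

M_TOY, S = 1000, 0.91  # toy page count (assumption); s as in the text
counts = simulate_replicas_per_page(M_TOY, S)

print("median replicas per page:", statistics.median(counts))
print("min / max per page:", min(counts), "/", max(counts))
```

Because the top-ranked domain is replicated on every page when R = m, no page ends up with zero replica records in this model.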

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Asoni, D.E., Hitz, S., Perrig, A. (2018). A Paged Domain Name System for Query Privacy. In: Capkun, S., Chow, S. (eds) Cryptology and Network Security. CANS 2017. Lecture Notes in Computer Science, vol 11261. Springer, Cham. https://doi.org/10.1007/978-3-030-02641-7_12

  • DOI: https://doi.org/10.1007/978-3-030-02641-7_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-02640-0

  • Online ISBN: 978-3-030-02641-7

  • eBook Packages: Computer Science, Computer Science (R0)
