1 Introduction

There exist various information theoretic notions of entropy that quantify the “uncertainty” of a random variable. A variable X has k bits of Shannon entropy if it cannot be compressed below k bits. In cryptography we mostly consider min-entropy, where we say that X has k bits of min-entropy, denoted \({\mathbf {H}}_{\infty }\left( X\right) =k\), if for any x, \(\Pr [X=x]\le 2^{-k}\).
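To see why min-entropy, rather than Shannon entropy, is the right notion for cryptographic applications, consider the following standard example (not specific to this paper): a variable that is easy to guess can still have large Shannon entropy.

$$\begin{aligned} \Pr [X=0^n]=\tfrac{1}{2},\qquad \Pr [X=x]=\tfrac{1}{2(2^n-1)}\ \text { for } x\ne 0^n \quad \Longrightarrow \quad {\mathbf {H}}(X)\approx \tfrac{n}{2}+1,\qquad {\mathbf {H}}_{\infty }\left( X\right) =1. \end{aligned}$$

Here an adversary guesses X with probability 1/2, so for cryptographic purposes X behaves like a single random bit, which is exactly what its min-entropy says.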

In a cryptographic context, we often have to deal with variables that only appear to have high entropy to computationally bounded observers. The most important case is pseudorandomness, where we say that \(X\in \{0,1\}^n\) is pseudorandom if it cannot be distinguished from the uniform distribution over \(\{0,1\}^n\).

More generally, we say that \(X\in \{0,1\}^n\) has \(k\le n\) bits of HILL pseudoentropy [12], denoted \({\mathbf {H}}^{\mathsf{{HILL}}}_{\epsilon ,s}(X)=k\), if it cannot be distinguished from some Y with \({\mathbf {H}}_{\infty }\left( Y\right) =k\) by any circuit of size s with advantage \(>\epsilon \). Note that we get pseudorandomness as a special case for \(k=n\). We refer to k as the quantity and to \((\epsilon ,s)\) as the quality of the entropy.

A weaker notion of pseudoentropy called Metric pseudoentropy [3] often comes up in security proofs. This notion is defined like HILL, but with the quantifiers exchanged: we only require that for every distinguisher there exists a distribution Y with \({\mathbf {H}}_{\infty }\left( Y\right) =k\) that fools this particular distinguisher (not one such Y that fools them all).

HILL pseudoentropy is named after the authors of the [12] paper where it was introduced as a tool for constructing a pseudorandom generator from any one-way function. Their construction and analysis were subsequently improved in a series of works [11, 13, 28]. A lower bound on the number of calls to the underlying one-way function was given by [14] (Footnote 1). More recently HILL pseudoentropy has been used in many other applications like leakage-resilient cryptography [6, 17], deterministic encryption [7] and memory delegation [4].

The two most important types of tools we have to manipulate pseudoentropy are chain rules and transformations from one notion into another. Unfortunately, the known transformations and chain rules lose large factors in the quality of the entropy, which results in poor quantitative security bounds that can be achieved using these tools. In this paper we provide lower bounds, showing that, unfortunately, the known results are tight (or almost tight for chain rules), at least when considering non-adaptive black-box reductions. Although black-box impossibility results have been overcome by non-black-box constructions in the past [2], we find it hard to imagine how non-black-box constructions or adaptivity could help in this setting. We believe that, relative to the oracles we construct, adaptive reductions are impossible as well, since adaptivity “obviously” is of no use here, but proving this seems hard. Our results are summarized in Figs. 1 and 2.

Complexity of the Adversary. In order to prove a black-box separation, we will construct an oracle and prove the separation unconditionally relative to this oracle, i.e., assuming all parties have access to it. This then shows that any construction/proof circumventing our separation in the plain model cannot be relativizing, which in particular rules out all black-box constructions [1, 16].

In the discussion below we measure the complexity of adversaries only in terms of the number of oracle queries. Of course, in the actual proof we also bound them in terms of circuit size. For our upper bounds the circuits will be of basically the same size as the number of oracle queries (so the number of oracle queries is a good indication of the actual size), whereas for the lower bounds, we can even consider circuits of exponential size, thus making the bounds stronger (basically, we just require that one cannot hard-code a large fraction of the function table of the oracle into the circuit).

Fig. 1.

Transformations: our bound compared to the state of the art. Our Theorem 1, stating that a loss of \({\varOmega }(\ln (1/\epsilon ')/\epsilon '^2)\) in circuit size is necessary for black-box reductions showing that deterministic Metric entropy implies randomized Metric entropy (if the advantage \(\epsilon '\) remains of the same order), requires \(\epsilon '= 2^{-O(n-k+1)}\) and thus \(\ln (1/\epsilon ')\in O(n-k+1)\), so there is no contradiction between the transformations from [3, 25] and our lower bound (i.e., the blue term is smaller than the red one). (Color figure online)

Transformations. It is often easy to prove that a variable \(X\in \{0,1\}^n\) has so-called Metric pseudoentropy against deterministic distinguishers, denoted \({\mathbf {H}}^{\mathsf{Metric},\mathsf{det},\{0,1\}}_{\epsilon ,s}(X)=k\). Unfortunately, this notion is usually too weak to be useful, as it only states that for every (deterministic, boolean) distinguisher there exists some Y with \({\mathbf {H}}_{\infty }\left( Y\right) =k\) that fools this particular distinguisher, but one usually needs a single Y that fools all (randomized) distinguishers; this is what HILL pseudoentropy captures.

Barak et al. [3] show that any variable \(X\in \{0,1\}^n\) that has Metric entropy also has the same amount of HILL entropy. Their proof uses the min-max theorem, and although it preserves the amount k of entropy, the quality drops from \((\epsilon ,s)\) to \((2\epsilon , {\varOmega }(s\cdot \epsilon ^2/n))\). A slightly better bound \( \left( 2\epsilon , {\varOmega }(s\cdot \epsilon ^2/(n+1-k)) \right) \) (where again k is the amount of Metric entropy) was given recently in [25]. The argument uses the min-max theorem and some results on convex approximation in \(L_p\) spaces.

In Theorem 1 we show that this is optimal – up to a small factor \({\varTheta }((n-k+1)/\ln (1/\epsilon ))\) – as a loss of \({\varOmega }(\ln (1/\epsilon )/\epsilon ^2)\) in circuit size is necessary for any black-box reduction. Note that for sufficiently small \(\epsilon \in 2^{-{\varOmega }(n-k+1)}\) our bound even matches the positive result up to a small constant factor.
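As a quick sanity check of this claim (a back-of-the-envelope calculation, not a statement from the paper), plug \(\epsilon = 2^{-(n-k+1)}\) into the lower bound:

$$\begin{aligned} \ln (1/\epsilon ) = (n-k+1)\ln 2 \quad \Longrightarrow \quad {\varOmega }\left( \frac{\ln (1/\epsilon )}{\epsilon ^2}\right) = {\varOmega }\left( \frac{n-k+1}{\epsilon ^2}\right) , \end{aligned}$$

which matches, up to a constant factor, the loss \(O((n+1-k)/\epsilon ^2)\) in circuit size incurred by the transformation of [25].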

The high-level idea of our separation is as follows: we construct an oracle \({{\mathcal {O}}}\) and a variable \(X\in \{0,1\}^n\) such that, relative to this oracle, X can be distinguished from any variable Y with high min-entropy when we can make one randomized query, but for any deterministic distinguisher \({\mathsf {A}}\), we can find a Y with high min-entropy which \({\mathsf {A}}\) cannot distinguish from X.

To define \({{\mathcal {O}}}\), we first choose a uniformly random subset \(S\subseteq \{0,1\}^n\) of size \(|S|=2^m\). Moreover, we choose a sufficiently large set of boolean functions \(D_1(\cdot ),\ldots ,D_h(\cdot )\) as follows: for every \(x\in S\) we set \(D_i(x)=1\) with probability 1/2 and for every \(x\not \in S\), \(D_i(x)=1\) with probability \(1/2+\delta \).

Given any x, we can distinguish \(x\in S\) from \(x\not \in S\) with advantage \(\approx 2\delta \) by querying \(D_i(x)\) for a random i. This shows that X cannot have much more than \(\log (|S|)=m\) bits of HILL entropy (in fact, even probabilistic Metric entropy), as any variable Y with \({\mathbf {H}}_{\infty }\left( Y\right) \geqslant m+1\) has at least half of its support outside S, and thus can be distinguished with advantage \(\approx 2\delta /2=\delta \) with one query as just explained. Concretely (recall that in this informal discussion we measure size simply by the number of oracle queries):

$$\begin{aligned} {\mathbf {H}}^{\mathsf{Metric},\mathsf{rand},\{0,1\}}_{\delta ,1}(X)\leqslant m+1 \end{aligned}$$

On the other hand, if the adversary is allowed q deterministic queries, then intuitively the best it can do is to query \(D_1(x),\ldots ,D_q(x)\) and guess that \(x\in S\) if less than a \(1/2+{\delta }/2\) fraction of the outputs is 1. But even if \(q=1/\delta ^2\), this strategy will fail with constant probability. Thus, we can choose a Y with large support outside S (and thus also high min-entropy) which will fool this adversary. This shows that X does have large Metric entropy against deterministic distinguishers, even if we allow the adversaries to run in time \(1/\delta ^2\). Concretely, we show that

$$\begin{aligned} {\mathbf {H}}^{\mathsf{Metric},\mathsf{det},\{0,1\}}_{{\varTheta }(\delta ),O(1/\delta ^2)}(X)\geqslant n - O(\log (1/\delta )) \end{aligned}$$
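The following small Python script (a self-contained illustration with an arbitrarily chosen bias \(\delta =0.05\), not a computation from the paper) evaluates the two error probabilities of the threshold test sketched above exactly, confirming that even with \(q=1/\delta ^2\) deterministic queries it errs with constant probability on both sides:

```python
from math import comb

def binom_tail_ge(n, p, k0):
    """Pr[Bin(n, p) >= k0], computed exactly."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k0, n + 1))

delta = 0.05                               # illustrative bias, chosen arbitrarily
q = round(1 / delta**2)                    # 400 deterministic queries
threshold = round(q * (0.5 + delta / 2))   # guess "x in S" iff fewer than this many 1s

# x in S: each D_i(x) = 1 with probability 1/2
err_in_S = binom_tail_ge(q, 0.5, threshold)               # too many 1s -> wrong guess
# x not in S: each D_i(x) = 1 with probability 1/2 + delta
err_out_S = 1 - binom_tail_ge(q, 0.5 + delta, threshold)  # too few 1s -> wrong guess

print(q, round(err_in_S, 3), round(err_out_S, 3))
# both error probabilities come out around 0.15-0.17, i.e. constants, so a Y with
# a constant fraction of its support outside S fools this q-query adversary
```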

The Adversary. Let us stress that we show impossibility in the non-uniform setting, i.e., for any input length, the distinguisher circuit can depend arbitrarily on the oracle. Like in many non-uniform black-box separation results (including [19, 22, 24, 30, 31]), the type of adversaries for which we can rigorously prove the lower bound is not completely general, but the necessary restrictions seem “obviously” irrelevant. In particular, given some input x (where we must decide if \(x\in S\)), we only allow the adversary queries on input x. This doesn't seem like a real restriction, as the distribution of \(D_i(x')\) for any \(x'\ne x\) is independent of x, and thus such queries seem useless (but they can be used to make the success probability of the adversary on different inputs correlated, and this causes a problem in the proof). Moreover, we assume the adversary makes its queries non-adaptively, i.e., it chooses the indices \(i_1,\ldots ,i_q\) before seeing the outputs of the queries \(D_{i_1}(x),\ldots ,D_{i_q}(x)\). As the distribution of all the \(D_i\)'s is identical, this doesn't seem like a relevant restriction either.

Fig. 2.

Chain rules: our lower bounds compared to the state of the art. In the literature there are basically three approaches to proving a chain rule for HILL entropy. The first one reduces the problem to an efficient version of the dense model theorem [22], the second one uses the so-called auxiliary input simulator [17], and the last one uses a convex optimization framework [21, 26]. The last approach yields a chain rule with a loss of \(\approx 2^{m}/\epsilon ^2\) in circuit size, where m is the length of the leakage Z.

Chain Rules. Most (if not all) information theoretic entropy notions H(.) satisfy some kind of chain rule, which states that the entropy of a variable X, when conditioned on another variable Z, can decrease by at most the bitlength |Z| of Z, i.e., \(H(X|Z) \geqslant H(X)-|Z|\).

Such a chain rule also holds for some computational notions of entropy. For HILL entropy a chain rule was first proven in [6, 22] by a variant of the dense model theorem, and was improved by Fuller and Reyzin [8]. A different approach using a simulator was proposed in [17] and later improved by Vadhan and Zheng [29]. A unified approach, based on convex optimization techniques, was proposed recently in [21, 26], achieving the best bounds so far.

The “dense model theorem approach” [8] proceeds as follows: one shows that if X has k bits of HILL entropy, then X|Z has \(k-m\) (where \(Z\in \{0,1\}^m\)) bits of Metric entropy. In a second step one applies a Metric-to-HILL transformation, first proven by Barak et al. [3], to argue that X|Z also has large HILL entropy. The first step loses a factor \(2^m\) in advantage, the second another \(2^{2m}/\epsilon ^2\) in circuit size. Eventually, the loss in circuit size is \(2^{2m}/\epsilon ^2\) and the loss in advantage is \(2^{m}\), which measured in terms of the security ratio size/advantage gives a total loss of \(2^{m}/\epsilon ^2\).

A more direct “simulator” approach [29] loses only a multiplicative factor \(2^{m}/\epsilon ^2\) in circuit size (there’s also an additive \(1/\epsilon ^2\) term) but there is no loss in advantage. The additive term can be improved to only \(2^{m}\epsilon ^2\) as shown in [21, 26].

In this paper we show that a loss of \(2^m/\epsilon \) is necessary. Note that this is still a factor \(1/\epsilon \) away from the positive result. Our result as stated in Theorem 2 is a bit stronger than just outlined: we show that the loss is necessary even if we only want a bound on the “relaxed” HILL entropy of X|Z (a notion weaker than standard HILL).

To prove our lower bound, we construct an oracle \({{\mathcal {O}}}(.)\), together with a joint distribution \((X,Z)\in \{0,1\}^{n}\times \{0,1\}^{m}\). We want X to have high HILL entropy relative to \({{\mathcal {O}}}(.)\), but when conditioning on Z it should decrease as much as possible (in quantity and quality).

We first consider the case \(m=1\), i.e., the conditional part Z is just one bit. For \(n \gg \ell \gg m=1\) the oracle \({{\mathcal {O}}}(.)\) and the distribution (X, Z) are defined as follows. We sample (once and for all) two (disjoint) random subsets \({\mathcal {X}}_0,{\mathcal {X}}_1\subseteq \{0,1\}^n\) of size \(|{\mathcal {X}}_0|=|{\mathcal {X}}_1|=2^{\ell -1}\), and let \({\mathcal {X}}={\mathcal {X}}_0\cup {\mathcal {X}}_1\). The oracle \({{\mathcal {O}}}(.)\) on input x is defined as follows (below \(B_p\) denotes the Bernoulli distribution with parameter p, i.e., \(\Pr [b=1\ :\ b\leftarrow B_p]=p\)).

  • If \(x\in {\mathcal {X}}_0\) output a sample of \(B_{1/2+\delta }\).

  • If \(x\in {\mathcal {X}}_1\) output a sample of \(B_{1/2-\delta }\).

  • Otherwise, if \(x\not \in {\mathcal {X}}\), output a sample of \(B_{1/2}\).

Note that our oracle \({{\mathcal {O}}}(.)\) is probabilistic, but it can be “derandomized” as we'll explain at the beginning of Sect. 4. The joint distribution (X, Z) is sampled by first sampling a random bit \(Z\leftarrow \{0,1\}\) and then \(X\leftarrow {\mathcal {X}}_Z\).

Given a tuple (V, Z), we can distinguish the case \(V=X\) from the case where \(V=Y\) for any Y with large support outside of \({\mathcal {X}}\) (X has min-entropy \(\ell \), so let's say we take a variable Y with \({\mathbf {H}}_{\infty }\left( Y|Z\right) \geqslant \ell +1\), which will have at least half of its support outside \({\mathcal {X}}\)) with advantage \({\varTheta }(\delta )\) by querying \(\alpha \leftarrow {{\mathcal {O}}}(V)\) and outputting \(\beta =\alpha \oplus Z\).

  • If \((V,Z)=(X,Z)\) then \(\Pr [\beta =1]=1/2+\delta \). To see this, consider the case \(Z=0\): then \(\Pr [\beta =1]=\Pr [\alpha =1]=\Pr [{{\mathcal {O}}}(X)=1]=1/2+\delta \); the case \(Z=1\) is symmetric.

  • If \((V,Z)=(Y,Z)\) then \(\Pr [\beta =1]= \Pr [Y\not \in {\mathcal {X}}](1/2)+\Pr [Y\in {\mathcal {X}}](1/2+\delta )\le 1/2+\delta /2\).

Therefore X|Z doesn’t have \(\ell +1\) bits of HILL entropy

$$\begin{aligned} {\mathbf {H}}^{\mathsf{{HILL}}}_{\delta /2, 1}(X|Z)< \ell +1 \end{aligned}$$
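A minimal Monte-Carlo sketch of this construction for \(m=1\) (toy parameters \(n=16\), \(\ell =8\), \(\delta =0.05\), chosen only for illustration): it samples the sets \({\mathcal {X}}_0,{\mathcal {X}}_1\), the oracle, and the joint distribution (X, Z), and estimates the advantage of the one-query distinguisher \(\beta ={{\mathcal {O}}}(V)\oplus Z\) described above.

```python
import random

n, ell, delta = 16, 8, 0.05          # toy parameters, for illustration only

# sample two disjoint random sets X0, X1 of size 2^(ell-1) each
points = random.sample(range(2**n), 2**ell)
X0, X1 = set(points[:2**(ell - 1)]), set(points[2**(ell - 1):])

def oracle(x):
    """O(x): a sample of B_{1/2+delta} on X0, B_{1/2-delta} on X1, B_{1/2} elsewhere."""
    p = 0.5 + delta if x in X0 else 0.5 - delta if x in X1 else 0.5
    return 1 if random.random() < p else 0

def sample_XZ():
    """The joint distribution (X, Z): Z is a uniform bit, X is uniform on X_Z."""
    z = random.randrange(2)
    return random.choice(tuple(X1 if z else X0)), z

def beta(v, z):
    """The one-query distinguisher from the text: output O(v) xor z."""
    return oracle(v) ^ z

trials = 200_000
p_real = sum(beta(*sample_XZ()) for _ in range(trials)) / trials
p_unif = sum(beta(random.randrange(2**n), random.randrange(2)) for _ in range(trials)) / trials
print(p_real, p_unif)   # about 1/2 + delta versus about 1/2, i.e. advantage roughly delta
```

Here the comparison distribution is simply uniform over \(\{0,1\}^n\); any Y with most of its support outside \({\mathcal {X}}\) behaves the same way.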

On the other hand, we claim that X (without Z, but with access to \({{\mathcal {O}}}(.)\)) cannot be distinguished from the uniform distribution over \(\{0,1\}^n\) with advantage \({\varTheta }(\delta )\) unless we allow the distinguisher \({\varOmega }(1/\delta )\) oracle queries (the hidden constant in \({\varTheta }(\delta )\) can be made arbitrarily large by setting the hidden constant in \({\varOmega }(1/\delta )\) small enough), i.e.,

$$\begin{aligned} {\mathbf {H}}^{\mathsf{{HILL}}}_{{\varTheta }(\delta ), {\varOmega }(1/\delta )}(X)=n \end{aligned}$$
(1)

To see why (1) holds, we first note that given some V, a single oracle query is useless to tell whether \(V=X\) or \(V=U_n\): although in the case where \(V=X\in {\mathcal {X}}_Z\) the output \({{\mathcal {O}}}(X)\) will have bias \(\delta \), one can't decide in which direction the bias goes as Z is (unconditionally) pseudorandom. If we're allowed on the order of \(1/\delta ^2\) queries, we can distinguish X from \(U_n\) with constant advantage, as with \(1/\delta ^2\) samples one can distinguish the distribution \(B_{1/2+\delta }\) (or \(B_{1/2-\delta }\)) from \(B_{1/2}\) with constant advantage. If we just want \({\varTheta }(\delta )\) advantage, \({\varOmega }(1/\delta )\) samples are necessary, which proves (1). While it is easy to prove that for a coin with bias \(\delta \) one needs \(O\left( 1/\delta ^2\right) \) trials to achieve \(99\,\%\) certainty, finding the number of trials needed for a confidence level that is only o(1), as in our case, is more challenging. We solve this problem by a tricky application of Rényi divergences (Footnote 2). The statement of our “coin problem” with precise bounds is given in Lemma 3.
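To make the \(1/\delta \) versus \(1/\delta ^2\) regimes concrete, the sketch below (again with an arbitrarily chosen \(\delta =0.05\)) computes, exactly, the statistical distance between q fair coin flips and q flips of a coin whose bias is \(+\delta \) or \(-\delta \) with probability 1/2 each; since both distributions are exchangeable, the number of 1s is a sufficient statistic, so this distance upper-bounds the advantage of any distinguisher that bases its decision on q such samples.

```python
from math import comb

def binom_pmf(q, p, k):
    return comb(q, k) * p**k * (1 - p)**(q - k)

def best_advantage(q, delta):
    """Statistical distance between Bin(q, 1/2) and the symmetric mixture
    0.5*Bin(q, 1/2+delta) + 0.5*Bin(q, 1/2-delta)."""
    mix = lambda k: 0.5 * (binom_pmf(q, 0.5 + delta, k) + binom_pmf(q, 0.5 - delta, k))
    return 0.5 * sum(abs(binom_pmf(q, 0.5, k) - mix(k)) for k in range(q + 1))

delta = 0.05                                       # illustrative value only
for q in (round(1 / delta), round(1 / delta**2)):  # 20 vs. 400 samples
    print(q, round(best_advantage(q, delta), 4))
# the advantage scales roughly like q*delta^2: with ~1/delta samples it is only
# O(delta), and ~1/delta^2 samples are needed to reach constant advantage
```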

So far, we have only sketched the case \(m=1\). For \(m>1\), we define a random function \(\pi :\{0,1\}^n\rightarrow \{0,1\}^{m-1}\). The oracle now takes an extra \(m-1\) bit string j, and for \(x\in {\mathcal {X}}\), the output of \({{\mathcal {O}}}(x,j)\) only has bias \(\delta \) if \(\pi (x)=j\) (and is a uniform bit everywhere else). We define the joint distribution (X, Z) by sampling \(X\leftarrow {\mathcal {X}}\), defining \(Z'\) s.t. \(X\in {\mathcal {X}}_{Z'}\), and setting \(Z=\pi (X)\Vert Z'\). Now, given Z, we can make one query \(\alpha \leftarrow {{\mathcal {O}}}(V,Z[1\ldots m-1])\) and output \(\beta =\alpha \oplus Z[m]\), and, as before, we get advantage \(\delta \) in distinguishing X from any Y with min-entropy \(\ge \ell +1\).

On the other hand, given some V (but no Z) it is now even harder to tell if \(V=X\) or \(V=Y\). Not only do we not know in which direction the bias goes, as before in the case \(m=1\) (this information is encoded in the last bit Z[m] of Z), but we also don't know at which index \(\pi (V)\) (in the case \(V=X\)) we have to query the oracle to observe any bias at all. As there are \(2^{m-1}\) possible choices for \(\pi (V)\), this intuitively means we need \(2^{m-1}\) times as many samples as before to observe any bias, which generalizes (1) to

$$\begin{aligned} {\mathbf {H}}^{\mathsf{{HILL}}}_{{\varTheta }(\delta ), {\varOmega }(2^{m-1}/\delta ^{})}(X)=n \end{aligned}$$

1.1 Some Implications of Our Lower Bounds

Leakage Resilient Cryptography. The chain rule for HILL entropy is a main technical tool used in several security proofs like the construction of leakage-resilient schemes [6, 20]. Here, the quantitative bound provided by the chain rule directly translates into the amount of leakage these constructions can tolerate. Our Theorem 2 implies a lower bound on the necessary security degradation for this proof technique. This degradation is, unfortunately, rather severe: even if we just leak \(m=1\) bit, we will lose a factor \(2^m/\epsilon \), which for a typical security parameter \(\epsilon =2^{-80}\) means a security degradation of “80 bits”.

Let us also mention that Theorem 2 answers a question raised by Fuller and Reyzin [8], showing that for any chain rule the simultaneous loss in quality and quantity is necessary (Footnote 3).

Faking Auxiliary Inputs. [17, 27, 29] consider the question of how efficiently one can “fake” auxiliary inputs. Concretely, given any joint distribution (X, Z) with \(Z\in \{0,1\}^m\), construct an efficient simulator h s.t. (X, h(X)) is \((\epsilon ,s)\)-indistinguishable from (X, Z). For example, [29] gives a simulator h of complexity \(O\left( 2^{m}\epsilon ^{-2}\cdot s\right) \) (plus additive terms independent of s). This result has found many applications in leakage-resilient crypto, complexity theory and zero-knowledge. The best known lower bound (assuming exponentially hard OWFs) is \({{\varOmega }}\left( \max (2^{m},1/\epsilon )\right) \). Since the chain rule for relaxed HILL entropy follows by a simulator argument [17] with the same complexity loss, our Theorem 2 yields a better lower bound of \({{\varOmega }}\left( 2^{m}/\epsilon \right) \) on the complexity of simulating auxiliary inputs.

Dense Model Theorem. The computational dense model theorem [22] says, roughly speaking, that dense subsets of pseudorandom distributions are computationally indistinguishable from truly dense distributions. It has found applications including differential privacy, memory delegation, graph decompositions and additive combinatorics. It is well known that the worst-case chain rule for HILL entropy is equivalent to the dense model theorem, as one can think of dense distributions as uniform distributions X given short leakage Z. For settings with constant density, which correspond to \(|Z|=O\left( 1\right) \), HILL and relaxed HILL entropy are equivalent [17]; moreover, the complexity loss in the chain rule is then equal to the cost of transforming Metric entropy into HILL entropy. Now our Theorem 1 implies a necessary loss in circuit size of \({{\varOmega }}\left( 1/\epsilon ^2\right) \) if one wants \(\epsilon \)-indistinguishability. This way we reprove the tight lower bound due to Zhang [31] for constant densities.

2 Basic Definitions

Let \(X_1\) and \(X_2\) be two distributions over the same finite set. The statistical distance of \(X_1\) and \(X_2\) equals \(\mathrm {SD}\left( X_1 ; X_2 \right) = \frac{1}{2}\sum _{x}\left| \Pr [X_1=x] - \Pr [X_2=x]\right| \).

Definition 1

(Min-Entropy). A random variable X has min-entropy k, denoted by \({\mathbf {H}}_{\infty }\left( X\right) =k\), if \(\max _x\Pr [X=x]\le 2^{-k}\).

Definition 2

(Average conditional min-Entropy [5]). For a pair (X, Z) of random variables, the average min-entropy of X conditioned on Z is

$$\begin{aligned} {\widetilde{{\mathbf {H}}}}_\infty (X|Z) = -\log {{\mathrm{{\mathbb {E}}}}}_{z\leftarrow Z}[\max _x\Pr [X=x|Z=z]] = -\log {{\mathrm{{\mathbb {E}}}}}_{z\leftarrow Z}[2^{-{\mathbf {H}}_{\infty }\left( X|Z=z\right) }] \end{aligned}$$
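As a concrete reference for Definitions 1 and 2 and the statistical distance above, here is a small Python sketch (a toy illustration, not part of the paper) evaluating these quantities on explicit probability tables:

```python
from math import log2

def min_entropy(px):
    """H_inf(X) = -log2 max_x Pr[X=x]; px maps outcomes to probabilities."""
    return -log2(max(px.values()))

def statistical_distance(p1, p2):
    """SD(X1; X2) = 1/2 * sum_x |Pr[X1=x] - Pr[X2=x]|."""
    support = set(p1) | set(p2)
    return 0.5 * sum(abs(p1.get(x, 0.0) - p2.get(x, 0.0)) for x in support)

def avg_cond_min_entropy(pxz):
    """Average conditional min-entropy of X given Z, where pxz maps pairs
    (x, z) to Pr[X=x, Z=z]; uses E_z[max_x Pr[X=x|Z=z]] = sum_z max_x Pr[X=x, Z=z]."""
    best = {}
    for (x, z), p in pxz.items():
        best[z] = max(best.get(z, 0.0), p)
    return -log2(sum(best.values()))

# toy example: X uniform on {0, 1, 2, 3}, leakage Z = X mod 2
px  = {x: 0.25 for x in range(4)}
pxz = {(x, x % 2): 0.25 for x in range(4)}
print(min_entropy(px))            # 2.0
print(avg_cond_min_entropy(pxz))  # 1.0: one bit of leakage costs one bit here
print(statistical_distance(px, {0: 0.4, 1: 0.3, 2: 0.2, 3: 0.1}))  # 0.2
```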

Distinguishers. We consider several classes of distinguishers. With \({\mathcal {D}}^{\mathsf {rand},\{0,1\}}_s\) we denote the class of randomized circuits of size at most s with boolean output (this is the standard non-uniform class of distinguishers considered in cryptographic definitions). The class \({\mathcal {D}}^{\mathsf {rand},[0,1]}_s\) is defined analogously, but with real-valued output in [0, 1]. \({\mathcal {D}}^{\mathsf {det},\{0,1\}}_s,{\mathcal {D}}^{\mathsf {det},[0,1]}_s\) are defined as the corresponding classes for deterministic circuits. With \({\varDelta }^D(X;Y)=|{{\mathrm{{\mathbb {E}}}}}_X[D(X)]-{{\mathrm{{\mathbb {E}}}}}_Y[D(Y)]|\) we denote D's advantage in distinguishing X and Y.

Definition 3

(HILL pseudoentropy [12, 15]). A variable X has HILL entropy at least k if

$$\begin{aligned} {\mathbf {H}}^{\mathsf{{HILL}}}_{\epsilon ,s}(X)\ge k\iff \exists Y,\ {\mathbf {H}}_{\infty }\left( Y\right) =k \ \forall D\in {\mathcal {D}}^{\mathsf {rand},\{0,1\}}_s:{\varDelta }^D(X;Y)\le \epsilon \end{aligned}$$

For a joint distribution (X, Z), we say that X has k bits of conditional HILL entropy (conditioned on Z) if

$$\begin{aligned}&{\mathbf {H}}^{\mathsf{{HILL}}}_{\epsilon ,s}(X|Z)\ge k\\\iff & {} \exists (Y,Z),{\widetilde{{\mathbf {H}}}}_\infty (Y|Z)=k \ \forall D\in {\mathcal {D}}^{\mathsf {rand},\{0,1\}}_s:{\varDelta }^D((X,Z);(Y,Z))\le \epsilon \end{aligned}$$

Definition 4

(Metric pseudoentropy [3]). A variable X has Metric entropy at least k if

$$\begin{aligned} {\mathbf {H}}^{\mathsf{{Metric}}}_{\epsilon ,s}(X)\ge k\iff \forall D\in {\mathcal {D}}^{\mathsf {rand},\{0,1\}}_s\exists Y_D\ ,\ {\mathbf {H}}_{\infty }\left( Y_D\right) =k \ :\ {\varDelta }^D(X;Y_D)\le \epsilon \end{aligned}$$

Metric star entropy is defined analogously but using deterministic real-valued distinguishers

$$\begin{aligned} {\mathbf {H}}^{{\textsf {Metric}}*}_{\epsilon ,s}(X)\ge k\iff \forall D\in {\mathcal {D}}^{\mathsf {det},[0,1]}_s\exists Y_D,\ {\mathbf {H}}_{\infty }\left( Y_D\right) =k :{\varDelta }^D(X;Y_D)\le \epsilon \end{aligned}$$

Relaxed Versions of HILL and Metric Entropy. A weaker notion of conditional HILL entropy allows the conditional part to be replaced by some computationally indistinguishable variable

Definition 5

(Relaxed HILL pseudoentropy [9, 23]). For a joint distribution (X, Z) we say that X has relaxed HILL entropy k conditioned on Z if

$$\begin{aligned}&{\mathbf {H}}^{\mathsf{{HILL-rlx}}}_{\epsilon ,s}(X|Z)\ge k\\\iff & {} \exists (Y,Z'), {\widetilde{{\mathbf {H}}}}_\infty (Y|Z')=k, \forall D\in {\mathcal {D}}^{\mathsf {rand},\{0,1\}}_s\ :\ {\varDelta }^D((X,Z);(Y,Z'))\le \epsilon \end{aligned}$$

The above notion of relaxed HILL entropy satisfies a chain rule, whereas the chain rule for the standard definition of conditional HILL entropy is known to be false [18]. One can analogously define relaxed variants of Metric entropy; we won't give these as they will not be required in this paper.

Pseudoentropy Against Different Distinguisher Classes. For randomized distinguishers, it's irrelevant whether the output is boolean or real-valued, as we can replace any \(D\in {\mathcal {D}}^{\mathsf {rand},[0,1]}_s\) with a \(D'\in {\mathcal {D}}^{\mathsf {rand},\{0,1\}}_s\) s.t. \({{\mathrm{{\mathbb {E}}}}}[D'(X)]={{\mathrm{{\mathbb {E}}}}}[D(X)]\) by setting (for any x) \(\Pr [D'(x)=1]={{\mathrm{{\mathbb {E}}}}}[D(x)]\). For HILL entropy (as well as for its relaxed version), it also doesn't matter if we consider randomized or deterministic distinguishers in Definition 3, as we can always “fix” the randomness to an optimal value. This is no longer true for Metric entropy (Footnote 4), and thus the distinction between Metric and Metric star entropy is crucial.
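Spelled out, this replacement does not change any advantage because the expectations agree (a one-line check):

$$\begin{aligned} {{\mathrm{{\mathbb {E}}}}}[D'(X)]=\sum _x \Pr [X=x]\cdot \Pr [D'(x)=1]=\sum _x \Pr [X=x]\cdot {{\mathrm{{\mathbb {E}}}}}[D(x)]={{\mathrm{{\mathbb {E}}}}}[D(X)], \end{aligned}$$

and likewise with Y in place of X, hence \({\varDelta }^{D'}(X;Y)={\varDelta }^{D}(X;Y)\).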

3 A Lower Bound on Metric-to-HILL Transformations

Theorem 1

For every n, k, m and \(\epsilon \) such that \(n \geqslant k + \log (1/\epsilon )+4\), \(\frac{1}{8}>\epsilon \) and \(n-1\ge m > 6\log (1/\epsilon )\) there exist an oracle \({{\mathcal {O}}}\) and a distribution X over \(\{0,1\}^n\) such that

$$\begin{aligned} {\mathbf {H}}^{\mathsf{Metric},\mathsf{det},\{0,1\}}_{\epsilon ,T}(X)\geqslant k \end{aligned}$$
(2)

where the complexity bound T covers any circuit of size \(2^{O(m)}\) that makes at most \(\frac{\ln (2/\epsilon )}{216\epsilon ^2} \) non-adaptive queries and, simultaneously,

$$\begin{aligned} {\mathbf {H}}^{\mathsf{Metric},\mathsf{rand},\{0,1\}}_{2\epsilon ,T'}(X)\leqslant m+1 \end{aligned}$$
(3)

where the distinguisher's size \(T'\) is only O(n) and its query complexity is 1.

Let S be a random subset of \(\{0,1\}^{n}\) of size \(2^{m}\), where \(m\leqslant n-1\), and let \(D_1,\ldots ,D_h\) be boolean functions drawn independently from the following distribution D: \(D(x)=1\) with probability p if \(x\in S\) and \(D(x)=1\) with probability q if \(x\in S^c\), where \(p>q\) and \(p+q=1\). Denote \(X=U_{S}\). We will argue that the Metric entropy against a probabilistic adversary who is allowed one query is roughly m with advantage \({\varOmega }(p-q)\), but the Metric entropy against a non-adaptive deterministic adversary who can make t queries of the form \(D_i(x)\) is much bigger, even if \(t = O\left( (p-q)^{-2}\right) \). Let us sketch an informal argument before we give the actual proof. We need to prove two facts:

  (i) There is a probabilistic adversary \({\mathsf {A}}^{*}\) such that with high probability over \(X,D_1,\ldots ,D_h\) we have \({\varDelta }^{{\mathsf {A}}^{*}}(X;Y) = {\varOmega }( p-q)\) for all Y with \({\mathbf {H}}_{\infty }\left( Y\right) \geqslant m+1\).

  (ii) For every deterministic adversary \({\mathsf {A}}\) making at most \(t=O\left( (p-q)^{-2}\right) \) non-adaptive queries, with high probability over \(X,D_1,\ldots ,D_h\) we have \({\varDelta }^{{\mathsf {A}}}(X;Y)= 0\) for some Y with \({\mathbf {H}}_{\infty }\left( Y\right) =n-{\varTheta }(1)\).

To prove (i) we observe that the probabilistic adversary can distinguish between S and \(S^c\) by exploiting the different biases towards 1. We simply let \({\mathsf {A}}^*\) forward its input to \(D_i\) for a randomly chosen i, i.e.,

$$\begin{aligned} {\mathsf {A}}^{*}(x) = D_i(x),\quad i\leftarrow [1,\ldots ,h] \end{aligned}$$

With extremely high probability we have \(\Pr [{\mathsf {A}}^{*}(x)=1] \in [p-\delta ,p+\delta ]\) if \(x\in S\) and \(\Pr [{\mathsf {A}}^{*}(x)=1] \in [q-\delta ,q+\delta ]\) if \(x\not \in S\) for some \(\delta \ll p-q\) (by a Chernoff bound, \(\delta \) drops exponentially fast in h, so we just have to set h large enough). We then have \(\Pr [{\mathsf {A}}^{*}(X)=1] \geqslant p-\delta \) and \(\Pr [{\mathsf {A}}^{*}(Y)=1] \leqslant 1/2\cdot ( p+q+2\delta )\) for every Y of min-entropy at least \(m+1\) (since then \(\Pr [Y\in S] \leqslant 1/2\)). This yields \({\varDelta }^{{\mathsf {A}}^{*}}(X;Y) \geqslant (p-q)/2 - 2\delta \approx (p-q)/2\). In order to prove (ii) one might intuitively argue that the best a t-query deterministic adversary can do to contradict (ii) is to guess whether some value x has bias p or \(q=1-p\) by taking the majority of t samples

$$\begin{aligned} {\mathsf {A}}(x) = \mathrm {Maj}(D_1(x),\ldots ,D_t(x)) \end{aligned}$$

But even if \(t={\varTheta }( 1/(p-q)^2)\), majority will fail to predict the bias with constant probability. This means there exists a variable Y with min-entropy \(n-{\varTheta }(1)\) such that \(\Pr [{\mathsf {A}}(Y)=1]=\Pr [{\mathsf {A}}(X)=1]\). The full proof gives quantitative forms of (i) and (ii), showing essentially that “majority is best”, and appears in Appendix A.
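For reference, the concentration inequality invoked in the sketch of (i) is the standard Chernoff-Hoeffding bound: for any fixed x, the empirical frequency of 1s among \(D_1(x),\ldots ,D_h(x)\) satisfies

$$\begin{aligned} \Pr \left[ \left| \tfrac{1}{h}\sum _{i=1}^{h} D_i(x) - {{\mathrm{{\mathbb {E}}}}}[D(x)]\right| \geqslant \delta \right] \leqslant 2e^{-2\delta ^2 h}, \end{aligned}$$

so choosing \(h = {\varOmega }\left( \delta ^{-2}\, n\right) \) suffices to make the deviation bound hold simultaneously for all \(2^n\) inputs x (except with exponentially small probability), by a union bound.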

4 Lower Bounds on Chain Rules

For any \(n\gg \ell \gg m\), we construct a distribution \((X,Z)\in \{0,1\}^n\times \{0,1\}^m\) and an oracle \({{\mathcal {O}}}(.)\) such that relative to this oracle, X has very large HILL entropy but the HILL entropy of X|Z is much lower in quantity and quality: for arbitrary \(n\gg \ell \gg m\) (where \(|Z|=m\), \(X\in \{0,1\}^n\)), the quantity drops from n to \(\ell -m+2\) (in particular, by much more than \(|Z|=m\)), even if we allow for a \(2^m/\epsilon \) drop in quality.

Theorem 2

(A lower bound on the chain rule for \({\mathbf {H}}^{\mathsf{{HILL-rlx}}}\) ). There exists a joint distribution (X, Z) over \(\{0,1\}^n\times \{0,1\}^m\) and an oracle \({{\mathcal {O}}}\) such that, relative to \({{\mathcal {O}}}\), for any \((\ell ,\delta )\) such that \(\frac{n}{2} - \frac{\log (1/\delta )}{2} > m\) and \(\ell > m + 6\log (1/\delta ) \), we have

$$\begin{aligned} {\mathbf {H}}^{\mathsf{{HILL}}}_{{\delta }/{2},T}(X) = n \end{aligned}$$
(4)

where \(T > c\cdot 2^{m}/\delta \) for some absolute constant c (Footnote 5), but

$$\begin{aligned} {\mathbf {H}}^{\mathsf{{HILL-rlx}}}_{\delta /2,T'}(X|Z) < \ell +1 \end{aligned}$$
(5)

where \(T' \) captures a circuit of size only O(n) making only 1 oracle query.

Remark 1

(On the technical restrictions). Note that the assumptions on \(\ell \) and \(\delta \) are automatically satisfied in most interesting settings, as typically we assume \(m\ll n\) and \(\log (1/\delta ) \ll n\).

Remark 2

(A strict separation). The theorem also holds if we insist on a larger distinguishing advantage after leakage. Concretely, allowing for more than just one oracle query, the \(\delta /2\) advantage in (5) can be amplified to \(C\delta \) for any constant C assuming \(\delta \) is small enough to start with (see Remark 4 in the proof).

The full proof appears in Appendix B. The heart of the argument is a lower bound on the query complexity for the corresponding “coin problem”: we need to distinguish T uniformly random bits from the distribution where, with probability 1/2 each, we sample T independent bits from \(B_p\) or T independent bits from \(B_q\), where \(p=\frac{1}{2}+\delta \) and \(q=1-p\) (see Appendix C for more details). The rest of the proof is based on a standard concentration argument, using Chernoff bounds extensively.

5 Open Problems

As shown in Fig. 2, there remains a gap between the best proofs for the chain rule, which lose a factor \(2^{|Z|}/\epsilon ^2\) in circuit size, and the loss of \(2^{|Z|}/\epsilon \) that we prove necessary in this paper. Closing this gap, by either improving the proof for the chain rule or giving an improved lower bound, remains an intriguing open problem.

Our lower bounds are only proven for adversaries that make their queries non-adaptively. Adaptive queries don’t seem to help against our oracle, but rigorously proving this fact seems tricky.

Finally, the lower bounds we prove on the loss of circuit size assume that the distinguishing advantage remains roughly the same. There exist results which are not of this form, in particular – as shown in Fig. 2 – the HILL to Metric transformation from [8] only loses in distinguishing advantage, not in circuit size (i.e., we have \(s\approx s'\)). Proving lower bounds and giving constructions for different circuit size vs. distinguishing advantage trade-offs leave many challenges for future work.