1 Introduction

Lattice-based cryptography has emerged in the last few decades as one of the most important developments in cryptography. Lattice-based cryptographic schemes have been shown to achieve functionalities that are unknown under any other cryptographic structure (such as fully homomorphic encryption [Gen09, BV11], attribute-based encryption for circuits [GVW13] and many others). At the same time, it is possible in many cases to show strong security properties such as worst-case to average-case hardness results [Ajt96, AD97, MR04, Reg05], which relate the hardness of breaking the cryptographic scheme to that of solving approximate short-vector problems in worst-case lattices, a problem that resists algorithmic progress even when the use of quantum computers is considered.

Much of the progress in advancing lattice-based cryptography can be attributed to the hardness of the Learning with Errors (LWE) problem, introduced by Regev [Reg05]. This problem has a very clean linear-algebraic syntax, which makes it easy to use in applications, and at the same time it was shown to enjoy worst-case hardness as explained above. An instance of the LWE problem has the following form. It is parameterized by a dimension n and a modulus \(q \gg n\). Consider the following distribution. Sample a (public) random matrix \(\mathbf {{A}} \in {\mathbb Z}_q^{n \times m}\), for arbitrary \(m = \mathsf {poly}(n)\), and a (secret) random vector \(\varvec{\mathbf {s}} \in {\mathbb Z}_q^n\), and output \((\mathbf {{A}}, \varvec{\mathbf {y}})\), where \(\varvec{\mathbf {y}} = \varvec{\mathbf {s}}\mathbf {{A}} + \varvec{\mathbf {e}} \pmod {q}\), and \(\varvec{\mathbf {e}}\) is a noise vector selected from some distribution (often a Gaussian with parameter \(\sigma \ll q\)). The goal of the LWE solver is to find \(\varvec{\mathbf {s}}\) given \((\mathbf {{A}}, \varvec{\mathbf {y}})\), where m can be as large as the adversary desires. In the most straightforward use of this assumption for cryptography (suggested in Regev’s original paper), \((\mathbf {{A}}, \varvec{\mathbf {y}})\) is used as the public key of an encryption scheme, and \(\varvec{\mathbf {s}}\) is the secret key. Similar roles are assumed in other cryptographic constructions.
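
For concreteness, the following is a minimal sketch (in Python) of sampling such an LWE instance. All parameter values are illustrative toy choices, and the rounded normal noise merely stands in for the Gaussian error distribution of the actual problem.

```python
# Toy sketch of sampling an LWE instance (A, y = sA + e mod q); illustrative only.
import numpy as np

rng = np.random.default_rng(0)
n, m, q, sigma = 8, 16, 3329, 3.0        # toy parameters, far below cryptographic size

A = rng.integers(0, q, size=(n, m))      # public uniform matrix A in Z_q^{n x m}
s = rng.integers(0, q, size=n)           # secret uniform vector s in Z_q^n
e = np.rint(rng.normal(0, sigma, size=m)).astype(int)   # rounded Gaussian noise with parameter sigma
y = (s @ A + e) % q                      # published vector y = sA + e (mod q)

print(A.shape, y.shape)
```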

Goldwasser et al. [GKPV10] initiated the study of the hardness of LWE when \(\varvec{\mathbf {s}}\) is not chosen uniformly at random. This study was motivated by the desire to achieve an entropic notion of security that guarantees the problem remains hard even if some information about \(\varvec{\mathbf {s}}\) is leaked. They showed that if \(\varvec{\mathbf {s}}\) is sampled from a binary distribution (i.e. supported over \(\{0,1\}^n\)), then LWE remains hard so long as \(\varvec{\mathbf {s}}\) has sufficient entropy. In fact, sampling \(\varvec{\mathbf {s}}\) from a (possibly sparse) binary distribution is attractive in other contexts as well, such as constructing efficient post-quantum cryptographic objects [NIS], minimizing noise blowup in homomorphic encryption [BGV12], classical worst-case to average-case reductions [BLP+13] and proving hardness for the so-called Learning with Rounding (LWR) problem [BPR12, BGM+16]. Progress on understanding entropic LWE in the binary setting was made in subsequent works [BLP+13, Mic18].

However, the question of the hardness of LWE with imperfect secret distributions carries significance beyond the binary setting. If we consider the key-leakage problem, then changing the honest key distribution to be binary just for the sake of improving robustness against key leakage carries a heavy cost in performance and security in the case that no leakage occurs. An entropic hardness result for the general uniform setting is thus a natural question. Furthermore, for a problem as important as LWE, the mere scientific understanding of the robustness of the problem to small changes in the prescribed distributions and parameters stands as a self-supporting goal.

Alas, it appears that current approaches provide no insight for the general setting. Existing results can be extended beyond the binary setting so long as the norm of the vector \(\varvec{\mathbf {s}}\) is bounded, i.e. so long as the secret distribution is contained within some small enough ball, as was made explicit by Alwen et al. [AKPW13]. However, this appears to be an artifact of the proof technique, and it was speculated by some that a general entropic LWE result should exist. Exploring the hardness of general entropic LWE is the goal of this work.

1.1 Our Results

We relate the hardness of Entropic LWE for arbitrary distributions to a basic property of the distribution, namely how badly the distribution performs as an error-correcting code against Gaussian noise. Specifically, let \(\mathcal{S}\) be some distribution over secrets in \({\mathbb Z}_q^n\). Recall the notion of conditional smooth min-entropy \(\tilde{H}_\infty \) and define the noise lossiness of \(\mathcal{S}\) as

$$\begin{aligned} \nu _{\sigma }(\mathcal{S}) = \tilde{H}_\infty (\varvec{\mathbf {s}} | \varvec{\mathbf {s}}+\varvec{\mathbf {e}}) = - \log \left( \Pr _{\varvec{\mathbf {s}}, \varvec{\mathbf {e}}}[\mathcal{A}^*(\varvec{\mathbf {s}}+\varvec{\mathbf {e}}) = \varvec{\mathbf {s}}] \right) {,} \end{aligned}$$
(1)

where \(\varvec{\mathbf {s}}\) is sampled from \(\mathcal{S}\) and \(\varvec{\mathbf {e}}\) is (continuous, say) Gaussian noise with parameter \(\sigma \), and \(\mathcal{A}^*\) is the optimal maximum-likelihood decoder for \(\varvec{\mathbf {s}}\), namely \(\mathcal{A}^*(\varvec{\mathbf {y}}) = \arg \max _{\varvec{\mathbf {s}}^*} \Pr _{\varvec{\mathbf {s}}, \varvec{\mathbf {e}}} [\varvec{\mathbf {s}} = \varvec{\mathbf {s}}^* | \varvec{\mathbf {y}}=\varvec{\mathbf {s}}+\varvec{\mathbf {e}}]\). This notion is a min-entropy analogue of the notion of equivocation for Shannon entropy, and can be seen as the guaranteed (rather than average) information loss of a Gaussian channel.
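
For intuition, here is a rough Monte Carlo sketch that estimates the noise lossiness of a tiny one-dimensional flat distribution by implementing the maximum-likelihood decoder explicitly. It treats the Gaussian parameter as a standard deviation for simplicity, and all concrete values are illustrative, so it only conveys the flavor of the definition.

```python
# Monte Carlo estimate of nu_sigma(S) = -log Pr[A*(s+e) = s] for a tiny flat distribution S.
import numpy as np

rng = np.random.default_rng(1)
q, sigma = 257, 8.0
support = rng.choice(q, size=16, replace=False)   # flat S: uniform over 16 random points of Z_q

def ml_decode(y):
    # for a flat distribution under Gaussian noise, the ML decoder returns the
    # support point closest to y modulo q
    d = np.abs(support - y)
    d = np.minimum(d, q - d)
    return support[np.argmin(d)]

trials = 20000
s = rng.choice(support, size=trials)
e = rng.normal(0, sigma, size=trials)
hits = np.mean([ml_decode((si + ei) % q) == si for si, ei in zip(s, e)])
print("estimated noise lossiness (bits):", -np.log2(hits))
```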

We advocate for noise lossiness as a new and natural measure for a distribution and show that it allows us to get a good handle on the entropic LWE question. We do this by showing that distributions with sufficiently high noise lossiness lead to hard instances of Entropic LWE (under assumptions, see details below). We then show that high min-entropy implies (some limited level of) noise lossiness, which allows us to derive hardness results for general Entropic LWE. We furthermore show that results for distributions supported inside a ball can also be derived using our technique, and that the noise lossiness of such distributions is larger than that of general distributions. Finally, we show that our bounds for the general entropic setting are essentially tight. See below for details.

Noise Lossiness Implies Entropic LWE Hardness (Sect. 4). We show that high noise lossiness implies entropic hardness. Our result relies on the hardness of the decision version of LWE (with “standard” secret distribution). Whereas the variant we discussed so far is the search variant, which asserts that finding \(\varvec{\mathbf {s}}\) given \((\mathbf {{A}}, \varvec{\mathbf {y}})\) should be hard, the decision variant dLWE asserts that it is computationally hard to even distinguish \((\mathbf {{A}}, \varvec{\mathbf {y}})\) from \((\mathbf {{A}}, \varvec{\mathbf {u}})\) where \(\varvec{\mathbf {u}} \in {\mathbb Z}_q^m\) is uniform. The hardness of decision LWE immediately implies the hardness of search LWE; the converse also holds, though not for every noise distribution, and only via a reduction that incurs some cost. This is also the case in the entropic setting. By default, when we refer to (Entropic) LWE in this work, we refer to the search version. We will mention explicitly when referring to the decision version.

Our results in this setting are as follows.

Theorem 1.1

(Main Theorem, Informal). Assume that decision LWE with dimension k, modulus q and Gaussian noise parameter \(\gamma \) is hard. If \(\mathcal{S}\) is a distribution over \({\mathbb Z}_q^n\) with \(\nu _{\sigma _1}(\mathcal{S}) \ge k \log (q) + \omega (\log \lambda )\) for some parameter \(\sigma _1\), then Entropic LWE with secret distribution \(\mathcal{S}\) and Gaussian noise parameter \(\sigma \approx \sigma _1 \gamma \sqrt{m}\) is hard.

Our actual theorem is even more expressive in two aspects. First, while the above result applies to search Entropic LWE for all values of q, in some cases, e.g. when q is prime, it also applies to decision Entropic LWE. Second, in the case where \(\mathcal{S}\) is supported inside a ball, the term \(k \log (q)\) can be relaxed to roughly \(k \log (\gamma r)\), where r is the radius of the ball (this only applies to the search version).

We note that we incur a loss in noise that depends on \(\sqrt{m}\), i.e. depends on the number of LWE samples. This is inherent in our proof technique, but using known statistical or computational rerandomization results, this dependence can be replaced by dependence on \(n, \gamma \).

As explained above, most of our results imply hardness for search Entropic LWE and do not directly imply hardness for the decision version (albeit search-to-decision reductions can be applied, as we explain below). We note that this is because our proof technique applies even in cases where the decision problem is not hard at all. We view this as a potentially useful property which may find future applications. To illustrate, consider the setting where the modulus q is even and the distributions of \(\varvec{\mathbf {s}}\) and \(\varvec{\mathbf {e}}\) are supported on even values. (Indeed, usually we consider the coordinates of \(\varvec{\mathbf {e}}\) to be continuous Gaussians or discrete Gaussians over \({\mathbb Z}\), but one may be interested in a setting where they are, say, discrete Gaussians over \(2{\mathbb Z}\).) In this setting, decision LWE is trivially easy, but search LWE remains hard. Our techniques (as detailed in the technical overview below) naturally extend to this setting and can be used to prove entropic hardness in this case as well.

In the standard regime of parameters, where \(\varvec{\mathbf {e}}\) is a continuous Gaussian, we can derive the hardness of the decision problem using known search-to-decision reductions. The most generic version, as in e.g. [Reg05], runs in time \(q \cdot \mathsf {poly}(n)\) but in many cases the dependence on q can be eliminated [Pei09, MM11]. In particular we note that in the ball-bounded setting, search-to-decision does not incur dependence on q.

Noise-Lossiness and Entropy (Sect. 5). We analyze the relation between noise-lossiness and min-entropy of a distribution both in the general setting and in the ball-bounded setting. We derive the following bounds.

Lemma 1.2

(Noise-Lossiness of General Distributions). Let \(\mathcal{S}\) be a general distribution over \({\mathbb Z}_q^n\), then \(\nu _\sigma (\mathcal{S}) \ge \tilde{H}_\infty (\varvec{\mathbf {s}}) - n \log (q/\sigma ) -1\).

Lemma 1.3

(Noise-Lossiness of Small Distributions). Let \(\mathcal{S}\) be a distribution over \({\mathbb Z}_q^n\) which is supported only inside a ball of radius r, then \(\nu _\sigma (\mathcal{S}) \ge \tilde{H}_\infty (\varvec{\mathbf {s}}) - 2 r \sqrt{n} / \sigma \).

Putting these results together with our main theorem, we get general Entropic LWE hardness whenever \(\tilde{H}_\infty (\varvec{\mathbf {s}}) \gtrsim k \log (q) + n \log (q \gamma \sqrt{m} /\sigma )\). In the r-ball-bounded setting we require entropy \(\tilde{H}_\infty (\varvec{\mathbf {s}}) \gtrsim k \log (\gamma r) + 2 r \sqrt{n m} \gamma / \sigma \). Note that if we make the very strong (yet not implausible) assumption that LWE is sub-exponentially secure, then we can use complexity leveraging: choosing k to be polylogarithmic and \(\sigma \) large enough that the second term vanishes, we get entropic hardness even with \(\tilde{H}_\infty (\varvec{\mathbf {s}})\) which is polylogarithmic in the security parameter, and in particular independent of \(\log (q)\).

Tightness (Sects. 6 and 7). We provide two tightness results. The first one is essentially a restatement of a bound that was shown in the Ring-LWE setting by Bolboceanu et al. [BBPS19]. It is unconditional, but requires q to have a factor of an appropriate size.

Theorem 1.4

(Counterexample for Entropic LWE, Informal [BBPS19]). Let n, q, \(\sigma \) be LWE parameters. If there exists p s.t. p|q and \(p \approx \sigma \sqrt{n}\), then there exists a distribution \(\mathcal{S}\) with min-entropy roughly \(n \log (q/\sigma )\), such that Entropic LWE is insecure with respect to \(\mathcal{S}\).

However, the above requires that q has a factor of appropriate size. One could wonder whether one can do better for a prime q. While we do not have an explicit counterexample here, we can show that proving such a statement (i.e. security for Entropic LWE with entropy below roughly \(n \log (q/\sigma )\)) cannot be done via a black-box reduction to a standard “game-based” assumption. In particular, if the reduction can only access the adversary and the distribution of secrets as black boxes, then the entropy bound \(n \log (q/\sigma )\) applies even for prime q.

Theorem 1.5

(Barrier for Entropic LWE, Informal). Let n, q, \(\sigma \) be LWE parameters. Then there is no black-box reduction from Entropic LWE with entropy \(\ll n \log (q/\sigma )\) to any game-based cryptographic assumption.

1.2 Technical Overview

We provide a technical overview of our main contributions.

The Lossiness Approach to Entropic LWE. The starting point of our proof is the lossiness approach. This approach (in slightly different variants) was used in all existing hardness results for Entropic LWE [GKPV10]. However, prior works were only able to use it for norm-bounded secrets. We show a minor yet crucial modification that allows us to relate the hardness of Entropic LWE to the noise lossiness of the secret distribution.

Fix parameters \(n,q,\sigma \) and recall that the adversary is given \((\mathbf {{A}}, \varvec{\mathbf {y}})\), where \(\mathbf {{A}}\) is uniform, \(\varvec{\mathbf {y}} = \varvec{\mathbf {s}} \mathbf {{A}} + \varvec{\mathbf {e}} \pmod {q}\), \(\varvec{\mathbf {s}}\) sampled from \(\mathcal{S}\) and \(\varvec{\mathbf {e}}\) is a (continuous) Gaussian with parameter \(\sigma \). The lossiness approach replaces the uniform matrix \(\mathbf {{A}}\) with an “LWE matrix” of the form: \(\mathbf {{B}}\mathbf {{C}}+\mathbf {{F}}\), where \(\mathbf {{B}} \in {\mathbb Z}_q^{n \times k}\), \(\mathbf {{C}} \in {\mathbb Z}_q^{k \times m}\) are uniform, and \(k \ll n, m\), and where \(\mathbf {{F}}\) is a matrix whose every element is a (discrete) Gaussian with parameter \(\gamma \). The decisional LWE assumption with dimension k, modulus q and noise parameter \(\gamma \) asserts that \(\mathbf {{B}}\mathbf {{C}}+\mathbf {{F}}\) is computationally indistinguishable from a uniform matrix, and therefore the adversary should also be able to recover \(\varvec{\mathbf {s}}\) when \((\mathbf {{A}}, \varvec{\mathbf {y}})\) is generated using \(\mathbf {{A}} = \mathbf {{B}}\mathbf {{C}}+\mathbf {{F}}\). At this point, the vector \(\varvec{\mathbf {y}}\) is distributed as

$$ \varvec{\mathbf {y}} = \varvec{\mathbf {s}} \mathbf {{A}} + \varvec{\mathbf {e}} = \varvec{\mathbf {s}}\mathbf {{B}}\mathbf {{C}}+\varvec{\mathbf {s}}\mathbf {{F}} + \varvec{\mathbf {e}}{.} $$

The strategies on how to continue from here diverge. The [GKPV10] approach is to say that when \(\varvec{\mathbf {s}}\) is confined inside a ball, and when \(\varvec{\mathbf {e}}\) is a wide enough Gaussian, then the value \(\varvec{\mathbf {s}}\mathbf {{F}} + \varvec{\mathbf {e}}\) is “essentially independent” of \(\varvec{\mathbf {s}}\). This is sometimes referred to as “noise flooding” since the noise \(\varvec{\mathbf {e}}\) “floods” the value \(\varvec{\mathbf {s}}\mathbf {{F}}\) and minimizes its effect. This allows us to apply the leftover hash lemma to argue that \(\varvec{\mathbf {s}}\mathbf {{B}}\) is statistically close to a uniform \(\varvec{\mathbf {s}}'\) and obtain a new “standard” LWE instance. The [BLP+13, Mic18] approaches can be viewed as variants of this method, where the argument on \(\varvec{\mathbf {s}}\mathbf {{F}} + \varvec{\mathbf {e}}\) is refined in non-trivial ways to achieve better parameters.
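
The following short sketch constructs the “LWE matrix” \(\mathbf {{B}}\mathbf {{C}}+\mathbf {{F}}\) used in this argument. The rounded normal entries of \(\mathbf {{F}}\) are only a stand-in for discrete Gaussians, and the indistinguishability of the result from a uniform matrix is of course the decisional LWE assumption in dimension k, not something the code demonstrates.

```python
# Constructing the lossy matrix B*C + F that replaces the uniform A; toy parameters.
import numpy as np

rng = np.random.default_rng(2)
n, m, k, q, gamma = 16, 32, 4, 3329, 2.0             # k << n, m; all values illustrative

B = rng.integers(0, q, size=(n, k))                   # uniform in Z_q^{n x k}
C = rng.integers(0, q, size=(k, m))                   # uniform in Z_q^{k x m}
F = np.rint(rng.normal(0, gamma, size=(n, m))).astype(int)   # stand-in for discrete Gaussian entries
A_lossy = (B @ C + F) % q                             # assumed indistinguishable from uniform under dLWE
print(A_lossy.shape)
```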

This type of argument cannot work for the general setting (i.e. when \(\varvec{\mathbf {s}}\) is not short) since in this case \(\varvec{\mathbf {s}}\mathbf {{F}} + \varvec{\mathbf {e}}\) can reveal noticeable information about \(\varvec{\mathbf {s}}\). For example, if \(\varvec{\mathbf {s}}\) is a multiple of some large enough factor then the noise \(\varvec{\mathbf {e}}\) can just be rounded away (indeed this will be the starting point for our tightness result, as we explain further below).

Our approach therefore is to resort to a weaker claim. We do not try to change \(\varvec{\mathbf {y}}\) into a form of standard LWE; instead, all we show is that \(\varvec{\mathbf {y}}\) loses information about \(\varvec{\mathbf {s}}\). Namely, we will show that even information-theoretically it is not possible to recover \(\varvec{\mathbf {s}}\) from \((\mathbf {{A}}, \varvec{\mathbf {y}})\). This approach was taken, for example, by Alwen et al. [AKPW13], but they were unable to show lossiness for the general setting. The reason, essentially, is that they also use a refined version of noise flooding, one that does not require \(\varvec{\mathbf {e}}\) to completely flood \(\varvec{\mathbf {s}}\mathbf {{F}}\), but only to slightly perturb it. We call it “gentle flooding” for the purposes of this work. A similar argument was used in [DM13] to establish hardness of LWE with uniform errors from a short interval.

We note that in all flooding methods, it is beneficial for \(\mathbf {{F}}\) to contain values that are as small as possible. Therefore, in order to show hardness for \(\varvec{\mathbf {s}}\) with as low entropy as possible, the parameter \(\gamma \) should be taken as small as possible, while still supporting the hardness of distinguishing \(\mathbf {{B}}\mathbf {{C}}+\mathbf {{F}}\) from uniform.

Our Approach: Gentle Flooding at the Source. Our approach can be viewed in hindsight as a very simple modification of the lossiness/flooding approach, that results in a very clean statement, and the characterization of the noise lossiness as the “right” parameter for the hardness of Entropic LWE.

We take another look at the term \(\varvec{\mathbf {s}}\mathbf {{F}} + \varvec{\mathbf {e}}\) and recall that our goal is to use \(\varvec{\mathbf {e}}\) to lose information about \(\varvec{\mathbf {s}}\). Clearly, if \(\varvec{\mathbf {e}}\) were of the form \(\varvec{\mathbf {e}}_1 \mathbf {{F}}\), then things would be more approachable, since we would simply have \((\varvec{\mathbf {s}}+\varvec{\mathbf {e}}_1)\mathbf {{F}}\), and we would simply need to argue about the lossiness of \(\varvec{\mathbf {s}}\) under additive Gaussian noise (which is exactly our notion of noise lossiness for the distribution \(\mathcal{S}\)). Our observation is that even though \(\varvec{\mathbf {e}}\) does not have this form, the properties of the Gaussian distribution allow us to write \(\varvec{\mathbf {e}}\) as \(\varvec{\mathbf {e}} = \varvec{\mathbf {e}}_1 \mathbf {{F}}+ \varvec{\mathbf {e}}_2\), where \(\varvec{\mathbf {e}}_1, \varvec{\mathbf {e}}_2\) are independent random variables (but the distribution of \(\varvec{\mathbf {e}}_2\) depends on \(\mathbf {{F}}\)). This is easiest to analyze when \(\varvec{\mathbf {e}}\) is a continuous Gaussian, which is the approach we take in this work.

It can be shown essentially by definition that the sum of two independent Gaussian vectors with covariance matrices \(\mathbf {{\varSigma }}_1, \mathbf {{\varSigma }}_2\) is a Gaussian with covariance matrix \(\mathbf {{\varSigma }}_1 + \mathbf {{\varSigma }}_2\). It follows that if we choose \(\varvec{\mathbf {e}}_1\) to be a spherical Gaussian with parameter \(\sigma _1\), then \(\varvec{\mathbf {e}}_1 \mathbf {{F}}\) will have covariance matrix \(\sigma _1^2 \mathbf {{F}}^T \mathbf {{F}}\). Therefore, if we choose \(\varvec{\mathbf {e}}_2\) to be an aspherical Gaussian with covariance \(\sigma ^2 \mathbf {{I}} - \sigma _1^2 \mathbf {{F}}^T \mathbf {{F}}\), we get that \(\varvec{\mathbf {e}} = \varvec{\mathbf {e}}_1 \mathbf {{F}} + \varvec{\mathbf {e}}_2\) is indeed a spherical Gaussian with parameter \(\sigma \). There is an important caveat here: the matrix \(\sigma ^2 \mathbf {{I}} - \sigma _1^2 \mathbf {{F}}^T \mathbf {{F}}\) must be a valid covariance matrix, i.e. positive semidefinite. To guarantee this, we must set the ratio \(\sigma /\sigma _1\) to be at least the largest singular value of the matrix \(\mathbf {{F}}\). Standard results on singular values of Gaussian matrices imply that the largest singular value is roughly \(\sqrt{m} \gamma \), which governs the ratio between \(\sigma _1\) and \(\sigma \). We stress again that \(\varvec{\mathbf {e}}_1\) and \(\varvec{\mathbf {e}}_2\) are independent random variables.
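
The following numerical sketch illustrates this decomposition: it forms the covariance \(\sigma ^2 \mathbf {{I}} - \sigma _1^2 \mathbf {{F}}^T \mathbf {{F}}\), samples \(\varvec{\mathbf {e}}_2\) via a Cholesky factor, and checks empirically that \(\varvec{\mathbf {e}}_1 \mathbf {{F}} + \varvec{\mathbf {e}}_2\) has covariance close to \(\sigma ^2 \mathbf {{I}}\). Gaussian parameters are treated as standard deviations and \(\mathbf {{F}}\) is a rounded normal matrix, so this only illustrates the linear-algebraic identity, not the exact distributions used in the proof.

```python
# Numerical check of the decomposition e = e1*F + e2 with cov(e) = sigma^2 * I.
import numpy as np

rng = np.random.default_rng(3)
n, m, gamma = 8, 32, 2.0
F = np.rint(rng.normal(0, gamma, size=(n, m)))       # stand-in for a discrete Gaussian matrix
sigma_F = np.linalg.norm(F, 2)                       # largest singular value of F
sigma1 = 1.0
sigma = 1.1 * sigma1 * sigma_F                       # need sigma > sigma1 * sigma_F below

Sigma2 = sigma**2 * np.eye(m) - sigma1**2 * F.T @ F  # covariance of e2 (positive definite here)
L = np.linalg.cholesky(Sigma2)

N = 50000
e1 = sigma1 * rng.standard_normal((N, n))            # spherical Gaussian e1
e2 = rng.standard_normal((N, m)) @ L.T               # aspherical Gaussian e2 with covariance Sigma2
e = e1 @ F + e2

emp = np.cov(e, rowvar=False)
print("max relative deviation from sigma^2*I:",
      np.max(np.abs(emp - sigma**2 * np.eye(m))) / sigma**2)
```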

Once we have established the decomposition of the Gaussian, we can write \(\varvec{\mathbf {y}}\) as

$$ \varvec{\mathbf {y}} = \varvec{\mathbf {s}} \mathbf {{A}} + \varvec{\mathbf {e}} = \varvec{\mathbf {s}}\mathbf {{B}}\mathbf {{C}}+(\varvec{\mathbf {s}}+\varvec{\mathbf {e}}_1)\mathbf {{F}} + \varvec{\mathbf {e}}_2{.} $$

Now, our noise lossiness term \(\nu _{\sigma _1}(\mathcal{S}) = \tilde{H}_\infty (\varvec{\mathbf {s}} | \varvec{\mathbf {s}}+\varvec{\mathbf {e}}_1)\) naturally emerges. Note that \(\varvec{\mathbf {y}}\) cannot provide more information about \(\varvec{\mathbf {s}}\) than the two variables \((\varvec{\mathbf {s}}\mathbf {{B}}, \varvec{\mathbf {s}}+\varvec{\mathbf {e}}_1)\). Since the former contains only \(k \log q\) bits, it follows that if the noise lossiness is sufficiently larger than \(k \log q\), then naturally \(\tilde{H}_\infty (\varvec{\mathbf {s}} | \varvec{\mathbf {s}}+\varvec{\mathbf {e}}_1, \varvec{\mathbf {s}}\mathbf {{B}})\) is non-trivial (we need \(\omega (\log \lambda )\) where \(\lambda \) is the security parameter), which implies that finding \(\varvec{\mathbf {s}}\) is information theoretically hard. Thus the hardness of Entropic (search) LWE is established.

If in addition \(\mathbf {{B}}\) can serve as an extractor (this is the case when the modulus q is prime, or when \(\mathcal{S}\) is binary), then we can make a stronger claim, namely that \(\varvec{\mathbf {s}}\mathbf {{B}}\) is statistically close to uniform, and then apply (standard) LWE again in order to obtain hardness for Entropic dLWE directly.

Finally, we notice that for norm-bounded distributions we can improve the parameters further by using LWE in Hermite Normal Form (HNF), which has been shown to be equivalent to standard LWE in [ACPS09]. HNF LWE allows us to argue that \(\mathbf {{B}}\mathbf {{C}}+\mathbf {{F}}\) is indistinguishable from uniform even when the elements of \(\mathbf {{B}}\) are also sampled from a Gaussian with parameter \(\gamma \) (same as \(\mathbf {{F}}\)). Using HNF, we can further bound the entropy loss caused by the term \(\varvec{\mathbf {s}}\mathbf {{B}}\) and achieve a bound that is independent of q, and only depends on \(\gamma , r, \sigma \). We can only apply this technique to Entropic search LWE.

For the complete analysis and formal statement of the result, see Sect. 4.

Computing The Noise Lossiness. We briefly explain the intuition behind the noise lossiness computation. The exact details require calculation and are detailed in Sect. 5.

For the sake of this overview, let us consider only “flat” distributions, i.e. ones that are uniform over a set of K strings (and thus have min-entropy \(\log K\)). We will provide an upper bound on the probability \(\Pr _{\varvec{\mathbf {s}}, \varvec{\mathbf {e}}}[\mathcal{A}^*(\varvec{\mathbf {s}}+\varvec{\mathbf {e}}) = \varvec{\mathbf {s}}]\) from Eq. (1), which will immediately translate to a bound on the noise-lossiness.

For general distributions, we note that we can write

$$ \Pr _{\varvec{\mathbf {s}}, \varvec{\mathbf {e}}}[\mathcal{A}^*(\varvec{\mathbf {s}}+\varvec{\mathbf {e}}) = \varvec{\mathbf {s}}] = \int _{\varvec{\mathbf {y}}} \Pr _{\varvec{\mathbf {s}}, \varvec{\mathbf {e}}}[\varvec{\mathbf {s}}+\varvec{\mathbf {e}} = \varvec{\mathbf {y}} \wedge \mathcal{A}^*(\varvec{\mathbf {y}}) = \varvec{\mathbf {s}} ] dy{,} $$

where the integral is over the entire q-cube (we use an integral since the distribution of \(\varvec{\mathbf {e}}\) is continuous, but the calculation with a discrete Gaussian would be very similar). Note that the expression \(\Pr _{\varvec{\mathbf {s}}, \varvec{\mathbf {e}}}[\varvec{\mathbf {s}}+\varvec{\mathbf {e}} = \varvec{\mathbf {y}} \wedge \mathcal{A}^*(\varvec{\mathbf {y}}) = \varvec{\mathbf {s}} ]\) can be written as \(\Pr _{\varvec{\mathbf {s}}, \varvec{\mathbf {e}}}[\mathcal{A}^*(\varvec{\mathbf {y}})+\varvec{\mathbf {e}} = \varvec{\mathbf {y}} \wedge \mathcal{A}^*(\varvec{\mathbf {y}}) = \varvec{\mathbf {s}} ]\), which can then be decomposed since the event \(\mathcal{A}^*(\varvec{\mathbf {y}})+\varvec{\mathbf {e}} = \varvec{\mathbf {y}}\) depends only on \(\varvec{\mathbf {e}}\) and the event \(\mathcal{A}^*(\varvec{\mathbf {y}}) = \varvec{\mathbf {s}}\) depends only on \(\varvec{\mathbf {s}}\) (recall that \(\varvec{\mathbf {y}}\) is fixed at this point). We can therefore write

$$ \Pr _{\varvec{\mathbf {s}}, \varvec{\mathbf {e}}}[\mathcal{A}^*(\varvec{\mathbf {s}}+\varvec{\mathbf {e}}) = \varvec{\mathbf {s}}] = \int _{\varvec{\mathbf {y}}} \Pr _{\varvec{\mathbf {e}}}[\varvec{\mathbf {e}} = \varvec{\mathbf {y}} - \mathcal{A}^*(\varvec{\mathbf {y}})] \cdot \Pr _{\varvec{\mathbf {s}}}[\mathcal{A}^*(\varvec{\mathbf {y}}) = \varvec{\mathbf {s}}]dy{.} $$

Now, for all \(\varvec{\mathbf {y}}\) it holds that \(\Pr _{\varvec{\mathbf {s}}}[\mathcal{A}^*(\varvec{\mathbf {y}}) = \varvec{\mathbf {s}}] \le 1/K\), simply since \(\mathcal{A}^*(\varvec{\mathbf {y}})\) is a fixed value. It also holds that \(\Pr _{\varvec{\mathbf {e}}}[\varvec{\mathbf {e}} = \varvec{\mathbf {y}} - \mathcal{A}^*(\varvec{\mathbf {y}})]\) is bounded by the maximum value of the Gaussian density, which is \(1/\sigma ^n\). We get that

$$ \Pr _{\varvec{\mathbf {s}}, \varvec{\mathbf {e}}}[\mathcal{A}^*(\varvec{\mathbf {s}}+\varvec{\mathbf {e}}) = \varvec{\mathbf {s}}] \le \frac{1}{K \sigma ^n} \int _{\varvec{\mathbf {y}}} dy = \frac{q^n}{K \sigma ^n}{,} $$

and Lemma 1.2 follows, since taking the negative logarithm of this bound gives \(\log K - n \log (q/\sigma ) = \tilde{H}_\infty (\varvec{\mathbf {s}}) - n \log (q/\sigma )\).

For the setting of Lemma 1.3, recall that \(\mathcal{S}\) is supported only over r-norm-bounded vectors. Note that the analysis above is correct up to and including the conclusion that \(\Pr _{\varvec{\mathbf {s}}}[\mathcal{A}^*(\varvec{\mathbf {y}}) = \varvec{\mathbf {s}}] \le 1/K\). Furthermore, \(\mathcal{A}^*(\varvec{\mathbf {y}})\) must return a value in the support of \(\mathcal{S}\), which is therefore of norm at most r. We therefore remain with the challenge of bounding \(\int _{\varvec{\mathbf {y}}} \Pr _{\varvec{\mathbf {e}}}[\varvec{\mathbf {e}} = \varvec{\mathbf {y}} - \mathcal{A}^*(\varvec{\mathbf {y}})] dy\), when we are guaranteed that \(\left\| \mathcal{A}^*(\varvec{\mathbf {y}})\right\| \le r\). We can deduce that this shift can only induce a minor perturbation of the \(\varvec{\mathbf {e}}\) Gaussian. Using Gaussian tail bounds, the result follows.

Tightness. The result of [BBPS19] (Theorem 1.4 above) is quite straightforward in our setting (they showed a ring variant which is somewhat more involved). The idea is to choose \(\mathcal{S}\) to be uniform over the set of all vectors that are multiples of p (or in the [BBPS19] terminology, uniform over an ideal dividing the ideal generated by q). This distribution has min-entropy \(n \log (q/p) \approx n \log (q/\sigma )\) (since \(p \approx \sigma \sqrt{n}\), and the \(\sqrt{n}\) factor is absorbed into the approximation), and it clearly leads to an insecure LWE instance since the instance can be taken modulo p in order to recover the noise, and then once the noise is removed the secret can easily be recovered.
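
The following toy sketch walks through this attack in the plain (non-ring) setting: the coordinates of the secret are multiples of p with \(p \mid q\), so reducing \(\varvec{\mathbf {y}}\) modulo p exposes the noise, after which the secret is recovered by linear algebra modulo q. All parameters are illustrative, and the column-submatrix search and the sympy-based modular inverse are merely convenient implementation choices.

```python
# Toy version of the counterexample attack: s is supported on multiples of p, where p | q.
import numpy as np
from itertools import combinations
from math import gcd
from sympy import Matrix

rng = np.random.default_rng(4)
n, m, p, sigma = 6, 12, 64, 4.0
q = p * 97                                           # p divides q and p is well above the noise

A = rng.integers(0, q, size=(n, m))
s = p * rng.integers(0, q // p, size=n)              # secret: uniform multiples of p
e = np.rint(rng.normal(0, sigma, size=m)).astype(int)
y = (s @ A + e) % q

# Step 1: since s*A = 0 (mod p), the centered residue of y mod p recovers e (w.h.p.).
e_rec = ((y % p) + p // 2) % p - p // 2
rhs = (y - e_rec) % q                                # equals s*A (mod q) once the noise is stripped

# Step 2: solve for s using any column submatrix of A that is invertible mod q.
for cols in combinations(range(m), n):
    sub = Matrix(A[:, cols].tolist())
    if gcd(int(sub.det()) % q, q) == 1:
        s_rec = Matrix([rhs[list(cols)].tolist()]) * sub.inv_mod(q)
        s_rec = [int(v) % q for v in s_rec]
        break

print("secret recovered:", s_rec == [int(v) for v in s])
```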

The above argument seems to “unfairly” rely on the structure of the modulus q, and one could hope that for a prime q, which has no non-trivial factors, a better result can be achieved. We extend a methodology due to Wichs [Wic13] to show that if such a result exists then it will require non-black-box use of the adversary and/or the sampler for the distribution \(\mathcal{S}\). Consider a black-box reduction that, given access to an entropic LWE adversary \(\mathcal{A}\) and a sampler for \(\mathcal{S}\) (we overload the notation and denote the sampler by \(\mathcal{S}\) as well), manages to solve some hard problem, e.g. solve a standard LWE instance.

We show that it is possible to efficiently (jointly) simulate \(\mathcal{A}, \mathcal{S}\), such that in the eyes of a reduction they are indistinguishable from a real high-entropy distribution \(\mathcal{S}\) and an adversary \(\mathcal{A}\) that solves Entropic LWE on it, thus leading to an efficient unconditional algorithm for said hard problem. The basic idea relies on the natural intuition that it is hard to generate a “valid” LWE instance without knowing the value of \(\varvec{\mathbf {s}}\) that is being used. While this intuition is false in many situations, we show that in the entropic setting with black-box reductions it can be made formal.

Specifically, consider \(\mathcal{S}\) that is just a uniform distribution over a set of K randomly chosen strings (note that this distribution does not have an efficient sampler, but a black-box reduction is required to work in such a setting as well, and we will show how to simulate \(\mathcal{S}\) efficiently). The adversary \(\mathcal{A}\), upon receiving an instance \((\mathbf {{A}}, \varvec{\mathbf {y}})\), first checks that \(\mathbf {{A}}\) is full rank (otherwise it returns \(\bot \)), and if so it brute-forces \(\varvec{\mathbf {s}}\) out of \(\varvec{\mathbf {y}}\) by trying all possible \(\varvec{\mathbf {s}}^*\) in the support of \(\mathcal{S}\); if there is one for which \(\varvec{\mathbf {y}} -\varvec{\mathbf {s}}^* \mathbf {{A}} \pmod {q}\) is short (i.e. of the length that we expect from noise with Gaussian parameter \(\sigma \)), then it returns a random such \(\varvec{\mathbf {s}}^*\) as its answer (otherwise it returns \(\bot \)). This is a valid adversary for Entropic LWE and therefore it should allow the reduction to solve the hard problem.

Now, let us show how to simulate \(\mathcal{S}, \mathcal{A}\) efficiently. The idea is to rely on the intuition that the reduction cannot generate valid LWE instances with values of \(\mathcal{S}\) that it does not know, and since the distribution is sparse, the reduction cannot generate strings in the support of \(\mathcal{S}\) in any way except by calling the \(\mathcal{S}\) sampler. Furthermore, since the reduction can only make polynomially many queries to the sampler, there are only polynomially many options for \(\varvec{\mathbf {s}}\) for which it can generate valid LWE instances, and our efficient implementation of \(\mathcal{A}\) can just check these polynomially many options. (Note that throughout this intuitive outline we keep referring to valid Entropic LWE instances; the argument actually fails without a proper notion of validity, as will be explained below.)

Concretely, we will simulate the adversary using a stateful procedure, i.e. one that keeps state. However, in the eyes of the reduction this will simulate the original stateless adversary and therefore will suffice for our argument. We will simulate \(\mathcal{S}\) using “lazy sampling”. Whenever the reduction makes a call to \(\mathcal{S}\), we just sample a fresh random string \(\varvec{\mathbf {s}}\) and save the new sample in the simulator’s internal state. When a query \((\mathbf {{A}}, \varvec{\mathbf {y}})\) to \(\mathcal{A}\) is made, we first check that \(\mathbf {{A}}\) is indeed full rank (otherwise return \(\bot \)), and if it is the case, go over all vectors \(\varvec{\mathbf {s}}^*\) that we generated so far (and are stored in the state), and check whether \(\varvec{\mathbf {y}} -\varvec{\mathbf {s}}^* \mathbf {{A}} \pmod {q}\) is short (in the same sense as above, i.e. of the length that we expect from noise with Gaussian parameter \(\sigma \)). If this is the case then a random such \(\varvec{\mathbf {s}}^*\) is returned as the Entropic LWE answer. If the scan did not reveal any adequate candidate, then return \(\bot \).

We want to argue that the above simulates the stateless process. The first step is to show that if there is no suitable \(\varvec{\mathbf {s}}^*\) in the state and thus our simulated adversary returns \(\bot \), then the inefficient adversary would also have returned \(\bot \) with all but negligible probability. The second step is to show that when our simulated adversary does return a value \(\varvec{\mathbf {s}}^*\), this \(\varvec{\mathbf {s}}^*\) is a value that the reduction already received as a response to an \(\mathcal{S}\) query, and that only one such \(\varvec{\mathbf {s}}^*\) exists. In fact, both of these concerns boil down to properly defining a notion of validity of the Entropic LWE instance.

To this end, we notice that the original inefficient adversary returns a non-\(\bot \) value only on instances where \(\mathbf {{A}}\) is full rank, and there exist a short \(\varvec{\mathbf {e}}^*\) and a value \(\varvec{\mathbf {s}}^*\) in the support of \(\mathcal{S}\) such that \(\varvec{\mathbf {y}} = \varvec{\mathbf {s}}^*\mathbf {{A}}+\varvec{\mathbf {e}}^*\). We will prove that it is not possible to find an instance which is valid for an \(\varvec{\mathbf {s}}\) in the support of \(\mathcal{S}\) that has not been seen by the reduction. This addresses both concerns, and can be proven since the unseen elements of \(\mathcal{S}\) are just randomly sampled strings, so we can think of these vectors as sampled after the matrix \(\mathbf {{A}}\) is determined. The probability that a random vector \(\varvec{\mathbf {s}}\) is such that \(\varvec{\mathbf {y}} - \varvec{\mathbf {s}} \mathbf {{A}}\) is \(\sigma \)-short, where \(\varvec{\mathbf {y}}\) is arbitrary and \(\mathbf {{A}}\) is full rank, is roughly \((\sigma /q)^n\). This translates to the min-entropy \(\log K\) of \(\mathcal{S}\) being as large as (roughly) \(n \log (q/\sigma )\) while still allowing us to apply the union bound. The result thus follows.
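
Below is a schematic Python sketch of this stateful simulation. The rank test, the concrete shortness threshold and all names are illustrative simplifications of the validity notion discussed above, not the formal definitions.

```python
# Schematic stateful simulator for the sampler S and the adversary A in the barrier argument.
import numpy as np

class Simulator:
    def __init__(self, n, m, q, sigma, rng):
        self.n, self.m, self.q, self.sigma = n, m, q, sigma
        self.rng = rng
        self.seen = []                                   # lazily sampled outputs of S

    def sample_S(self):
        s = self.rng.integers(0, self.q, size=self.n)    # fresh uniform string, recorded in the state
        self.seen.append(s)
        return s

    def adversary(self, A, y):
        if np.linalg.matrix_rank(A) < self.n:            # stand-in for the full-rank check
            return None                                  # None plays the role of "bot"
        candidates = []
        for s in self.seen:                              # only secrets the reduction already queried
            r = (y - s @ A) % self.q
            r = np.minimum(r, self.q - r)                # centered magnitudes of y - s*A mod q
            if np.linalg.norm(r) <= 2 * self.sigma * np.sqrt(self.m):   # ad hoc shortness test
                candidates.append(s)
        if not candidates:
            return None
        return candidates[self.rng.integers(len(candidates))]

sim = Simulator(n=8, m=16, q=3329, sigma=3.0, rng=np.random.default_rng(5))
s = sim.sample_S()                                       # a reduction's query to the sampler S
```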

Perhaps somewhat interestingly, while our security proofs for entropic LWE are technically similar to converse coding theorems [Sha48, W+59], our barrier result resembles the random coding arguments used to prove the coding theorem [Sha48, Sha49].

2 Preliminaries

We will denote the security parameter by \(\lambda \). We say a function \(\nu (\lambda )\) is negligible if \(\nu (\lambda ) \in \lambda ^{- \omega (1)}\). We will generally denote row vectors by \(\varvec{\mathbf {x}}\) and column vectors by \(\varvec{\mathbf {x}}^\top \). We will denote the \(L_2\) norm of a vector \(\varvec{\mathbf {x}}\) by \(\Vert \varvec{\mathbf {x}} \Vert = \sqrt{\sum _i x_i^2}\) and the \(L_\infty \) norm by \(\Vert \varvec{\mathbf {x}} \Vert _\infty = \max _i |x_i|\).

We denote by \({\mathbb T}_q = {\mathbb R}/ q{\mathbb Z}\) the real torus of scale q. We can embed \({\mathbb Z}_q = {\mathbb Z}/ q{\mathbb Z}\) into \({\mathbb T}_q\) in the natural way. \({\mathbb T}_q\) is an abelian group and therefore a \({\mathbb Z}\)-module. Thus multiplication of vectors from \({\mathbb T}_q^n\) with \({\mathbb Z}\)-matrices is well-defined. \({\mathbb T}_q\) is however not a \({\mathbb Z}_q\)-module. We will represent \({\mathbb T}_q\) elements by their central residue class representation in \([-q/2,q/2)\).

For a continuous random variable \(\varvec{\mathbf {x}}\), we will denote the probability-density function of \(\varvec{\mathbf {x}}\) by \(p_{\varvec{\mathbf {x}}}(\cdot )\). We will denote the probability density of \(\varvec{\mathbf {x}}\) conditioned on an event E by \(p_{\varvec{\mathbf {x}} | E}(\cdot )\). Let X, Y be two discrete random variables defined on a common support \(\mathcal {X}\). We define the statistical distance between X and Y as \(\varDelta (X,Y) = \sum _{x \in \mathcal {X}} |\Pr [X = x] - \Pr [Y = x]|\). Likewise, if X and Y are two continuous random variables defined on a measurable set \(\mathcal {X}\), we define the statistical distance between X and Y as \(\varDelta (X,Y) = \int _{x \in \mathcal {X}} |p_X(x) - p_Y(x)|\).

Random Matrices. Let p be a prime modulus. Let \(\mathbf {A} \leftarrow _\$\mathbb {Z}_p^{n \times m}\) be chosen uniformly at random. Then the probability that \(\mathbf {A}\) is not invertible (i.e. does not have an invertible column-submatrix) is

$$ \Pr [\mathbf {A} \text { not invertible}] = 1 - \prod _{i = 0}^{n-1} (1 - p^{i - m}) \le p^{n - m}. $$

For an arbitrary modulus q, a matrix \(\mathbf {A}\) is invertible if and only if it is invertible modulo all prime factors \(p_i\) of q. As we can bound the number of distinct prime factors of q by \(\log (q)\), we get for an \(\mathbf {A} \leftarrow _\$\mathbb {Z}_q^{n \times m}\) that

$$ \Pr [\mathbf {A} \text { not invertible}] \le \log (q) \cdot 2^{n - m}. $$
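
As a sanity check, the following sketch estimates the failure probability empirically for tiny parameters and a prime modulus; the Gaussian elimination helper over GF(p) is ad hoc, and all values are illustrative.

```python
# Monte Carlo check of Pr[A not invertible] <= p^(n-m) for a random n x m matrix over GF(p).
import numpy as np

def rank_mod_p(M, p):
    # ad hoc Gaussian elimination over GF(p)
    M = M.copy() % p
    r = 0
    for c in range(M.shape[1]):
        piv = next((i for i in range(r, M.shape[0]) if M[i, c] % p != 0), None)
        if piv is None:
            continue
        M[[r, piv]] = M[[piv, r]]
        M[r] = (M[r] * pow(int(M[r, c]), -1, p)) % p
        for i in range(M.shape[0]):
            if i != r and M[i, c] != 0:
                M[i] = (M[i] - M[i, c] * M[r]) % p
        r += 1
        if r == M.shape[0]:
            break
    return r

rng = np.random.default_rng(6)
p, n, m, trials = 3, 4, 6, 20000
fails = sum(rank_mod_p(rng.integers(0, p, (n, m)), p) < n for _ in range(trials))
print("empirical:", fails / trials, " bound p^(n-m):", p ** (n - m))
```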

2.1 Min-entropy

Let \(\varvec{\mathbf {x}}\) be a discrete random variable supported on a set X and \(\varvec{\mathbf {z}}\) be a (possibly continuous) random variable supported on a (measurable) set Z. The conditional min-entropy \(\tilde{H}_\infty (\varvec{\mathbf {x}} | \varvec{\mathbf {z}})\) of \(\varvec{\mathbf {x}}\) given \(\varvec{\mathbf {z}}\) is defined by

$$\begin{aligned} \tilde{H}_\infty (\varvec{\mathbf {x}} | \varvec{\mathbf {z}})&= - \log \left( \mathsf {E}_{\varvec{\mathbf {z}}'} \left[ \max _{\varvec{\mathbf {x}}' \in X} \Pr [\varvec{\mathbf {x}} = \varvec{\mathbf {x}}' | \varvec{\mathbf {z}} = \varvec{\mathbf {z}}'] \right] \right) . \end{aligned}$$

In the case that \(\varvec{\mathbf {z}}\) is continuous, this becomes

$$ \tilde{H}_\infty (\varvec{\mathbf {x}} | \varvec{\mathbf {z}}) = - \log \left( \int _{\varvec{\mathbf {z}}'} p_{\varvec{\mathbf {z}}}(\varvec{\mathbf {z}}') \max _{\varvec{\mathbf {x}}' \in X} \Pr [\varvec{\mathbf {x}} = \varvec{\mathbf {x}}' | \varvec{\mathbf {z}} = \varvec{\mathbf {z}}'] \, d\varvec{\mathbf {z}}' \right) , $$

where \(p_{\varvec{\mathbf {z}}}(\cdot )\) is the probability density of \(\varvec{\mathbf {z}}\).

2.2 Leftover Hashing

We recall the generalized leftover hash lemma [DORS08, Reg05].

Lemma 2.1

Let q be a modulus and let n, k be integers. Let \(\varvec{\mathbf {s}}\) be a random variable defined on \(\mathbb {Z}_q^n\) and let \(\mathbf {B} \leftarrow _\$\mathbb {Z}_q^{n \times k}\) be chosen uniformly at random. Furthermore, let Y be a random variable (possibly) correlated with \(\varvec{\mathbf {s}}\). Then it holds that

$$ \varDelta ((\mathbf {B},\varvec{\mathbf {s}}\mathbf {B},Y),(\mathbf {B},\varvec{\mathbf {u}},Y)) \le \sqrt{q^k \cdot 2^{- \tilde{H}_\infty (\varvec{\mathbf {s}} | Y)}}. $$

2.3 Gaussians

Continuous Gaussians. A matrix \(\mathbf {{\varSigma }} \in {\mathbb R}^{n \times n}\) is called positive definite, if it holds for every \(\varvec{\mathbf {x}} \in {\mathbb R}^n \backslash \{ \varvec{\mathbf {0}} \}\) that \(\varvec{\mathbf {x}} \mathbf {{\varSigma }} \varvec{\mathbf {x}}^\top > 0\). For every positive definite matrix \(\mathbf {{\varSigma }}\) there exists a unique positive definite matrix \(\sqrt{\mathbf {{\varSigma }}}\) such that \((\sqrt{\mathbf {{\varSigma }}})^2 = \mathbf {{\varSigma }}\).

For a parameter \(\sigma > 0\) define the n-dimensional gaussian function \(\rho _\sigma : {\mathbb R}^n \rightarrow (0,1]\) by

$$ \rho _\sigma (\varvec{\mathbf {x}}) = e^{- \pi \Vert \varvec{\mathbf {x}} \Vert ^2 / \sigma ^2}. $$

For a positive definite matrix \(\mathbf {{\varSigma }} \in {\mathbb R}^{n \times n}\), define the function \(\rho _{\sqrt{\mathbf {{\varSigma }}}}: {\mathbb R}^n \rightarrow (0,1]\) by

$$ \rho _{\sqrt{\mathbf {{\varSigma }}}}(\varvec{\mathbf {x}}) := e^{- \pi \varvec{\mathbf {x}} \mathbf {{\varSigma }}^{-1} \varvec{\mathbf {x}}^\top }. $$

For a scalar \(\sigma > 0\), we will define

$$ \rho _{\sigma }(\varvec{\mathbf {x}}) := \rho _{\sigma \cdot \mathbf {{I}}}(\varvec{\mathbf {x}}) = e^{- \pi \Vert \varvec{\mathbf {x}}\Vert ^2 / \sigma ^2}. $$

The total measure of \(\rho _{\sqrt{\mathbf {{\varSigma }}}}\) over \({\mathbb R}^n\) is

$$ \rho _{\sqrt{\mathbf {{\varSigma }}}}({\mathbb R}^n) = \int _{{\mathbb R}^n} \rho _{\sqrt{\mathbf {{\varSigma }}}}(\varvec{\mathbf {x}}) d\varvec{\mathbf {x}} = \sqrt{\det (\mathbf {{\varSigma }})}. $$

In the scalar case this becomes

$$ \rho _{\sigma }({\mathbb R}^n) = \int _{{\mathbb R}^n} \rho _{\sigma }(\varvec{\mathbf {x}}) d\varvec{\mathbf {x}} = \sigma ^n. $$

Normalizing \(\rho _{\sqrt{\mathbf {{\varSigma }}}}\) by \(\rho _{\sqrt{\mathbf {{\varSigma }}}}({\mathbb R}^n)\) yields the probability density for the continuous gaussian distribution \(D_{\sqrt{\mathbf {{\varSigma }}}}\) over \({\mathbb R}^n\).

For a discrete set \(S \subseteq {\mathbb R}^n\) we define \(\rho _{\sqrt{\mathbf {{\varSigma }}}}(S)\) by

$$ \rho _{\sqrt{\mathbf {{\varSigma }}}}(S) := \sum _{\varvec{\mathbf {s}} \in S} \rho _{\sqrt{\mathbf {{\varSigma }}}}(\varvec{\mathbf {s}}). $$

In particular, for an integer q we have

$$ \rho _{\sqrt{\mathbf {{\varSigma }}}}(q \mathbb {Z}^n) = \sum _{\varvec{\mathbf {z}} \in q \mathbb {Z}^n} \rho _{\sqrt{\mathbf {{\varSigma }}}}(\varvec{\mathbf {z}}). $$

For a gaussian \(x \sim D_\sigma \) we get the tail-bound

$$ \Pr [ |x| \ge t] \le 2\cdot e^{- \frac{t^2}{2 \sigma ^2}}. $$

As a simple consequence we get \(\Pr [ |x| \ge (\log (\lambda )) \cdot \sigma ] \le \mathsf {negl}(\lambda )\).
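
The following one-off numerical check illustrates the tail bound, treating \(\sigma \) as the standard deviation of the sample (under the convention above, where the parameter \(\sigma \) of \(D_\sigma \) corresponds to an even smaller standard deviation, the bound only becomes looser).

```python
# Quick empirical check of Pr[|x| >= t] <= 2*exp(-t^2 / (2*sigma^2)); illustrative only.
import numpy as np

rng = np.random.default_rng(7)
sigma, t = 2.0, 5.0
emp = np.mean(np.abs(rng.normal(0, sigma, 10**6)) >= t)
print(emp, "<=", 2 * np.exp(-t**2 / (2 * sigma**2)))
```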

Discrete Gaussians. We say a random variable x defined on \(\mathbb {Z}\) follows the discrete gaussian distribution \(D_{\mathbb {Z},\sigma }\) for a parameter \(\sigma > 0\), if the probability mass function of x is given by

$$ \Pr [x = x'] = \frac{\rho _\sigma (x')}{\rho _\sigma (\mathbb {Z})} $$

for every \(x' \in \mathbb {Z}\).

Modular Gaussians. For a modulus q, we also define the q-periodic gaussian function \(\tilde{\rho }_{q,\sqrt{\mathbf {{\varSigma }}}}: {\mathbb R}^n \rightarrow (0,\infty )\) by

$$ \tilde{\rho }_{q,\sqrt{\mathbf {{\varSigma }}}}(\varvec{\mathbf {x}}) := \sum _{\varvec{\mathbf {z}} \in q \mathbb {Z}^n} \rho _{\sqrt{\mathbf {{\varSigma }}}}(\varvec{\mathbf {x}} - \varvec{\mathbf {z}}). $$

We define \(\tilde{\rho }_{q,\sqrt{\mathbf {{\varSigma }}}}({\mathbb T}_q^n)\) by

$$ \tilde{\rho }_{q,\sqrt{\mathbf {{\varSigma }}}}({\mathbb T}_q^n) := \tilde{\rho }_{q,\sqrt{\mathbf {{\varSigma }}}}([-q/2,q/2)^n) = \int _{[-q/2,q/2)^n} \tilde{\rho }_{q,\sqrt{\mathbf {{\varSigma }}}}(\varvec{\mathbf {x}}) d\varvec{\mathbf {x}} = \rho _{\sqrt{\mathbf {{\varSigma }}}}({\mathbb R}^n). $$

Consequently, normalizing \(\tilde{\rho }_{q,\sqrt{\mathbf {{\varSigma }}}}\) by \(\tilde{\rho }_{q,\sqrt{\mathbf {{\varSigma }}}}({\mathbb T}_q^n)\) yields a probability density on \({\mathbb T}_q^n\). We call the corresponding distribution \(D_{\sqrt{\mathbf {{\varSigma }}}} \mod q\) a modular gaussian. An \(\varvec{\mathbf {x}} \sim D_{\sqrt{\mathbf {{\varSigma }}}} \mod q\) can be sampled by sampling \(\varvec{\mathbf {x}}' \leftarrow _\$D_{\sqrt{\mathbf {{\varSigma }}}}\) and computing \(\varvec{\mathbf {x}} \leftarrow \varvec{\mathbf {x}}' \mod q\).
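
A minimal sketch of this sampling procedure is given below; it again treats the Gaussian parameter as a standard deviation (ignoring the \(2\pi \) normalization of \(\rho \)) and assumes a symmetric square root of the covariance, so it is illustrative only.

```python
# Sample D_sqrt(Sigma) mod q: draw the continuous gaussian, then reduce into [-q/2, q/2).
import numpy as np

def sample_modular_gaussian(sqrt_Sigma, q, rng):
    x = rng.standard_normal(sqrt_Sigma.shape[0]) @ sqrt_Sigma   # x' with covariance sqrt_Sigma^2
    return (x + q / 2) % q - q / 2                              # central residue representation

rng = np.random.default_rng(8)
q, n, sigma = 97, 4, 5.0
print(sample_modular_gaussian(sigma * np.eye(n), q, rng))
```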

In order to prove our strong converse coding theorems, we need various upper bounds for the periodic gaussian function. We will use the following variant of the smoothing lemma of Micciancio and Regev [MR04].

Lemma 2.2

(Smoothing Lemma [MR04]). Let \(\epsilon > 0\). If \(\frac{1}{\sigma } \ge \sqrt{\frac{\ln (2n(1 + 1/\epsilon ))}{\pi }} \cdot \frac{1}{q}\), then it holds that

$$ \rho _\sigma (q \mathbb {Z}^n \backslash \{ \varvec{\mathbf {0}} \}) \le \epsilon . $$

Lemma 2.3

The periodic gaussian function \(\tilde{\rho }_{q,\sigma }\) assumes its maximum at \(q \cdot \mathbb {Z}^n\). In particular, it holds for all \(\varvec{\mathbf {x}} \in {\mathbb R}^n\) that \(\tilde{\rho }_{q,\sigma }(\varvec{\mathbf {x}}) \le \tilde{\rho }_{q,\sigma }(\varvec{\mathbf {0}})\).

See the full version [BD20] for proof.

Lemma 2.4

If \(\frac{q}{\sigma } \ge \sqrt{\frac{\ln (4n)}{\pi }}\), then it holds for all \(\varvec{\mathbf {x}} \in {\mathbb R}^n\) that

$$ \tilde{\rho }_{q,\sigma }(\varvec{\mathbf {x}}) \le 2. $$

See the full version [BD20] for proof.

We will use the following estimate for shifted gaussians.

Lemma 2.5

Let \(\sigma _2> \sigma _1 > 0\). Then it holds for all \(\varvec{\mathbf {x}} \in {\mathbb R}^n\) and \(\varvec{\mathbf {t}} \in {\mathbb R}^n\) that

$$ \rho _{\sigma _1}(\varvec{\mathbf {x}} - \varvec{\mathbf {t}}) \le e^{\pi \frac{\Vert \varvec{\mathbf {t}} \Vert ^2}{\sigma _2^2 - \sigma _1^2} } \cdot \rho _{\sigma _2}(\varvec{\mathbf {x}}). $$

Moreover, the same holds for the q-periodic gaussian function \(\tilde{\rho }_{q, \sigma _1}\), i.e.

$$ \tilde{\rho }_{q,\sigma _1}(\varvec{\mathbf {x}} - \varvec{\mathbf {t}}) \le e^{\pi \frac{\Vert \varvec{\mathbf {t}} \Vert ^2}{\sigma _2^2 - \sigma _1^2} } \cdot \tilde{\rho }_{q,\sigma _2}(\varvec{\mathbf {x}}). $$

See the full version [BD20] for proof.

2.4 Learning with Errors

The learning with errors (LWE) problem was defined by Regev [Reg05]. The search problem \(\mathsf {LWE}(n,m,q,\chi )\), for \(n,m,q\in \mathbb {N}\) and for a distribution \(\chi \) supported over the torus \({\mathbb T}_q\), is to find \(\varvec{\mathbf {s}}\) given \((\mathbf {A}, \varvec{\mathbf {s}}\mathbf {A}+\varvec{\mathbf {e}})\), where \(\mathbf {A} \leftarrow _\$\mathbb {Z}_q^{n \times m}\) and \(\varvec{\mathbf {s}} \leftarrow _\$\mathbb {Z}_q^n\) are chosen uniformly at random and \(\varvec{\mathbf {e}} \leftarrow _\$\chi ^m\) is chosen according to \(\chi ^m\). The decisional version \(\mathsf {dLWE}(n,m,q,\chi )\) asks to distinguish between the distributions \((\mathbf {A}, \varvec{\mathbf {s}}\mathbf {A}+\varvec{\mathbf {e}})\) and \((\mathbf {A}, \varvec{\mathbf {u}} + \varvec{\mathbf {e}})\), where \(\mathbf {A}\), \(\varvec{\mathbf {s}}\) and \(\varvec{\mathbf {e}}\) are as in the search version and \(\varvec{\mathbf {u}} \leftarrow _\$\mathbb {Z}_q^m\) is chosen uniformly at random. We also consider the hardness of solving \(\mathsf {dLWE}\) for any \(m=\mathsf {poly}(n \log q)\). This problem is denoted \(\mathsf {dLWE}(n,q,\chi )\). The matrix version of this problem asks to distinguish \((\mathbf {A},\mathbf {S}\cdot \mathbf {A} + \mathbf {E})\) from \((\mathbf {A},\mathbf {U})\), where \(\mathbf {S} \leftarrow _\$\mathbb {Z}_q^{k \times n}\), \(\mathbf {E} \leftarrow _\$\chi ^{k \times m}\) and \(\mathbf {U} \leftarrow _\$\mathbb {Z}_q^{k \times m}\). The hardness of the matrix version for any \(k = \mathsf {poly}(n)\) can be established from \(\mathsf {dLWE}(n,m,q,\chi )\) via a routine hybrid argument. Moreover, Applebaum et al. [ACPS09] showed that if the error-distribution \(\chi \) is supported on \(\mathbb {Z}_q\), then the matrix \(\mathbf {S}\) can also be chosen from \(\chi ^{k \times n}\) without affecting the hardness of the problem.

As shown in [Reg05], the \(\mathsf {LWE}(n,q,\chi )\) problem with \(\chi \) being a continuous Gaussian distribution with parameter \(\sigma = \alpha q \ge 2 \sqrt{n}\) is at least as hard as approximating the shortest independent vector problem (\(\mathsf {SIVP}\)) to within a factor of \(\gamma = \tilde{O}({n}/\alpha )\) in worst case dimension n lattices. This is proven using a quantum reduction. Classical reductions (to a slightly different problem) exist as well [Pei09, BLP+13] but with somewhat worse parameters. The best known (classical or quantum) algorithms for these problems run in time \(2^{\tilde{O}(n/\log \gamma )}\), and in particular they are conjectured to be intractable for \(\gamma = \mathsf {poly}(n)\).

Regev also provided a search-to-decision reduction which bases the hardness of the decisional problem \(\mathsf {dLWE}(n,q,\chi )\) on the search version \(\mathsf {LWE}(n,q,\chi )\) whenever q is a prime of polynomial size. This reduction has been generalized to more general classes of moduli [Pei09, BLP+13]. Moreover, there exists a sample-preserving reduction [MM11] which bases the hardness of \(\mathsf {dLWE}(n,m,q,\chi )\) on \(\mathsf {LWE}(n,m,q,\chi )\) for certain moduli q without affecting the number of samples m.

Finally, Peikert [Pei10] provided a randomized rounding algorithm which allows us to base the hardness of \(\mathsf {LWE}(n,m,q,D_{\mathbb {Z},\sigma '})\) (i.e. LWE with a discrete gaussian error \(D_{\mathbb {Z},\sigma '}\)) on \(\mathsf {LWE}(n,m,q,D_{\sigma })\) (continuous gaussian error), where \(\sigma '\) is only slightly larger than \(\sigma \).

2.5 Entropic LWE

We will now consider LWE with entropic secrets, entropic LWE for short. In this variant, we allow the distribution of secrets \(\mathcal {S}\) to be chosen from a family of distributions \(\bar{\mathcal {S}}= \{ \mathcal {S} _i \}_i\). This captures the idea that the distribution of secrets can be worst-case within a certain family.

Definition 2.6

(Entropic LWE). Let \(q = q(\lambda )\) be a modulus and \(n,m = \mathsf {poly}(\lambda )\). Let \(\chi \) be an error-distribution on \({\mathbb T}_q\). Let \(\bar{\mathcal {S}} = \bar{\mathcal {S}}(\lambda ,q,n,m)\) be a family of distributions on \(\mathbb {Z}_q^n\). We say that the search problem \(\mathsf {ent\text {-}LWE}(q,n,m,\bar{\mathcal {S}},\chi )\) is hard, if it holds for every PPT adversary \(\mathcal {A}\) and every \(\mathcal {S} \in \bar{\mathcal {S}}\) that

$$ \Pr [\mathcal {A}(1^\lambda ,\mathbf {A},\varvec{\mathbf {s}}\cdot \mathbf {A} + \varvec{\mathbf {e}}) = \varvec{\mathbf {s}}] \le \mathsf {negl}(\lambda ), $$

where \(\mathbf {A} \leftarrow _\$\mathbb {Z}_q^{n \times m}\), \(\varvec{\mathbf {s}} \leftarrow _\$\mathcal {S}\) and \(\varvec{\mathbf {e}} \leftarrow _\$\chi ^m\). Likewise, we say that the decisional problem \(\mathsf {ent\text {-}dLWE}(q,n,m,\bar{\mathcal {S}},\chi )\) is hard, if it holds for every PPT distinguisher \(\mathcal {D}\) and every \(\mathcal {S} \in \bar{\mathcal {S}}\) that

$$ |\Pr [\mathcal {D}(1^\lambda ,\mathbf {A},\varvec{\mathbf {s}}\mathbf {A} + \varvec{\mathbf {e}}) = 1] - \Pr [\mathcal {D}(1^\lambda ,\mathbf {A},\varvec{\mathbf {u}} + \varvec{\mathbf {e}}) = 1]| \le \mathsf {negl}(\lambda ), $$

where \(\mathbf {A} \leftarrow _\$\mathbb {Z}_q^{n \times m}\), \(\varvec{\mathbf {s}} \leftarrow _\$\mathcal {S}\), \(\varvec{\mathbf {e}} \leftarrow _\$\chi ^m\) and \(\varvec{\mathbf {u}} \leftarrow _\$\mathbb {Z}_q^m\).

3 Probability-Theoretic Tools

3.1 Singular Values of Discrete Gaussian Matrices

Consider a real valued matrix \(\mathbf {{A}} \in {\mathbb R}^{n\times m}\), assume for convenience that \(m \ge n\). The singular values of \(\mathbf {{A}}\) are the square roots of the eigenvalues of the positive semidefinite (PSD) matrix \(\mathbf {{A}}\mathbf {{A}}^\top \). They are denoted \(\sigma _1(\mathbf {{A}}) \ge \cdots \ge \sigma _n(\mathbf {{A}}) \ge 0\). The spectral norm of \(\mathbf {{A}}\) is \(\sigma _1(\mathbf {{A}})\), and we will also denote it by \(\sigma _A\). It holds that

$$ \sigma _{\mathbf {{A}}} = \sigma _1(\mathbf {{A}}) = \max _{\varvec{\mathbf {x}} \in {\mathbb R}^m \setminus \{\varvec{\mathbf {0}}\}} \frac{\left\| \mathbf {{A}}\varvec{\mathbf {x}}\right\| }{\left\| \varvec{\mathbf {x}}\right\| }{.} $$

We will be interested in the spectral norm of discrete Gaussian matrices.

Proposition 3.1

([MP12, Lemma 2.8, 2.9]). Let \(\mathbf {{F}} \sim D_{{\mathbb Z}, \gamma }^{n \times m}\), assume for convenience that \(m \ge n\). Then with all but \(2^{-m}\) probability it holds that \(\sigma _{\mathbf {{F}}} \le \gamma \cdot C \cdot \sqrt{m}\), where C is a global constant.
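
A quick empirical look at this bound is sketched below; the rounded normal matrix stands in for a discrete Gaussian sample and the printed ratio is only a rough estimate of the constant C.

```python
# Empirically, the spectral norm of an n x m (rounded) Gaussian matrix scales like gamma*sqrt(m).
import numpy as np

rng = np.random.default_rng(9)
n, m, gamma, trials = 32, 128, 3.0, 50
norms = [np.linalg.norm(np.rint(rng.normal(0, gamma, (n, m))), 2) for _ in range(trials)]
print("max sigma_F / (gamma*sqrt(m)) over", trials, "trials:", max(norms) / (gamma * np.sqrt(m)))
```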

3.2 Decomposition Theorem for Continuous Gaussians

The following proposition is an immediate corollary of the properties of (continuous) Gaussian vectors. We provide a proof for the sake of completeness.

Proposition 3.2

Let \(\mathbf {{F}} \in {\mathbb Z}^{n \times m}\) be an arbitrary matrix with spectral norm \(\sigma _F\). Let \(\sigma , \sigma _1 >0\) be s.t. \(\sigma > \sigma _1 \cdot \sigma _F\). Let \(\varvec{\mathbf {e}}_1 \sim D^n_{\sigma _1}\) and let \(\varvec{\mathbf {e}}_2 \sim D_{\sqrt{\mathbf {{\varSigma }}}}\) for \(\mathbf {{\varSigma }} = \sigma ^2 \mathbf {{I}} - \sigma _1^2 \mathbf {{F}}^\top \mathbf {{F}}\). Then the random variable \(\varvec{\mathbf {e}} = \varvec{\mathbf {e}}_1 \mathbf {{F}} + \varvec{\mathbf {e}}_2\) is distributed according to \(D^m_{\sigma }\).

Proof

First note that \(\varSigma \) is positive definite: It holds for any \(\varvec{\mathbf {x}} \in \mathbb {R}^m \backslash \{ \varvec{\mathbf {0}} \}\) that

$$ \varvec{\mathbf {x}} \mathbf {{\varSigma }} \varvec{\mathbf {x}}^\top = \sigma ^2 \Vert \varvec{\mathbf {x}} \Vert ^2 - \sigma _1^2 \Vert \varvec{\mathbf {x}} \mathbf {F}^\top \Vert ^2 \ge \sigma ^2 \Vert \varvec{\mathbf {x}} \Vert ^2 - \sigma _1^2 \sigma _{\mathbf {F}}^2 \Vert \varvec{\mathbf {x}} \Vert ^2 = (\sigma ^2 - \sigma _1^2 \sigma _{\mathbf {F}}^2) \cdot \Vert \varvec{\mathbf {x}} \Vert ^2 > 0, $$

as \(\sigma > \sigma _1 \cdot \sigma _{\mathbf {F}}\). Since \(\varvec{\mathbf {e}}_1, \varvec{\mathbf {e}}_2\) are independent Gaussian vectors, they are also jointly Gaussian, and therefore \(\varvec{\mathbf {e}}\) is also a Gaussian vector. Since \(\varvec{\mathbf {e}}_1, \varvec{\mathbf {e}}_2\) have expectation 0, then so does \(\varvec{\mathbf {e}}\). The covariance matrix of \(\varvec{\mathbf {e}}\) is given by a direct calculation, recalling that \(\varvec{\mathbf {e}}_1, \varvec{\mathbf {e}}_2\) are independent:

$$\begin{aligned} \mathop {{\mathbb E}}[\varvec{\mathbf {e}}^\top \varvec{\mathbf {e}}]&= \mathop {{\mathbb E}}[\mathbf {{F}}^\top \varvec{\mathbf {e}}_1^\top \varvec{\mathbf {e}}_1 \mathbf {{F}}] + \mathop {{\mathbb E}}[\varvec{\mathbf {e}}_2^\top \varvec{\mathbf {e}}_2]\\&= \mathbf {{F}}^\top \sigma _1^2 \mathbf {{I}} \mathbf {{F}} + \mathbf {{\varSigma }}\\&= \sigma _1^2 \mathbf {{F}}^\top \mathbf {{F}} + \sigma ^2 \mathbf {{I}} - \sigma _1^2 \mathbf {{F}}^\top \mathbf {{F}}\\&= \sigma ^2 \mathbf {{I}}{,} \end{aligned}$$

and the statement follows.   \(\square \)

4 Hardness of Entropic LWE with Gaussian Noise

In this Section we will establish our main result, the hardness of entropic search LWE with continuous gaussian noise. Using standard techniques, we can conclude that entropic search LWE with discrete gaussian noise is also hard. Finally, for suitable moduli, a search-to-decision reduction can be used to establish the hardness of entropic decisional LWE.

Theorem 4.1

Let C be the global constant from Proposition 3.1. Let \(q = q(\lambda )\) be a modulus and \(n,m = \mathsf {poly}(\lambda )\) where \(m \ge n\), and let \(r, \gamma , \sigma _1 > 0\). Let \(\varvec{\mathbf {s}}\) be a random variable on \(\mathbb {Z}_q^n\) distributed according to some distribution \(\mathcal {S}\). Let \(\varvec{\mathbf {e}}_1 \sim D_{\sigma _1} \mod q\) be an error term. Assume that \(\varvec{\mathbf {s}}\) is \(r\)-bounded, where we assume that \(r= q\) if no bound for \(\varvec{\mathbf {s}}\) is known. Further assume that

$$ \tilde{H}_\infty (\varvec{\mathbf {s}} | \varvec{\mathbf {s}} + \varvec{\mathbf {e}}_1) \ge k \cdot \log (\min \{2 C \cdot \gamma \cdot \sqrt{n} r, q\}) + \omega (\log (\lambda )) $$

Let \(\sigma > C \cdot \sqrt{m} \cdot \gamma \cdot \sigma _1\). Then the search problem \(\mathsf {ent\text {-}LWE}(q,n,m,\mathcal {S},D_\sigma )\) is hard, provided that \(\mathsf {dLWE}(q,k,D_{\mathbb {Z},\gamma })\) is hard.

Furthermore, if \(\tilde{H}_\infty (\varvec{\mathbf {s}} | \varvec{\mathbf {s}} + \varvec{\mathbf {e}}_1) \ge k \cdot \log (q) + \omega (\log (\lambda ))\) and we have that either q is prime or \(\varvec{\mathbf {s}} \in \{0,1\}^n\), then the decisional problem \(\mathsf {ent\text {-}dLWE}(q,n,m,\mathcal {S},D_\sigma )\) is hard, provided that \(\mathsf {dLWE}(q,k,D_{\mathbb {Z},\gamma })\) and \(\mathsf {dLWE}(q,k,m,D_\sigma )\) are hard.

See the full version [BD20] for proof.

5 Noise-Lossiness for Modular Gaussians

In this Section, we will compute the noise lossiness for general high min-entropy distributions. We further show that considerable improvements can be achieved when considering short distributions. Our Lemmas in this Section can be seen as strong converse coding theorems for gaussian channels: if a distribution codes above a certain information rate, then information must be lost, and the noise lossiness quantifies how much information is lost. The following lemma will allow us to bound \(\tilde{H}_\infty (\varvec{\mathbf {s}} | \varvec{\mathbf {s}} + \varvec{\mathbf {e}})\) by suitably bounding \(\max _{\varvec{\mathbf {s}}^*} p_{\varvec{\mathbf {e}}}(\varvec{\mathbf {y}} - \varvec{\mathbf {s}}^*)\).

Lemma 5.1

Let \(q \in \mathbb {N}\) be a modulus and fix \(n,m \in \mathbb {N}\) with \(m > n\). Let \(\varvec{\mathbf {s}}\) be a random variable on \(\mathbb {Z}_q^n\) with min-entropy \(\tilde{H}_\infty (\varvec{\mathbf {s}})\). Let \(\chi \) be a noise distribution over \({\mathbb R}^n\) and let \(\varvec{\mathbf {e}} \sim \chi \). Then it holds that

$$ \tilde{H}_\infty (\varvec{\mathbf {s}} | \varvec{\mathbf {s}} + \varvec{\mathbf {e}}) \ge \tilde{H}_\infty (\varvec{\mathbf {s}}) - \log \left( \int _{\varvec{\mathbf {y}}} \max _{\varvec{\mathbf {s}}^*} p_{\varvec{\mathbf {e}}}(\varvec{\mathbf {y}} - \varvec{\mathbf {s}}^*) d\varvec{\mathbf {y}}\right) $$

in the case that \(\chi \) is continuous and

$$ \tilde{H}_\infty (\varvec{\mathbf {s}} | \varvec{\mathbf {s}} + \varvec{\mathbf {e}}) \ge \tilde{H}_\infty (\varvec{\mathbf {s}}) - \log \left( \sum _{\varvec{\mathbf {y}}} \max _{\varvec{\mathbf {s}}^*} \Pr _{\varvec{\mathbf {e}}}[\varvec{\mathbf {e}} = \varvec{\mathbf {y}} - \varvec{\mathbf {s}}^*]\right) $$

in the case that \(\chi \) is discrete. Moreover, if \(\varvec{\mathbf {s}}\) is a flat distribution then equality holds.

Proof

The lemma follows from the following derivation in the continuous case. The discrete case follows analogously.

$$\begin{aligned} \tilde{H}_\infty (\varvec{\mathbf {s}} | \varvec{\mathbf {s}} + \varvec{\mathbf {e}})&= -\log \left( \mathop {{\mathbb E}}_{\varvec{\mathbf {y}}}[ \max _{\varvec{\mathbf {s}}^*\in S} \Pr _{\varvec{\mathbf {s}}, \varvec{\mathbf {e}}}[\varvec{\mathbf {s}} = \varvec{\mathbf {s}}^*| \varvec{\mathbf {s}} + \varvec{\mathbf {e}} = \varvec{\mathbf {y}}]] \right) \\&= -\log \left( \int _{\varvec{\mathbf {y}}} p_{\varvec{\mathbf {s}} + \varvec{\mathbf {e}}}(\varvec{\mathbf {y}}) \cdot \max _{\varvec{\mathbf {s}}^*} \Pr _{\varvec{\mathbf {s}}, \varvec{\mathbf {e}}}[\varvec{\mathbf {s}} = \varvec{\mathbf {s}}^*| \varvec{\mathbf {s}} + \varvec{\mathbf {e}} = \varvec{\mathbf {y}}] d\varvec{\mathbf {y}} \right) \\&= -\log \left( \int _{\varvec{\mathbf {y}}} \max _{\varvec{\mathbf {s}}^*} p_{\varvec{\mathbf {s}}, \varvec{\mathbf {s}} + \varvec{\mathbf {e}}}(\varvec{\mathbf {s}}^*,\varvec{\mathbf {y}}) d\varvec{\mathbf {y}} \right) \\&= -\log \left( \int _{\varvec{\mathbf {y}}} \max _{\varvec{\mathbf {s}}^*} p_{\varvec{\mathbf {s}} + \varvec{\mathbf {e}} | \varvec{\mathbf {s}} = \varvec{\mathbf {s}}^*}(\varvec{\mathbf {y}}) \cdot \underbrace{\Pr [\varvec{\mathbf {s}} = \varvec{\mathbf {s}}^*]}_{\le 2^{-\tilde{H}_\infty (\varvec{\mathbf {s}})}} d\varvec{\mathbf {y}}\right) \\&\ge \tilde{H}_\infty (\varvec{\mathbf {s}}) - \log \left( \int _{\varvec{\mathbf {y}}} \max _{\varvec{\mathbf {s}}^*} p_{\varvec{\mathbf {e}}}(\varvec{\mathbf {y}} - \varvec{\mathbf {s}}^*) d\varvec{\mathbf {y}}\right) {.} \end{aligned}$$

To see that equality holds for flat distributions, note that in this case we have \(\Pr [\varvec{\mathbf {s}} = \varvec{\mathbf {s}}^*] = 2^{-\tilde{H}_\infty (\varvec{\mathbf {s}})}\) for every \(\varvec{\mathbf {s}}^*\) in the support of \(\varvec{\mathbf {s}}\), so the last inequality holds with equality.
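To illustrate the lemma, the following toy computation (one-dimensional, with parameters made up for this sketch) evaluates both sides exactly for a flat secret and a small bounded noise distribution; the maximum is taken over the support of \(\varvec{\mathbf {s}}\), matching the first line of the derivation. With these parameters the secret is fully determined by \(\varvec{\mathbf {s}} + \varvec{\mathbf {e}}\), and the bound is tight.

```python
# Toy one-dimensional illustration of Lemma 5.1 (all parameters are made up).
from fractions import Fraction
from math import log2

q = 16
S = [0, 4, 8, 12]                                     # support of a flat secret distribution
ps = {s: Fraction(1, len(S)) for s in S}              # Pr[s = s*]
pe = {-1: Fraction(1, 3), 0: Fraction(1, 3), 1: Fraction(1, 3)}   # 1-bounded noise

def pr_e(x):
    """Probability that the noise equals x modulo q."""
    return next((p for v, p in pe.items() if (x - v) % q == 0), Fraction(0))

# exact conditional min-entropy: -log E_y[ max_{s*} Pr[s = s*, s + e = y] ]
exact = -log2(sum(max(ps[s] * pr_e(y - s) for s in S) for y in range(q)))

# lower bound of Lemma 5.1, with the maximum taken over the support of s
bound = log2(len(S)) - log2(sum(max(pr_e(y - s) for s in S) for y in range(q)))

print(exact, bound)   # both are zero: the gaps in S exceed the noise width, so s + e reveals s
```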

5.1 General High Entropy Secrets

We first turn to the case of general high-entropy secrets and prove the following lemma.

Lemma 5.2

Let n be an integer, let q be a modulus, and let \(\sigma _1 > 0\) be a Gaussian parameter. Assume that

$$ \frac{q}{\sigma _1} \ge \sqrt{\frac{\ln (4n)}{\pi }}. $$

Let \(\varvec{\mathbf {s}}\) be a random variable on \(\mathbb {Z}_q^n\) and \(\varvec{\mathbf {e}}_1 \sim D_{\sigma _1} \mod q\). Then it holds that

$$ \tilde{H}_\infty (\varvec{\mathbf {s}} | \varvec{\mathbf {s}} + \varvec{\mathbf {e}}_1) \ge \tilde{H}_\infty (\varvec{\mathbf {s}}) - n \cdot \log (q / \sigma _1) - 1 $$

We remark that the requirement \(\frac{q}{\sigma _1} \ge \sqrt{\frac{\ln (4n)}{\pi }}\) is made for technical reasons; we impose it only to keep the proof simple. We also remark that this condition is essentially trivially fulfilled by interesting parameter choices.

We can instantiate Theorem 4.1 with Lemma 5.2, obtaining the following corollary.

Corollary 5.3

Let C be a global constant. Let \(q = q(\lambda )\) be a modulus and let \(n,m,k = \mathsf {poly}(\lambda )\). Let \(\gamma , \sigma _1 > 0\). Assume that \(\mathcal {S}\) is a distribution on \(\mathbb {Z}_q^n\) with \(\tilde{H}_\infty (\varvec{\mathbf {s}}) > k \cdot \log (q) + n \cdot \log (q/\sigma _1) + \omega (\log (\lambda ))\). Now let \(\sigma > C \cdot \sqrt{m} \cdot \gamma \sigma _1\). Then \(\mathsf {ent\text {-}LWE}(q,n,m,\mathcal {S},D_\sigma )\) is hard, provided that \(\mathsf {dLWE}(q,k,D_{\mathbb {Z},\gamma })\) is hard.

Proof

(of Lemma 5.2). It holds that

$$\begin{aligned} \int _{\varvec{\mathbf {y}}} \max _{\varvec{\mathbf {s}}^*} p_{\varvec{\mathbf {e}}}(\varvec{\mathbf {y}} - \varvec{\mathbf {s}}^*) d\varvec{\mathbf {y}}&= \frac{1}{\rho _{\sigma _1}({\mathbb R}^n)} \int _{\varvec{\mathbf {y}}} \max _{\varvec{\mathbf {s}}^*} \hat{\rho }_{q \mathbb {Z}^n, \sigma _1}(\varvec{\mathbf {y}} - \varvec{\mathbf {s}}^*) d\varvec{\mathbf {y}}\\&\le \frac{1}{\rho _{\sigma _1}({\mathbb R}^n)} \cdot \int _{\varvec{\mathbf {y}}} 2 d\varvec{\mathbf {y}}\\&= 2 \cdot \frac{q^n}{\rho _{\sigma _1}({\mathbb R}^n)}\\&= 2 \cdot \frac{q^n}{\sigma _1^n}, \end{aligned}$$

where the bound \(\hat{\rho }_{q \mathbb {Z}^n, \sigma _1}(\varvec{\mathbf {y}} - \varvec{\mathbf {s}}^*) \le 2\) follows from Lemma 2.4, as \(\frac{q}{\sigma _1} \ge \sqrt{\frac{\ln (4n)}{\pi }}\). We can conclude by Lemma 5.1 that

$$\begin{aligned} \tilde{H}_\infty (\varvec{\mathbf {s}} | \varvec{\mathbf {s}} + \varvec{\mathbf {e}})&\ge \tilde{H}_\infty (\varvec{\mathbf {s}}) - \log \left( \int _{\varvec{\mathbf {y}}} \max _{\varvec{\mathbf {s}}^*} p_{\varvec{\mathbf {e}}}(\varvec{\mathbf {y}} - \varvec{\mathbf {s}}^*) d\varvec{\mathbf {y}} \right) \\&\ge \tilde{H}_\infty (\varvec{\mathbf {s}}) - n \cdot \log (q/\sigma _1) - 1. \end{aligned}$$
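For intuition about the magnitudes involved, the following sketch evaluates the technical condition of Lemma 5.2 and the resulting entropy requirement of Corollary 5.3 for one illustrative parameter set; the constant C is again an unspecified placeholder, the \(\omega (\log (\lambda ))\) slack is omitted, and the numbers are not recommendations.

```python
# Illustrative magnitudes for Lemma 5.2 / Corollary 5.3 (sketch; parameters made up).
from math import log, log2, sqrt, pi

q, n, k, sigma1 = 2**32, 1024, 256, 2**10

print(q / sigma1 >= sqrt(log(4 * n) / pi))     # technical condition of Lemma 5.2: True

loss = n * log2(q / sigma1) + 1                # entropy lost to the smoothing term e1 (Lemma 5.2)
required = k * log2(q) + n * log2(q / sigma1)  # requirement of Corollary 5.3 (dominant terms)
maximum = n * log2(q)                          # largest possible min-entropy of s

print(f"loss ~ {loss:.0f} bits, requirement ~ {required:.0f} of at most {maximum:.0f} bits")
```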

5.2 Short Secrets

We will now turn to the case where the secret has bounded norm.

Lemma 5.4

Let n be an integer, let q be a modulus, and let \(\sigma _1 > 0\) be a Gaussian parameter. Assume that \(\varvec{\mathbf {s}}\) is a random variable on \(\mathbb {Z}_q^n\) such that \(\Vert \varvec{\mathbf {s}} \Vert \le r\) for a parameter \(r= r(\lambda )\). Let \(\varvec{\mathbf {e}}_1 \sim D_{\sigma _1} \mod q\). Then it holds that

$$ \tilde{H}_\infty (\varvec{\mathbf {s}} | \varvec{\mathbf {s}} + \varvec{\mathbf {e}}_1) \ge \tilde{H}_\infty (\varvec{\mathbf {s}}) - \sqrt{2 \pi n} \cdot \frac{r}{\sigma _1} \log (e). $$

In particular, if \(\sigma _1 > \sqrt{n} \cdot r\), then \(\tilde{H}_\infty (\varvec{\mathbf {s}} | \varvec{\mathbf {s}} + \varvec{\mathbf {e}}_1) \ge \tilde{H}_\infty (\varvec{\mathbf {s}}) - \pi \log (e)\). We can instantiate Theorem 4.1 with Lemma 5.4, obtaining the following corollary.

Corollary 5.5

Let C be a global constant. Let \(q = q(\lambda )\) be a modulus and let \(n,m,k = \mathsf {poly}(\lambda )\). Let \(\gamma = \gamma (\lambda ) > 0\) and \(\sigma _1 = \sigma _1(\lambda ) > 0\). Assume that \(\mathcal {S}\) is an \(r\)-bounded distribution with \(\tilde{H}_\infty (\varvec{\mathbf {s}}) > k \cdot \log (2 C \cdot \gamma \cdot \sigma _1) + \sqrt{2 \pi n} \cdot \frac{r}{\sigma _1} \log (e) + \omega (\log (\lambda ))\). Now let \(\sigma > C \cdot \sqrt{m} \sigma _1 \cdot \gamma \). Then \(\mathsf {ent\text {-}LWE}(q,n,m,\mathcal {S},D_\sigma )\) is hard, provided that \(\mathsf {dLWE}(q,k,D_{\mathbb {Z},\gamma })\) is hard.

Proof

(of Lemma 5.4). Fix some \(\sigma _2 > \sigma _1\). Since \(\Vert \varvec{\mathbf {s}} \Vert \le r\), it holds that

$$\begin{aligned} \int _{\varvec{\mathbf {y}}} \max _{\varvec{\mathbf {s}}^*} p_{\varvec{\mathbf {e}}}(\varvec{\mathbf {y}} - \varvec{\mathbf {s}}^*) d\varvec{\mathbf {y}}&= \frac{1}{\rho _{\sigma _1}({\mathbb R}^n)} \int _{\varvec{\mathbf {y}}} \max _{\varvec{\mathbf {s}}^*} \hat{\rho }_{q \mathbb {Z}^n, \sigma _1}(\varvec{\mathbf {y}} - \varvec{\mathbf {s}}^*) d\varvec{\mathbf {y}}\\&\le \frac{1}{\rho _{\sigma _1}({\mathbb R}^n)} \int _{\varvec{\mathbf {y}}} \max _{\varvec{\mathbf {s}}^*} e^{\pi \frac{\Vert \varvec{\mathbf {s}}^*\Vert ^2}{\sigma _2^2 - \sigma _1^2}} \cdot \hat{\rho }_{q \mathbb {Z}^n,\sigma _2}(\varvec{\mathbf {y}}) d\varvec{\mathbf {y}}\\&\le \frac{1 }{\rho _{\sigma _1}({\mathbb R}^n)} \cdot e^{\pi \frac{r^2}{\sigma _2^2 - \sigma _1^2}} \cdot \int _{\varvec{\mathbf {y}}} \hat{\rho }_{q\mathbb {Z}^n,\sigma _2}(\varvec{\mathbf {y}}) d\varvec{\mathbf {y}}\\&= e^{\pi \frac{r^2}{\sigma _2^2 - \sigma _1^2}} \cdot \frac{\rho _{\sigma _2}({\mathbb R}^n)}{\rho _{\sigma _1}({\mathbb R}^n)}\\&= e^{\pi \frac{r^2}{\sigma _2^2 - \sigma _1^2}} \cdot \left( \frac{\sigma _2}{\sigma _1} \right) ^n \end{aligned}$$

Now, setting \(\sigma _2 = \sigma _1 \cdot \sqrt{1 + \eta }\) we get that

$$\begin{aligned} \int _{\varvec{\mathbf {y}}} \max _{\varvec{\mathbf {s}}^*} p_{\varvec{\mathbf {e}}}(\varvec{\mathbf {y}} - \varvec{\mathbf {s}}^*) d\varvec{\mathbf {y}}&\le e^{\pi \frac{r^2}{\sigma _2^2 - \sigma _1^2}} \cdot \left( \frac{\sigma _2}{\sigma _1} \right) ^n&= e^{\pi \frac{r^2}{\eta \sigma _1^2}} \cdot (1 + \eta )^{n/2}&\le e^{\pi \frac{r^2}{\eta \sigma _1^2} + \frac{n \eta }{2}} \end{aligned}$$

By Lemma 5.1, we can conclude that

$$ \tilde{H}_\infty (\varvec{\mathbf {s}} | \varvec{\mathbf {s}} + \varvec{\mathbf {e}}_1) \ge \tilde{H}_\infty (\varvec{\mathbf {s}}) - \left( \pi \frac{r^2}{\eta \sigma _1^2} + \frac{n \eta }{2} \right) \log (e). $$

Recall that \(\eta \) is still a free parameter. This expression is minimized by choosing \(\eta = \sqrt{\frac{2 \pi }{n}}\frac{r}{\sigma _1}\), which yields

$$ \tilde{H}_\infty (\varvec{\mathbf {s}} | \varvec{\mathbf {s}} + \varvec{\mathbf {e}}_1) \ge \tilde{H}_\infty (\varvec{\mathbf {s}}) - \sqrt{2 \pi n} \cdot \frac{r}{\sigma _1} \log (e). $$
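The choice of \(\eta \) can also be checked symbolically. The sketch below (not part of the paper) verifies with sympy that \(\eta = \sqrt{2\pi /n} \cdot r/\sigma _1\) is the unique positive stationary point of \(\pi r^2/(\eta \sigma _1^2) + n\eta /2\) and that the value at this point is \(\sqrt{2\pi n} \cdot r/\sigma _1\).

```python
# Symbolic check (sketch) of the optimization over eta in the proof of Lemma 5.4.
import sympy as sp

n, r, sigma1, eta = sp.symbols('n r sigma1 eta', positive=True)
f = sp.pi * r**2 / (eta * sigma1**2) + n * eta / 2

eta_star = sp.solve(sp.diff(f, eta), eta)[0]                # unique positive stationary point
print(sp.simplify(eta_star - sp.sqrt(2 * sp.pi / n) * r / sigma1))              # 0
print(sp.simplify(f.subs(eta, eta_star) - sp.sqrt(2 * sp.pi * n) * r / sigma1)) # 0
```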

6 Tightness of the Result

In this Section, we will show that for general moduli and general min-entropy distributions our result is tight up to polynomial factors.

For a modulus q and a noise parameter \(\sigma \), we will provide an example of a distribution \(\mathcal {S}\) with min-entropy \(\approx n \cdot \log (q/\sigma )\) such that \(\mathsf {ent\text {-}LWE}(q,n,m,\mathcal {S},\chi )\) is easy. For this counter-example, the choice of the modulus q is critical.

Lemma 6.1

Let \(q = q(\lambda )\) be a modulus with a divisor \(p > 2B + 1\), let \(n,m = \mathsf {poly}(\lambda )\) and let \(\chi \) be a B-bounded error-distribution. Define the distribution \(\mathcal {S}\) to be the uniform distribution on \(p \cdot \mathbb {Z}_q^n\). Then there exists an efficient algorithm \(\mathcal {A}\) that solves \(\mathsf {ent\text {-}LWE}(q,n,m,\mathcal {S},\chi )\).

Corollary 6.2

There exist moduli q and distributions \(\mathcal {S}\) with min-entropy \(\ge n \cdot (\log (q/\sigma ) - 2\log (\log (\lambda )))\) such that \(\mathsf {ent\text {-}LWE}(q,n,m,\mathcal {S},D_{\sigma })\) is easy.

The corollary follows from Lemma 6.1 by choosing p such that \(p = 2\log (\lambda ) \cdot \sigma + 1\) and noting that a Gaussian of parameter \(\sigma \) is \(\log (\lambda ) \cdot \sigma \)-bounded, except with negligible probability. Moreover, for this choice of p the distribution \(\mathcal {S}\) in Lemma 6.1 has min-entropy \(n \cdot \log (q / p) \ge n \cdot (\log (q / \sigma ) - 2 \log (\log (\lambda )))\).

Proof

(of Lemma 6.1). Assume that reduction modulo p computes a central residue class representative in \([-p/2,p/2]\). The algorithm \(\mathcal {A}\) proceeds as follows.

  • \({{\mathcal {A}(\mathbf {A},\varvec{\mathbf {y}})}}\) :

    • Compute \(\varvec{\mathbf {e}} \leftarrow \varvec{\mathbf {y}} \mod p\).

    • Solve the equation system \(\varvec{\mathbf {s}} \cdot \mathbf {A} = \varvec{\mathbf {y}} - \varvec{\mathbf {e}}\) for \(\varvec{\mathbf {s}}\), e.g. via Gaussian elimination.

    • Output \(\varvec{\mathbf {s}}\).

    To see that the algorithm \(\mathcal {A}\) is correct, note that

    $$ \varvec{\mathbf {y}} \mod p = (\varvec{\mathbf {s}} \cdot \mathbf {A} + \varvec{\mathbf {e}}) \mod p = (p \cdot \varvec{\mathbf {r}} \cdot \mathbf {A} + \varvec{\mathbf {e}}) \mod p = \varvec{\mathbf {e}} $$

as \(p > 2B\) and \(\Vert \varvec{\mathbf {e}} \Vert _\infty \le B\), so every coordinate of \(\varvec{\mathbf {e}}\) is its own central representative modulo p.
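The attack is simple enough to run on toy parameters. The following sketch is illustrative only: the concrete values of q, p, n, m, B are chosen here and not taken from the text, and instead of searching for an invertible column-submatrix it simply resamples \(\mathbf {A}\) until its first n columns are invertible modulo q.

```python
# Toy implementation (sketch) of the attack of Lemma 6.1; parameters are illustrative.
import random
from sympy import Matrix

q, p, n, m, B = 2**16, 2**6, 8, 16, 20       # p divides q and p > 2B + 1
rng = random.Random(1)

# Resample A until its first n columns form a submatrix that is invertible mod q.
while True:
    A = Matrix(n, m, lambda i, j: rng.randrange(q))
    try:
        A_inv = A[:, :n].inv_mod(q)
        break
    except ValueError:
        continue

s = Matrix(1, n, lambda i, j: p * rng.randrange(q // p))    # secret uniform on p * Z_q^n
e = Matrix(1, m, lambda i, j: rng.randrange(-B, B + 1))     # B-bounded noise
y = (s * A + e).applyfunc(lambda x: x % q)

def centered_mod(x, mod):
    """Residue of x modulo `mod`, represented in [-mod/2, mod/2)."""
    return (x + mod // 2) % mod - mod // 2

e_rec = y.applyfunc(lambda x: centered_mod(x, p))           # s * A vanishes modulo p
lhs = (y - e_rec).applyfunc(lambda x: x % q)
s_rec = (lhs[:, :n] * A_inv).applyfunc(lambda x: x % q)     # solve s * A_I = (y - e)_I

print(e_rec == e, s_rec == s)                               # True True
```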

7 Barriers for Entropic LWE

In the last Section we provided an attack on entropic LWE when the min-entropy of the secret is below \(n \cdot \log (q / \sigma )\) for a worst-case choice of the modulus q. One might still hope that for more benign choices of the modulus q this problem might be hard in this entropy regime. In this section we will provide a barrier for the hardness of entropic LWE in this regime for any modulus. In particular, we will show that for entropies below \(n \cdot \log (q/\sigma )\), the hardness of entropic LWE does not follow from any standard assumption in a black-box way. This leaves open the possibility that in this regime the hardness of entropic LWE may be established from more exotic knowledge assumptions. To establish our result, we will use a framework developed by Wichs [Wic13].

7.1 Simulatable Attacks

We first recall the notion of cryptographic games as a way to characterize cryptographic standard assumptions due to Haitner and Holenstein [HH09]. This characterization captures essentially all falsifiable assumptions [Nao03] used in cryptography, such as LWE.

Definition 7.1

(Cryptographic Games [HH09]). A cryptographic game \(\mathcal{C}= (\varGamma ,c)\) is defined by a (possibly inefficient) randomized machine \(\varGamma \), called the challenger, and a constant \(c \in [0,1)\). On input a security parameter \(1^\lambda \), the challenger interacts with an attacker \(\mathcal {A}(1^\lambda )\) and outputs a bit b. Denote this by \(\varGamma (1^\lambda ) \leftrightarrows \mathcal {A}(1^\lambda )\). The advantage of an attacker \(\mathcal {A}\) against \(\mathcal{C}\) is defined by

$$ \mathsf {Adv}_{\mathcal{C}}^{\mathcal {A}}(1^\lambda ) = \Pr [(\varGamma (1^\lambda ) \leftrightarrows \mathcal {A}(1^\lambda )) = 1] - c. $$

We say that a cryptographic game \(\mathcal{C}\) is secure if for all PPT attackers \(\mathcal {A}\) the advantage \(\mathsf {Adv}_{\mathcal{C}}^{\mathcal {A}}(\lambda )\) is negligible.

Definition 7.2

(Black-Box Reduction). Let \(\mathcal{C}_1\) and \(\mathcal{C}_2\) be cryptographic games. A black-box reduction deriving the security of \(\mathcal{C}_2\) from the security of \(\mathcal{C}_1\) is an oracle PPT-machine \(\mathcal {B}^{(\cdot )}\) for which there are constants \(c,\lambda _0\) such that for all \(\lambda \ge \lambda _0\) and all (possibly inefficient, non-uniform) attackers \(\mathcal {A}_\lambda \) with advantage \(\mathsf {Adv}_{\mathcal{C}_1}^{\mathcal {A}_\lambda }(\lambda ) \ge 1/2\), we have \(\mathsf {Adv}_{\mathcal{C}_2}^{\mathcal {B}^{\mathcal {A}_\lambda }}(\lambda ) \ge \lambda ^{-c}\).

We remark that the choice of the constant 1/2 for the advantage of \(\mathcal {A}_\lambda \) is arbitrary and can be replaced by a non-negligible function (depending on \(\mathcal {A}_\lambda \)). We now recall the notion of simulatable attacks [Wic13].

Definition 7.3

(Simulatable Attacks [Wic13]). An \(\epsilon \)-simulatable attack on an assumption \(\mathcal {C}\) is a tuple \((\mathcal {A}, \mathsf {Sim})\) such that \(\mathcal {A}\) is a stateless, non-uniform, possibly inefficient attacker against \(\mathcal{C}\), and \(\mathsf {Sim}\) is a stateful PPT simulator. We require the following two properties to hold.

  • The (inefficient) attacker \(\mathcal {A}\) successfully breaks \(\mathcal {C}\) with advantage \(1 - \mathsf {negl}(\lambda )\).

  • For every (possibly inefficient) oracle machine \(\mathcal {M}^{(\cdot )}\) making at most q queries to its oracle it holds that

$$ | \Pr [\mathcal {M}^{\mathcal {A}(1^\lambda ,1)}(1^\lambda ) = 1] - \Pr [\mathcal {M}^{\mathsf {Sim}(1^\lambda )}(1^\lambda ) = 1] | \le \mathsf {poly}(q) \cdot \epsilon {,} $$

    where the probabilities are taken over all the random choices involved.

We use the shorthand simulatable attack for \(\epsilon \)-simulatable attack with some negligible \(\epsilon \).

We remark that for reasons of conceptual simplicity Wichs [Wic13] required the advantage of the simulatable adversary \(\mathcal {A}\) to be 1. But it can easily be verified that Theorem 7.4 below also works with our slightly relaxed notion, which allows the unbounded adversary to have advantage \(1 - \mathsf {negl}(\lambda )\). The following theorem by Wichs [Wic13] shows that the existence of a simulatable attack for some assumption \(\mathcal {C}_1\) implies that there cannot be a reduction \(\mathcal {B}\) which reduces the hardness of \(\mathcal {C}_1\) to any standard assumption \(\mathcal {C}_2\), where \(\mathcal {C}_1\) and \(\mathcal {C}_2\) are cryptographic games in the sense of Definition 7.1.

Theorem 7.4

([Wic13] Theorem 4.2). If there exists a simulatable attack against some assumption \(\mathcal {C}_1\) and there is a black-box reduction \(\mathcal {B}\) reducing the security of \(\mathcal {C}_1\) to some assumption \(\mathcal {C}_2\), then \(\mathcal {C}_2\) is not secure.

The idea for the proof of this theorem is simple: If an attack \(\mathcal {A}\) against \(\mathcal {C}_1\) is simulatable, then the behavior of \(\mathcal {B}^{\mathsf {Sim}}\) will be indistinguishable from \(\mathcal {B}^{\mathcal {A}}\). But since \(\mathcal {A}\) breaks \(\mathcal {C}_1\), it holds that \(\mathcal {B}^{\mathcal {A}}\) breaks \(\mathcal {C}_2\). Therefore, the efficient algorithm \(\mathcal {B}^{\mathsf {Sim}}\) must also break \(\mathcal {C}_2\), implying that \(\mathcal {C}_2\) is insecure.

7.2 A Simulatable Attack for Entropic LWE

We will now provide a simulatable attack against entropic (search-)LWE. The attack consists of a pair consisting of a min-entropy distribution \(\mathcal {S}\) and an attacker \(\mathcal {A}\). Since we want to prove a result for general min-entropy distributions, we assume that both the adversary and the min-entropy distribution \(\mathcal {S}\) are adversarially chosen. Thus, we can consider the distribution \(\mathcal {S}\) as running a coordinated attack with the attacker \(\mathcal {A}\). More importantly, any black-box reduction \(\mathcal {B}\) reducing entropic LWE to a standard assumption will only have black-box access to the distribution \(\mathcal {S}\). We remark that, to the best of our knowledge, currently all reductions in the realm of leakage-resilient cryptography only make black-box use of the distribution. Making effective non-black-box use of an adversarially chosen sampling circuit seems out of reach for current techniques. Assume in the following that \(m \ge 2n\) and let \(\chi \) be a B-bounded error distribution. Furthermore, let k be a positive integer. Consider the following attacker, consisting of the adversary \(\mathcal {A}\) and the distribution \(\mathcal {S}\).

  • The distribution \(\mathcal {S}\) is a flat distribution on a set S of size \(2^k\), where the set S is chosen uniformly at random.

  • \(\mathcal {A}_S(\mathbf {A},\varvec{\mathbf {y}})\): Given a pair \((\mathbf {A},\varvec{\mathbf {y}})\), the attacker \(\mathcal {A}\) proceeds as follows:

    • Check whether the matrix \(\mathbf {A}\) has an invertible \(n \times n\) column-submatrix; if not, abort and output \(\bot \) (this check can be performed efficiently using linear algebra).

    • Compute a set \(I \subseteq [m]\) of size n such that the column-submatrix \(\mathbf {A}_I\) is invertible (where \(\mathbf {A}_I\) is obtained by dropping all columns of \(\mathbf {A}\) that do not have indices in I).

    • Set \(\mathbf {A}' = \mathbf {A}_I\) and \(\varvec{\mathbf {y}}' = \varvec{\mathbf {y}}_I\) (i.e. \(\varvec{\mathbf {y}}'\) is \(\varvec{\mathbf {y}}\) projected to the coordinates in I).

    • Initialize a set \(S' = \emptyset \).

    • For every \(\varvec{\mathbf {s}} \in S\), check if \(\Vert \varvec{\mathbf {y}} - \varvec{\mathbf {s}}\mathbf {A} \Vert _\infty \le B\); if so, include \(\varvec{\mathbf {s}}\) in the set \(S'\).

    • Choose \(\varvec{\mathbf {s}} \leftarrow _\$S'\) uniformly at random and output \(\varvec{\mathbf {s}}\).

First observe that whenever the matrix \(\mathbf {A}\) has an invertible submatrix, \(\mathcal {A}\) has advantage 1. The probability that \(\mathbf {A}\) does not have an invertible submatrix is at most \(\log (q) \cdot 2^{n - m} \le \log (q) \cdot 2^{-n}\), which is negligible (see Sect. 2). Consequently, \(\mathcal {A}\) breaks \(\mathsf {ent\text {-}LWE}(q,n,m,\mathcal {S},\chi )\) with probability \(1 - \mathsf {negl}(\lambda )\).

We will now provide our simulator for the adversary \(\mathcal {A}\) and the distribution \(\mathcal {S}\). The simulator jointly simulates the distribution \(\mathcal {S}\) and the attacker \(\mathcal {A}\), i.e. from the interface of an oracle machine \(\mathcal {B}\), \(\mathsf {Sim}(1^\lambda ,\cdot ,\cdot )\) simulates \((\mathcal {S}(\cdot ),\mathcal {A}(\cdot ))\). The advantage of the simulator stems from having a joint view of the samples provided so far and the inputs of the adversary \(\mathcal {A}\). The main idea of our simulator is that it samples the set S lazily and keeps track of the set \(S^*\) of all samples it has given out so far. When provided with an instance \((\mathbf {A},\varvec{\mathbf {y}})\), it performs the same check as \(\mathcal {A}\), but restricted to the set \(S^*\), and therefore runs in time \(O(q)\), where q is the bound on the number of oracle queries. Recall that the simulator is stateful.

  • Simulator \(\mathsf {Sim}(1^\lambda ,\cdot ,\cdot )\):

    • Initialize a set \(S^*= \emptyset \).

    • Whenever a sample is queried from \(\mathcal {S}\), choose \(\varvec{\mathbf {s}} \leftarrow _\$\mathbb {Z}_q^n\) uniformly at random, include \(\varvec{\mathbf {s}}\) in the set \(S^*\) and output \(\varvec{\mathbf {s}}\).

    • Whenever an instance is provided to \(\mathcal {A}\), do the following:

      \(*\) Initialize a set \(S' = \emptyset \).

      \(*\) For every \(\varvec{\mathbf {s}} \in S^*\), check if \(\Vert \varvec{\mathbf {y}} - \varvec{\mathbf {s}}\mathbf {A} \Vert _\infty \le B\); if so, include \(\varvec{\mathbf {s}}\) in the set \(S'\).

      \(*\) Choose \(\varvec{\mathbf {s}} \leftarrow _\$S'\) uniformly at random and output \(\varvec{\mathbf {s}}\).
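The interplay between the inefficient attacker and the lazy-sampling simulator can be made concrete with a small sketch. It is illustrative only: the parameters and names are made up, and the invertible-submatrix check of the text is omitted, so this is not the full attack. Both parties answer an instance by searching a candidate set for a secret whose residual noise is B-bounded, but the simulator only ever searches the secrets it has already handed out.

```python
# Sketch of the brute-force attacker versus the lazy-sampling simulator
# (illustrative parameters; the invertible-submatrix check of the text is omitted).
import random

q, n, m, B = 257, 4, 8, 3
rng = random.Random(0)

def noisy_match(s, A, y):
    """Check whether || y - s*A ||_inf <= B over Z_q (using centered residues)."""
    for j in range(m):
        d = (y[j] - sum(s[i] * A[i][j] for i in range(n))) % q
        if min(d, q - d) > B:
            return False
    return True

class Attacker:
    """Inefficient attacker: knows the whole (possibly exponential-size) set S."""
    def __init__(self, S):
        self.S = S
    def attack(self, A, y):
        candidates = [s for s in self.S if noisy_match(s, A, y)]
        return rng.choice(candidates) if candidates else None

class Simulator:
    """Stateful simulator: samples secrets lazily and only checks those
    it has already handed out in response to distribution queries."""
    def __init__(self):
        self.handed_out = []
    def sample(self):
        s = tuple(rng.randrange(q) for _ in range(n))
        self.handed_out.append(s)
        return s
    def attack(self, A, y):
        candidates = [s for s in self.handed_out if noisy_match(s, A, y)]
        return rng.choice(candidates) if candidates else None

# A caller that queries the distribution once and then submits a well-formed
# instance for that secret gets the same answer from both interfaces.
sim = Simulator()
s = sim.sample()
A = [[rng.randrange(q) for _ in range(m)] for _ in range(n)]
e = [rng.randrange(-B, B + 1) for _ in range(m)]
y = [(sum(s[i] * A[i][j] for i in range(n)) + e[j]) % q for j in range(m)]
print(sim.attack(A, y) == s)              # True
print(Attacker([s]).attack(A, y) == s)    # True (S is a singleton here, for illustration)
```

This interaction is the intuition behind the indistinguishability argument: as long as the reduction only submits instances formed from secrets it obtained via the distribution oracle, the lazily sampled set behaves like the full set S.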

We will now show that the simulator \(\mathsf {Sim}\) simulates the attack \((\mathcal {A},\mathcal {S})\) with negligible error. We need the following lemma.

Lemma 7.5

Let \(\varvec{\mathbf {z}} \leftarrow _\$\mathbb {Z}_q^n\) be distributed uniformly at random. Then it holds that

$$ \Pr [\Vert \varvec{\mathbf {z}} \Vert _\infty \le B] \le ((2B+1)/q)^n. $$

Proof

Since all the components \(z_i\) of \(\varvec{\mathbf {z}}\) are distributed uniformly and independently, it holds that

$$ \Pr [\Vert \varvec{\mathbf {z}} \Vert _\infty \le B] = \prod _{i = 1}^n \Pr [|z_i| \le B] \le ((2B+1)/q)^n. $$

Theorem 7.6

Let \(\chi = \chi (\lambda )\) be a B-bounded error-distribution. Further, let \(k < n \cdot \log (q/(2B + 1)) - \omega (\log (\lambda ))\) be an integer. Let \(\bar{\mathcal {S}}\) be the family of all distributions on \(\mathbb {Z}_q^n\) with min-entropy at most k. Then, if there is a reduction \(\mathcal {B}\) from \(\mathsf {ent\text {-}LWE}(q,n,m,\bar{\mathcal {S}},\chi )\) to any cryptographic game \(\mathcal {C}\), then \(\mathcal {C}\) is not secure.

See the full version [BD20] for proof.