1 Introduction

Pseudorandom functions (PRFs), defined by Goldreich, Goldwasser, and Micali [29], are keyed functions that are indistinguishable from truly random functions given black-box access. In this work we focus on pseudorandom functions that can be represented by simple matrix branching programs; we refer to these objects as “matrix PRFs”. In the simplest setting, a matrix PRF takes a key specified by \(\ell \) pairs of \(w \times w\) matrices \( \left\{ \mathbf {M}_{i,b} \right\} _{i \in [\ell ], b \in \{0,1\}}\) where

$$\begin{aligned} \mathsf {PRF}( \left\{ \mathbf {M}_{i,b} \right\} _{i \in [\ell ], b \in \{0,1\}},\mathbf x\in \{0,1\}^\ell ) := \prod _{i=1}^{\ell } \mathbf {M}_{i,x_i} \end{aligned}$$

Matrix PRFs are attractive due to their simplicity, strong connections to complexity theory and group theory [1, 12, 44], and recent applications in program obfuscation [11, 27].

Existing Constructions. First, we note that the Naor-Reingold PRF [37] (extended to matrices in [34]) and the Banerjee-Peikert-Rosen PRF [7] may be viewed as matrix PRFs with post-processing, corresponding to group exponentiation and entry-wise rounding respectively. However, the applications we have in mind do not allow such post-processing. Instead, we turn to a more general definition of read-c matrix PRFs, where the key is specified by \(h := c \cdot \ell \) pairs of \(w \times w\) matrices \( \left\{ \mathbf {M}_{i,b} \right\} _{i \in [h], b \in \{0,1\}}\) where

$$\mathsf {PRF}( \left\{ \mathbf {M}_{i,b} \right\} _{i \in [h], b \in \{0,1\}},\mathbf x) := \mathbf u_L \cdot \prod _{i=1}^{h} \mathbf {M}_{i,x_{i \text { mod }\ell }} \cdot \mathbf u_R$$

Here, \(\mathbf u_L,\mathbf u_R\) correspond to fixed vectors independent of the key. This corresponds exactly to PRFs computable by read-c matrix branching programs. By applying Barrington’s theorem to the existing PRFs in \(\mathsf {NC}^1\), such as the two PRFs we just mentioned [7, 37], we obtain read-poly\((\ell )\) matrix PRFs based on standard assumptions like DDH and LWE.
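To fix notation, the following Python/NumPy sketch evaluates a read-c matrix PRF of the above form; the key matrices, bookend vectors, modulus, and dimensions are arbitrary placeholders chosen for illustration only, not a proposed instantiation.

```python
import numpy as np

def eval_matrix_prf(M, uL, uR, x, c, q):
    """Evaluate uL * prod_{i=1}^{c*ell} M[i, x_{i mod ell}] * uR (mod q).

    M is indexed as M[(i, b)] for step i in {0, ..., c*ell - 1} and bit b;
    step i reads input bit x[i % ell], so each bit is read c times."""
    ell = len(x)
    acc = uL.copy()
    for i in range(c * ell):
        acc = acc @ M[(i, x[i % ell])] % q
    return int((acc @ uR % q).item())

# Toy parameters: width w = 3, input length ell = 4, read-2, modulus 97.
rng = np.random.default_rng(0)
w, ell, c, q = 3, 4, 2, 97
M = {(i, b): rng.integers(0, q, size=(w, w)) for i in range(c * ell) for b in (0, 1)}
uL = rng.integers(0, q, size=(1, w))
uR = rng.integers(0, q, size=(w, 1))
print(eval_matrix_prf(M, uL, uR, [0, 1, 1, 0], c, q))
```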

This Work. In this work, we initiate a systematic study of matrix PRFs.

  • From the constructive perspective, we investigate whether there are “simpler” constructions of matrix PRFs, or hardness assumptions over matrix products that can be used to build matrix PRFs. Here “simpler” means that the matrices \(\mathbf {M}_{i,b}\) are drawn from some “natural” distribution, for instance, independently at random from the same distribution. Note that the constructions obtained by applying Barrington’s theorem [10] to PRFs in \(\mathsf {NC}^1\) yield highly correlated and structured distributions.

  • From the attacker’s perspective, the use of matrices opens the door to simple linear-algebraic attacks on the underlying hardness assumptions. By trying different linear-algebraic attacks, we would like to understand which characteristics a matrix PRF can or cannot have. These characteristics include the distribution of the underlying matrices, as well as the complexity of the underlying branching program.

  • Finally, we revisit the application of matrix PRFs to program obfuscation as a mechanism for immunizing against known attacks.

1.1 Our Contributions

Our contributions may be broadly classified into three categories, corresponding to the three lines of questions mentioned above.

Constructions. We show how to build a matrix PRF starting from simple assumptions over matrix products via the Naor-Reingold paradigm [37], and we present candidates for these assumptions. Concretely, we consider the assumption

$$\begin{aligned} \bigg ( \{\mathbf {A}_{i,b}\}_{i\in [k], b\in \{0,1\}}, \prod _{i=1}^{k} (\mathbf {A}_{i,0} \mathbf {B}), \prod _{i=1}^{k} (\mathbf {A}_{i,1}\mathbf {B}) \bigg ) \approx _c \bigg ( \{\mathbf {A}_{i,b}\}_{i\in [k], b\in \{0,1\}}, \mathbf {B}_0, \mathbf {B}_1 \bigg ) \end{aligned}$$
(1)

where the matrices \(\mathbf {A}_{i,b}\), \(\mathbf {B}\), \(\mathbf {B}_0\) and \(\mathbf {B}_1\) are uniformly random over some simple matrix groups. We clarify that the ensuing matrix PRF, while efficiently computable, requires a product of \(O(k^\ell )\) matrices, where \(\ell \) is the length of the PRF input.

Attacks. We show that any matrix PRF that is computable by a read-c, width-w branching program can be broken in time poly\((w^c)\); this means that any matrix PRF based on constant-width matrices must read each input bit \(\omega (\log (\lambda ))\) times. Our attack and the analysis are inspired by previous zeroizing attacks on obfuscation [6, 18, 23]; we also provide some simplification along the way. We note that the case of \(c=1\) appears to be folklore.

The Attack. The attack is remarkably simple: given oracle access to a function \(F: \{0,1\}^\ell \rightarrow R\),

  1. pick any \(L := w^{2c}\) distinct strings \(x_1,\ldots ,x_L \in \{0,1\}^{\ell /2}\);

  2. compute \(\mathbf {V}\in R^{L \times L}\) whose \((i,j)\)’th entry is \(F(x_i \Vert x_j)\);

  3. output \(\mathsf {rank}(\mathbf {V})\).

If F is a truly random function, then \(\mathbf {V}\) has full rank w.h.p. On the other hand, if F is computable by a read-c, width w branching program, then we show that \(F(x_i \Vert x_j)\) can be written in the form \(\langle \mathbf u_i, \mathbf v_j \rangle \) for some fixed \(\mathbf u_1,\ldots ,\mathbf u_L,\mathbf v_1,\ldots ,\mathbf v_L \in R^{w^{2c-1}}\). This means that we can write

$$\mathbf {V}= \underbrace{\begin{pmatrix} \leftarrow \mathbf u_1 \rightarrow \\ \vdots \\ \leftarrow \mathbf u_L \rightarrow \end{pmatrix}}_{L \times w^{2c-1}} \underbrace{\begin{pmatrix} \uparrow &{} &{} \uparrow \\ \mathbf v_1 &{} \cdots &{} \mathbf v_L \\ \downarrow &{} &{} \downarrow \end{pmatrix}}_{w^{2c-1} \times L}$$

which implies \(\mathsf {rank}(\mathbf {V}) \le w^{2c-1}\).

Next, we sketch how we can decompose \(F(x_i \Vert x_j)\) into \(\langle \mathbf u_i,\mathbf v_j \rangle \). This was already shown in [23, Section 4.2], but we believe our analysis is simpler and more intuitive. Consider a read-thrice branching program of width w where

$$\mathbf {M}_{x \Vert y} = \mathbf u_L \mathbf {M}^1_x \mathbf {N} ^1_y \mathbf {M}^2_x \mathbf {N} ^2_y \mathbf {M}^3_x \mathbf {N} ^3_y \mathbf u_R$$

Suppose we can rewrite \(\mathbf {M}_{x \Vert y}\) as

$$\begin{aligned} \hat{\mathbf u}_L \cdot (\mathbf {M}^1_x \mathbf {N} ^1_y) \otimes (\mathbf {M}^2_x \mathbf {N} ^2_y) \otimes (\mathbf {M}^3_x \mathbf {N} ^3_y) \cdot \hat{\mathbf u}_R = \underbrace{\hat{\mathbf u}_L \cdot (\mathbf {M}^1_x \otimes \mathbf {M}^2_x \otimes \mathbf {M}^3_x)}_{1 \times w^3} \cdot \underbrace{( \mathbf {N} ^1_y \otimes \mathbf {N} ^2_y \otimes \mathbf {N} ^3_y) \cdot \hat{\mathbf u}_R}_{w^3 \times 1} \end{aligned}$$

for some suitable choices of \(\hat{\mathbf u}_L,\hat{\mathbf u}_R\). Unfortunately, such a statement appears to be false. Nonetheless, we are able to prove a similar decomposition where we replace \(\hat{\mathbf u}_L \cdot (\mathbf {M}^1_x \otimes \mathbf {M}^2_x \otimes \mathbf {M}^3_x)\) on the left with

$$\underbrace{\mathsf {flat}\bigl ( \overbrace{\mathbf u_L \mathbf {M}^1_x \otimes \mathbf {M}^2_x \otimes \mathbf {M}^3_x}^{w^2 \times w^3} \bigr )}_{1 \times w^5}$$

where \(\mathsf {flat}\) “flattens” an \(n \times m\) matrix into a \(1 \times nm\) row vector by concatenating the rows of the input matrix.

Applications to IO. We show that augmenting the CVW18 GGH15-based IO candidate with a matrix PRF provably immunizes the candidate against known algebraic and statistical zeroizing attacks, as captured by a new and simple adversarial model.

Our IO Candidate. Our IO candidate, on input a branching program for a function \(f : \{0,1\}^\ell \rightarrow \{0,1\}\), samples random Gaussian matrices \( \left\{ \mathbf {S}_{i,b} \right\} _{i \in [h], b \in \{0,1\}}\), a random vector \( \mathbf {a} _h\) over \(\mathbb {Z}_q\), and a random matrix PRF \(\mathsf {PRF}_{\mathbf {M}} : \{0,1\}^\ell \rightarrow [0,2^\tau ]\) where \(2^\tau \ll q\), and outputs

$$\begin{aligned} \mathbf {A}_J, \left\{ \mathbf {D}_{i,b} \right\} _{i \in [h], b \in \{0,1\}} \end{aligned}$$

The construction basically follows that in [18], with the matrix PRF embedded along the diagonal. By padding the programs, we may assume that the input program and the matrix PRF share the same input-to-index function \(\varpi :\{0,1\}^\ell \rightarrow \{0,1\}^h\). Then, we have

$$\begin{aligned} \mathbf {A}_J \mathbf {D}_{\varpi (\mathbf x)} \bmod q \approx {\left\{ \begin{array}{ll} 0 \cdot \mathbf {S}_{\varpi (\mathbf x)} \mathbf {a} _h + \mathsf {PRF}_{\mathbf {M}}(\mathbf x)&{}\text{ if } f(\mathbf x) = 1\\ (\ne 0) \cdot \mathbf {S}_{\varpi (\mathbf x)} \mathbf {a} _h + \mathsf {PRF}_{\mathbf {M}}(\mathbf x) &{}\text{ if } f(\mathbf x) = 0 \end{array}\right. } \end{aligned}$$

where \(\approx \) captures an error term which is much smaller than \(2^\tau \). Functionality is straightforward: output 1 if \(\Vert \mathbf {A}_J \mathbf {D}_{\varpi (\mathbf x)}\Vert < 2^\tau \) and 0 otherwise.

Our Attack Model. We introduce the input-consistent evaluation model on GGH15-based IO candidates, where the adversary gets oracle access to

$$O_r(\mathbf x) := \mathbf {A}_J \mathbf {D}_{\varpi (\mathbf x)} \bmod q$$

instead of \(\mathbf {A}_J, \left\{ \mathbf {D}_{i,b} \right\} _{i \in [h], b \in \{0,1\}}\). Basically, all known attacks on GGH15-based IO candidates (including the rank attack and statistical zeroizing attacks [18, 19]) can be implemented in this model. In fact, many of these attacks only make use of the low-norm quantities \(\{ O_r(\mathbf x) : f(\mathbf x) = 1 \}\), which are also referred to as encodings of zeros, and hence the name zeroizing attacks.

Note that our model allows the adversary to perform arbitrary polynomial-time computation on the output of \(O_r(\cdot )\), whereas the “weak multi-linear map model” in [11] only allows for algebraic computation over these quantities. The latter does not capture computing the norm of these quantities, as was done in the recent statistical zeroizing attacks [19]. In fact, we even allow the adversary access to \(\{ \mathbf {A}_J \mathbf {D}_{\varpi (\mathbf x)} \bmod q : f(\mathbf x) = 0\}\), quantities which none of the existing attacks takes advantage of, except for some attacks [18, 21] on a simple GGH15 obfuscation [31]. In fact, the class of adversaries that only perform such evaluations appears to capture all known attacks on GGH15-based obfuscation.

We clarify that our attack model does not capture so-called mixed-input attacks, where the adversary computes \(\mathbf {A}_J \mathbf {D}_{\mathbf x'} \bmod q\) for some \(\mathbf x' \notin \varpi (\{0,1\}^\ell )\). As in prior works, we make sure that such quantities do not have small norm, by pre-processing the branching program to reject all \(\mathbf x' \notin \varpi (\{0,1\}^\ell )\) (see Construction of Subprograms in Sect. 6.1 for details).

Analysis. We show that for our IO candidate, we can simulate oracle access to \(O_r(\cdot )\) given oracle access to \(f(\cdot )\) under the LWE assumption (which in particular implies the existence of matrix PRFs). This basically says that our IO candidate achieves “virtual black-box security” in the input-consistent evaluation model.

The proof strategy is quite simple: we hide the lower bits using the embedded matrix PRFs, and hide the higher bits using lattice-based PRFs [7, 14]. In more detail, observe that the lower \(\tau \) bits of \(O_r(\cdot )\) are pseudorandom, thanks to the pseudorandomness of \(\mathsf {PRF}_\mathbf {M}(\cdot )\). We can then simulate the higher \(\log q - \tau \) bits exactly as in [18]:

  • if \(f(\mathbf x) = 1\), then these bits are just 0.

  • if \(f(\mathbf x) = 0\), then we can just rely on the pseudorandomness of existing LWE-based PRFs [7, 14], which tells us that the higher \(\log q - \tau \) bits of \(\mathbf {S}_{\varpi (\mathbf x)} \mathbf {a} _h\) are pseudorandom.

Note that the idea of embedding a matrix PRF into an IO candidate already appeared in [27, Section 1.3]; however, the use of a matrix PRF for “noise flooding” the encodings of zeros and the lower-order bits as in our analysis (while perfectly natural in hindsight) appears to be novel to this work. In prior works [11, 27], the matrix PRF is merely used to rule out non-trivial algebraic relations amongst the encodings of zeros, namely that there is no low-degree polynomial that vanishes over a large number of pseudorandom values.

1.2 Discussion

Implications for IO. Our results demonstrate new connections between matrix PRFs and IO, and shed new light on existing IO constructions and candidates:

  • Many candidates for IO follow the template laid out in [26]: start out with a branching program \( \left\{ \mathbf {M}_{i,b} \right\} _{i \in [h], b \in \{0,1\}}\), perform some pre-processing, and encode the latter using graded encodings. To achieve security in the generic group model [9] or to defend against the rank attack [18], the pre-processing would add significant redundancy or blow up the length of the underlying branching program. In particular, even if we start out with a read-once branching program as considered in [31], the program we encode would be a read-\(\ell \) (e.g. for so-called dual-input branching programs) or read-\(\lambda \) branching program. But, why read-\(\ell \) or read-\(\lambda \)? Our results (both translating existing IO attacks to attacks on matrix PRFs, and showing how to embed a matrix PRF to achieve resilience against existing attacks) suggest that the blow-up is closely related to the complexity of computing matrix PRFs.

  • A recent series of works demonstrated a close connection between building functional encryption (and thus IO) and constructing low-degree pseudorandom generators (PRGs) over the integers [2, 5, 35], where the role of the PRGs is to flood any leakage from the error term during FHE decryption [30]. Here, we show how to exploit matrix PRFs (again over the integers) to flood any leakage from the error term in the GGH15 encodings (but unlike the setting of PRGs, we do not require the output of the PRFs to have a polynomially bounded range). Both lines of work point to understanding pseudorandomness over the integers as a crucial step towards building IO.

  • Our results suggest new avenues for attacks using input-inconsistent evaluations, namely to carefully exploit the quantities \(\{ \mathbf {A}_J \mathbf {D}_{\mathbf x'} \bmod q : \mathbf x' \notin \varpi (\{0,1\}^\ell )\}\) instead of the input-consistent evaluations.

We note that our attacks also play a useful pedagogical role: explaining the core idea of existing zeroizing attacks on IO in the much simpler context of breaking pseudorandomness of matrix PRFs.

Additional Related Works. Let us remark that recently Boneh et al. [13] also looked for (weak) PRFs with simple structures, albeit with a different flavor of simplicity. Their candidates in fact rely on a change of modulus, which is what we are trying to avoid.

2 Preliminaries

Notations and Terminology. Let \(\mathbb {R}, \mathbb {Z}, \mathbb {N}\) be the sets of real numbers, integers, and positive integers, respectively. Denote \(\mathbb {Z}/(q\mathbb {Z})\) by \(\mathbb {Z}_q\). For \(n\in \mathbb {N}\), let \([n] := \left\{ 1, ..., n \right\} \). A vector in \(\mathbb {R}^n\) (represented in column form by default) is written as a bold lower-case letter, e.g. \( \mathbf {v} \). For a vector \( \mathbf {v} \), the \(i^{th}\) component of \( \mathbf {v} \) will be denoted by \(v_i\). A matrix is written as a bold capital letter, e.g. \( \mathbf {A} \). The \(i^{th}\) column vector of \( \mathbf {A} \) is denoted \( \mathbf {a} _i\).

Subset products (of matrices) appear frequently in this article. For a given \(h\in \mathbb {N}\) and a bit-string \( \mathbf {v} \in \{0,1\}^h\), we use \( \mathbf {X} _{ \mathbf {v} }\) to denote \(\prod _{i\in [h]} \mathbf {X} _{i,v_{i}}\) (it is implicit that \( \left\{ \mathbf {X} _{i, b} \right\} _{i\in [h], b\in \{0,1\}}\) are well-defined).

The tensor product (Kronecker product) for matrices \( \mathbf {A} \in \mathbb {R}^{\ell \times m}\), \( \mathbf {B} \in \mathbb {R}^{n\times p}\) is defined as

$$\begin{aligned} \mathbf {A} \otimes \mathbf {B} = \begin{bmatrix} a_{1,1} \mathbf {B} &{} \ldots &{} a_{1,m} \mathbf {B} \\ \vdots &{} \ddots &{} \vdots \\ a_{\ell ,1} \mathbf {B} &{} \ldots &{} a_{\ell ,m} \mathbf {B} \end{bmatrix}\in \mathbb {R}^{\ell n\times mp}. \end{aligned}$$
(2)

For matrices \( \mathbf {A} \in \mathbb {R}^{\ell \times m}\), \( \mathbf {B} \in \mathbb {R}^{n\times p}\), \( \mathbf {C} \in \mathbb {R}^{m\times u}\), \( \mathbf {D} \in \mathbb {R}^{p\times v}\),

$$\begin{aligned} ( \mathbf {A} \mathbf {C} ) \otimes ( \mathbf {B} \mathbf {D} ) = ( \mathbf {A} \otimes \mathbf {B} )\cdot ( \mathbf {C} \otimes \mathbf {D} ). \end{aligned}$$
(3)
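The mixed-product property (3) is used repeatedly in the attacks below; as a quick sanity check, it can be verified numerically (a minimal NumPy sketch with arbitrary dimensions):

```python
import numpy as np

rng = np.random.default_rng(1)
l, m, n, p, u, v = 2, 3, 4, 2, 3, 5
A, B = rng.random((l, m)), rng.random((n, p))
C, D = rng.random((m, u)), rng.random((p, v))

# (A C) ⊗ (B D) == (A ⊗ B)(C ⊗ D)
assert np.allclose(np.kron(A @ C, B @ D), np.kron(A, B) @ np.kron(C, D))
```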

Matrix Rings/Groups. Let \(M_n(R)\) denote a matrix ring, i.e., the ring of \(n \times n\) matrices with coefficients in a ring R. When \(M_n(R)\) is called a matrix group, we consider matrix multiplication as the group operation. By default we assume R is a commutative ring with unity. The rank of a matrix \( \mathbf {M} \in M_n(R)\) refers to its R-rank.

Let \(\mathsf {GL}(n, R)\) be the group of units in \(M_n(R)\), i.e., the group of invertible \(n \times n\) matrices with coefficients in R. Let \(\mathsf {SL}(n, F)\) be the group of \(n \times n\) matrices with determinant 1 over a field F. When \(q = p^k\) is a prime power, let \(\mathsf {GL}(n, q)\), \(\mathsf {SL}(n, q)\) denote the corresponding matrix groups over the finite field \(\mathbb {F}_q\).

Cryptographic Notions. In cryptography, the security parameter (denoted as \(\lambda \)) is a variable that is used to parameterize the computational complexity of the cryptographic algorithm or protocol, and the adversary’s probability of breaking security. An algorithm is “efficient” if it runs in (probabilistic) polynomial time in \(\lambda \).

When a variable v is drawn uniformly at random from the set S, we write \(v{\mathop {\leftarrow }\limits ^{\$}}S\) or \(v\leftarrow U(S)\), sometimes abbreviated as v when the context is clear. We use \(\approx _s\) and \(\approx _c\) as abbreviations for statistical closeness and computational indistinguishability.

Definition 2.1

(Pseudorandom function [29]). A family of deterministic functions \(\mathcal {F}= \left\{ F_k: D_\lambda \rightarrow R_\lambda \right\} _{\lambda \in {\mathbb {N}}}\) is pseudorandom if for every probabilistic polynomial-time adversary \(\mathsf {Adv}\) there exists a negligible function \(\mathop {{\text {negl}}}(\cdot )\) such that

$$\begin{aligned} \left| \Pr _{k, \mathsf {Adv}}[ \mathsf {Adv}^{F_k(\cdot )}(1^\lambda ) = 1 ] - \Pr _{O, \mathsf {Adv}}[ \mathsf {Adv}^{O(\cdot )}(1^\lambda ) = 1 ] \right| \le \mathop {{\text {negl}}}(\lambda ), \end{aligned}$$

where \(O(\cdot )\) denotes a truly random function.

3 Direct Attacks on Matrix PRFs

In this section we take the attacker’s point of view and examine the basic characteristics that a matrix PRF should (or should not) have. Let \(\mathbb {G}= M_w(R)\), \(h = c\cdot \ell \). We consider read-c matrix PRFs of the form:

$$\begin{aligned} F: \{0,1\}^\ell \rightarrow R, ~~~ x \mapsto \mathbf u_L \cdot \prod _{i=1}^{h} \mathbf {M}_{i,x_{i \text { mod }\ell }} \cdot \mathbf u_R \end{aligned}$$
(4)

where \(\mathbf u_L, \mathbf u_R\) denote the left and right bookend vectors. The seed is given by

$$\mathbf u_L, \{ \mathbf {M}_{i,b}\in \mathbb {G}\}_{i \in [ h ], b \in \{0,1\}}, \mathbf u_R .$$

3.1 Rank Attack

We describe the rank attack, which runs in time and space \(w^{O(c)}\), where w is the dimension of the \(\mathbf {M}\) matrices and c is the number of times each input bit is read across the branching program steps. The attack originates from the zeroizing attack plus tensoring analysis in the obfuscation literature [6, 18, 23].

The main idea of the attack is to form a matrix from the evaluations on different inputs. We argue that the rank of such a matrix is bounded by \(w^{O(c)}\), whereas for a truly random function, the matrix is full-rank with high probability.

Algorithm 3.1

(Rank attack). The algorithm proceeds as follows.

  1. Let \(\rho > w^{2c-1}\). Divide the \(\ell \) input bits into 2 intervals \([\ell ] = \mathcal {X}\mid \mathcal {Y} \) such that \(|\mathcal {X} |, |\mathcal {Y} |\ge \left\lceil \log \rho \right\rceil \).

  2. For \(1\le i, j \le \rho \), evaluate the function F on \(\rho ^2\) different inputs of the form \(u^{(i,j)}=x^{(i)}\mid y^{(j)}\in \{0,1\}^{\ell }\). Let \(v^{(i,j)}\in R\) be the evaluation result on \(u^{(i,j)}\):

     $$\begin{aligned} v^{(i,j)} := F( u^{(i,j)} ) \end{aligned}$$

  3. Output the rank of the matrix \( \mathbf {V} = (v^{(i,j)})\in R^{\rho \times \rho }\).
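The sketch below implements Algorithm 3.1 in Python against an arbitrary oracle F (assumed to take a list of \(\ell \) bits and return an element of \(\mathbb {Z}_p\) for a prime p); the rank is computed over \(\mathbb {F}_p\) by Gaussian elimination, and all parameters are illustrative.

```python
import itertools

def rank_mod_p(V, p):
    """Rank of an integer matrix V over the prime field F_p (Gaussian elimination)."""
    V = [[x % p for x in row] for row in V]
    rank, rows, cols = 0, len(V), len(V[0])
    for col in range(cols):
        piv = next((r for r in range(rank, rows) if V[r][col]), None)
        if piv is None:
            continue
        V[rank], V[piv] = V[piv], V[rank]
        inv = pow(V[rank][col], -1, p)
        V[rank] = [inv * x % p for x in V[rank]]
        for r in range(rows):
            if r != rank and V[r][col]:
                V[r] = [(a - V[r][col] * b) % p for a, b in zip(V[r], V[rank])]
        rank += 1
    return rank

def rank_attack(F, ell, rho, p):
    """Algorithm 3.1: build V[i][j] = F(x^(i) | y^(j)) and output its rank over F_p."""
    half = ell // 2                       # assumes 2**half >= rho
    strings = list(itertools.product((0, 1), repeat=half))[:rho]
    V = [[F(list(x) + list(y)) for y in strings] for x in strings]
    return rank_mod_p(V, p)

# For a read-c, width-w matrix PRF the returned rank is at most w**(2*c - 1) << rho,
# whereas a truly random function yields a (nearly) full-rank V.
```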

Analysis for Read-Once Branching Programs. First we analyze the case where \(c= 1\), i.e. the function is read-once. For a truly random function, the R-rank of \( \mathbf {V} \) is \(\rho \) with non-negligible probability.

However, for the function F in Eq. (4), the R-rank of \( \mathbf {V} \) is bounded by w, since

$$\begin{aligned} \mathbf {V} = \left( \mathbf u_L \mathbf {M} _{x^{(i)}} \cdot \mathbf {M} _{y^{(j)}} \mathbf u_R\right) _{i,j\in [\rho ]} = \underbrace{\begin{pmatrix} \mathbf u_L \mathbf {M} _{x^{(1)}} \\ \vdots \\ \mathbf u_L \mathbf {M} _{x^{(\rho )}} \end{pmatrix}}_{\rho \times w} \cdot \underbrace{\begin{pmatrix} \mathbf {M} _{y^{(1)}} \mathbf u_R&\cdots&\mathbf {M} _{y^{(\rho )}} \mathbf u_R \end{pmatrix}}_{w \times \rho } \end{aligned}$$
(5)

Here we abuse the subset product notation at \( \mathbf {M} _{y^{(j)}}\) by assuming the index of the string \(y^{(j)}\) starts at the \((|\mathcal {X}|+1)^{th}\) step, for \(j\in [\rho ]\).

Analysis for Matrix PRFs with Multiple Repetitions. The analysis for read-once width w branching programs simply uses the fact that \(\mathbf {M}_{x \Vert y}\) can be written as an inner product of two vectors of length w which depend only on x and y respectively. Here, we show that for read-c width w branching programs, \(\mathbf {M}_{x \Vert y}\) can be written as an inner product of two vectors of length \(w^{2c-1}\). Note that this was already shown in [23, Section 4.2], but we believe our analysis is simpler and more intuitive.

Flattening Matrices. For a matrix \( \mathbf {A} = \begin{pmatrix} \mathbf {a} _1 \mid ... \mid \mathbf {a} _m \end{pmatrix}\in \mathbb {R}^{n\times m}\), let \(\mathsf {flat}( \mathbf {A} ) \in \mathbb {R}^{1 \times nm}\) denote the row vector formed by concatenating the rows of \( \mathbf {A} \). As it turns out, we can write

$$\begin{aligned} \mathbf {a} \mathbf {B}_1 \mathbf {B}_2 \ldots \mathbf {B}_c = \mathsf {flat}( \mathbf {a} \mathbf {B}_1 \otimes \mathbf {B}_2 \otimes \cdots \otimes \mathbf {B}_c) \mathbf {J} \end{aligned}$$
(6)

where \( \mathbf {J} \) is a fixed matrix over \(\{0,1\}\) independent of \( \mathbf {a} ,\mathbf {B}_1,\mathbf {B}_2,\ldots ,\mathbf {B}_c\). The intuition for the identity is that each entry in the row vector \( \mathbf {a} \mathbf {B}_1 \cdots \mathbf {B}_c\) is a linear combination of terms, each a product of entries in \( \mathbf {a} \mathbf {B}_1,\ldots ,\mathbf {B}_c\), which appears as an entry in \( \mathbf {a} \mathbf {B}_1 \otimes \cdots \otimes \mathbf {B}_c\).

In addition, we also have the identity

$$\begin{aligned} \mathsf {flat}(\mathbf {A}\mathbf {B}) = \mathsf {flat}(\mathbf {A}) \cdot (\mathbf {I}_n \otimes \mathbf {B}) \end{aligned}$$
(7)

where n is the height of \(\mathbf {A}\).
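Identity (7) is straightforward to verify numerically with row-major flattening; a minimal NumPy check (arbitrary dimensions):

```python
import numpy as np

def flat(A):
    """Row-major flattening of an n x m matrix into a 1 x (n*m) row vector."""
    return A.reshape(1, -1)

rng = np.random.default_rng(2)
n, m, p = 3, 4, 5
A, B = rng.random((n, m)), rng.random((m, p))

# Identity (7): flat(AB) = flat(A) (I_n ⊗ B), where n is the height of A.
assert np.allclose(flat(A @ B), flat(A) @ np.kron(np.eye(n), B))
```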

Decomposing Read-Many Branching Programs. Given a read-c branching program of width w, we can write \(\mathbf {M}_{x \Vert y}\) as

$$\begin{aligned} \mathbf {M}_{x \Vert y}&= \mathbf u_L (\mathbf {M}^1_x \mathbf {N} ^1_y) (\mathbf {M}^2_x \mathbf {N} ^2_y) \cdots (\mathbf {M}^c_x \mathbf {N} ^c_y) \mathbf u_R \\&= \mathsf {flat}\bigl ( \mathbf u_L \mathbf {M}^1_x \mathbf {N} ^1_y \otimes \cdots \otimes \mathbf {M}^c_x \mathbf {N} ^c_y \bigr ) \cdot \mathbf {J} \mathbf u_R \quad \text {(by (6))}\\&= \mathsf {flat}\bigl ( (\mathbf u_L \mathbf {M}^1_x \otimes \cdots \otimes \mathbf {M}^c_x)( \mathbf {N} ^1_y \otimes \cdots \otimes \mathbf {N} ^c_y) \bigr ) \cdot \mathbf {J} \mathbf u_R \quad \text {(mixed product)}\\&= \underbrace{\mathsf {flat}\bigl ( \mathbf u_L \mathbf {M}^1_x \otimes \cdots \otimes \mathbf {M}^c_x \bigr )}_{1 \times w^{2c-1}} \cdot \underbrace{\bigl ( \mathbf {I} _{w^{c-1}} \otimes \mathbf {N} ^1_y \otimes \cdots \otimes \mathbf {N} ^c_y \bigr ) \mathbf {J} \mathbf u_R}_{w^{2c-1} \times 1} \end{aligned}$$

That is, \(\mathbf {M}_{x \Vert y}\) can be written as an inner product of two vectors of length \(w^{2c-1}\). Therefore, the rank of \( \mathbf {V} \) is at most \(w^{2c-1}\).

Comparison With [6, 23]. We briefly mention that the previous analysis in [6, 23] works by iteratively applying the identity

$$\mathsf {flat}(\mathbf {A}\cdot \mathbf {X} \cdot \mathbf {B}) = \mathsf {flat}( \mathbf {X} ) \cdot (\mathbf {A}^{\!\scriptscriptstyle {\top }}\otimes \mathbf {B})$$

c times along with the mixed-product property to switch the order of the matrix product. (The papers refer to “vectorization” \(\mathsf {vec}\), which is the column analogue of \(\mathsf {flat}\).) Our analysis is one-shot and avoids this iterative approach, and also avoids keeping track of matrix transposes.

Open Problem. Can we prove the following generalization of the rank attack? Let g be a polynomial of total degree at most d in the variables \(x_1,\ldots ,x_n,y_1,\ldots ,y_n\) over \(\mathbb {F}_q\) (or even \(\mathbb {Z}\)), which computes a function \(\{0,1\}^{n} \times \{0,1\}^n \rightarrow \mathbb {F}_q\). Now, pick some arbitrary \(X_1,\ldots ,X_L,Y_1,\ldots ,Y_L \in \{0,1\}^n\), and consider the matrix

$$\mathbf {V}:= ( g(X_i,Y_j) ) \in \mathbb {F}_q^{L \times L}$$

Conjecture:

$$\mathsf {rank}(\mathbf {V}) \le \min \{ L, n^{O(d)} \}$$

If the conjecture is true, then we obtain an attack that works not only for matrix products, but basically any low-degree polynomial.

Here’s a potential approach to prove the conjecture (based on the analysis of the rank attack). Write g as a sum of monomials \(g_k\). We can write \(\mathbf {V}\) as a sum of matrices \(\mathbf {V}_k\) where \(\mathbf {V}_k := ( g_k(X_i,Y_j) )\). Each \(\mathbf {V}_k\) can be written as a product of two matrices, which allows us to bound the rank of \(\mathbf {V}_k\). Then, use the fact that \(\mathsf {rank}(\mathbf {V}) \le \sum _k \mathsf {rank}(\mathbf {V}_k)\). A related question is, can we use this approach to distinguish g from random low-degree polynomials? A related challenge appears in [1].

3.2 Implication of the Rank Attack

We briefly discuss the implications of the rank attack for two relevant proposals (or paradigms) for constructing efficient PRFs [12] and cryptographic hash functions [43, 44]. Both proposals use group operations over a sequence of group elements as the evaluation functions. The rank attack implies that when the underlying group \(\mathbb {G}\) admits an efficiently computable homomorphism to a matrix group \(M_n(R)\), and when each input bit chooses only a constant number of steps in the evaluation, then the resulting function is not a PRF (resp. the resulting hash function cannot be used as a random oracle).

Let us remark that our attack does not refute any explicit claims in those two proposals. It mainly serves as a sanity check for future proposals that instantiate PRFs (resp. hash functions) following those two paradigms. Let us also remark that the rank attack is preventable by adding a one-way extraction function at the end of the evaluation. But when the PRF (resp. hash function) is used inside other applications, an extraction function that is compatible with the application may not be easy to construct. As an example, when matrix PRFs are used in safeguarding branching-program obfuscators like [26, 27], it is not clear how to apply an extraction function that is compatible with the obfuscator.

Efficient PRF Based on the Conjugacy Problem. At the conference on the mathematics of cryptography at UCI in 2015, Boneh proposed a simple construction of a PRF based on the hardness of the conjugacy problem, and suggested looking for suitable non-abelian groups for which the conjugacy problem is hard [12]. If such a group is found, it might lead to a PRF that is as efficient as AES. However, even without worrying about efficiency, it is not clear how to find a group where the decisional conjugacy problem is hard.

Here is a brief explanation of the conjugacy problem and the PRF construction [12]. Let K be a non-abelian group, G be a subset of K, H be a subgroup of K. Given \(g{\mathop {\leftarrow }\limits ^{\$}}G\), \(z = h \circ g \circ h^{-1}\) where \(h{\mathop {\leftarrow }\limits ^{\$}}H\), the search conjugacy problem asks to find h.

The PRF construction relies on the following decision version of the conjugacy problem. Let m be polynomial in the security parameter. For \(h{\mathop {\leftarrow }\limits ^{\$}}H\) and \((g_1, g_2, ..., g_m){\mathop {\leftarrow }\limits ^{\$}}G^m\), the decisional problem asks to distinguish

$$\begin{aligned} g_1, h \circ g_1 \circ h^{-1}, ..., g_m, h \circ g_m \circ h^{-1} \end{aligned}$$

from 2m random elements in G.

Let the input be \(x \in \{0,1\}^\ell \), the key be \(k = g, \left\{ h_{i,b} \right\} _{i \in [\ell ], b\in \{0,1\}}\). Then the following construction is a PRF assuming the decisional conjugacy problem is hard.

$$\begin{aligned} F_k(x) := h_{\ell , x_\ell } \circ h_{\ell -1, x_{\ell -1}} \circ ... \circ h_{1, x_{1}} \circ g \circ h_{1, x_{1}}^{-1} \circ ... \circ h_{\ell , x_{\ell }}^{-1} \end{aligned}$$

The proof follows the augmented cascade technique of [15].

Note that F only has \(2\ell -1\) steps, with each input index repeated at most twice. So if G admits an efficient homomorphism to a matrix group, then the rank attack applies.

Finally, let us remark that there are candidate groups for which the search conjugacy problem is hard, e.g. the braid group [33]. But the decisional conjugacy problem over the braid group is broken precisely by using a representation as a matrix group [22].

Cryptographic Hash Functions Based on Cayley Graphs. We first recall the hard problems on Cayley graphs and their applications in building cryptographic hash functions [41]. Let \(\mathbb {G}\) be a finite non-abelian group, and \(S = \left\{ s_0, ..., s_m \right\} \) be a small generating set. The Cayley graph with respect to \((\mathbb {G}, S)\) is defined as follows: each element \(v\in \mathbb {G}\) defines a vertex; there is an edge between two vertices \(v_i\) and \(v_j\) if \(v_i = v_j\circ s\) for some \(s\in S\). The factorization problem asks to express an element of the group \(\mathbb {G}\) as a “short” product of elements from S. For certain groups and generating sets, the factorization problem is conjectured to be hard.

In 1991, Zémor [44] introduced a cryptographic hash function based on a Cayley graph with respect to the group \(\mathbb {G}= \mathsf {SL}(2, \mathbb {F}_p)\) and the set \(S = \left\{ s_0 = \begin{pmatrix}1 &{} 1\\ 0 &{} 1\end{pmatrix}, s_1 = \begin{pmatrix}1 &{} 0\\ 1 &{} 1\end{pmatrix} \right\} \). Let the input of the hash function be \(x \in \{0,1\}^\ell \). The evaluation of the hash function is simply

$$\begin{aligned} H(x) := \prod _{i = 1}^{\ell } s_{x_i}. \end{aligned}$$

The collision resistance of this function is based on the hardness of the factorization problem.
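For concreteness, the following Python sketch evaluates Zémor’s hash over \(\mathsf {SL}(2, \mathbb {F}_p)\) for an arbitrary illustrative prime p; since the evaluation is a read-once matrix product, the rank attack of Algorithm 3.1 applies directly to it.

```python
import numpy as np

p = 1000003  # illustrative prime modulus
S = {0: np.array([[1, 1], [0, 1]], dtype=np.int64),
     1: np.array([[1, 0], [1, 1]], dtype=np.int64)}

def zemor_hash(bits):
    """H(x) = prod_i s_{x_i} over SL(2, F_p), evaluated left to right."""
    H = np.eye(2, dtype=np.int64)
    for b in bits:
        H = (H @ S[b]) % p   # reduce at every step to stay within int64
    return H

print(zemor_hash([0, 1, 1, 0, 1]))
```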

The factorization problem with respect to the original proposal of Zémor was solved by [43]. Alternative proposals for the group \(\mathbb {G}\) and the generating set S have since been given (see the survey [41]). Most of the groups in these proposals are still matrix groups.

We observe that since H is read-once, if the underlying group \(\mathbb {G}\) is a matrix group, then the rank attack is able to distinguish the hash function from a random oracle.

Finally, let us clarify that the original authors of the Cayley hash function proposals do not claim to achieve random-oracle-like properties, and most analyses of Cayley graph-based hash functions focus on collision resistance (which is directly related to the factorization problem). Still, many applications of cryptographic hash functions require random-oracle-like properties (e.g. in the Fiat-Shamir transformation), so we think it is worth pointing out that Cayley graph-based hash functions do not achieve those strong properties when instantiated with matrix groups.

4 PRFs from Hard Matrix Problems

In this section, we propose plausibly hard problems related to matrix products, from which we can build a matrix PRF using the Naor-Reingold paradigm. We start from a few simple problems and explain how these problems can be solved efficiently. Then we generalize the attack methodology. Finally, we conclude with the final assumptions which survive our cryptanalytic attempts.

4.1 The Initial Attempts

First Take and the Determinant Attack. Our first assumption sets \(\mathbb {G}\) to be the group \(\mathsf {GL}(n,p)\) where we think of n as being the security parameter. Let m be an arbitrarily polynomially large integer. The assumption says that the following two distributions are computationally indistinguishable:

$$\begin{aligned} ( \mathbf {A}_1, ..., \mathbf {A}_m, (\mathbf {A}_1 \mathbf {B})^k, ..., (\mathbf {A}_m\mathbf {B})^k) \approx _c (\mathbf {A}_1, ..., \mathbf {A}_m, \mathbf {U}_1, ..., \mathbf {U}_m) \end{aligned}$$
(8)

where all the matrices are chosen uniformly at random from \(\mathsf {GL}(n,p)\).

Let us explain the choice of k. When \(k=1\), the assumption is trivially broken since we can just compute \( \mathbf {B} \) on the LHS. When k is a constant, we are still able to break the assumption using a linear algebraic technique detailed in Sect. 3. So we set k to be as large as the security parameter.

Unfortunately, even with a large k the assumption is broken, since on the LHS we have

$$\begin{aligned} \mathsf {det}((\mathbf {A}_2 \mathbf {B})^k) \cdot \mathsf {det}(\mathbf {A}_1)^k = \mathsf {det}((\mathbf {A}_1 \mathbf {B})^k) \cdot \mathsf {det}(\mathbf {A}_2)^k \end{aligned}$$

In general, any group homomorphism from \(\mathbb {G}\) to an Abelian group \(\mathcal {H}\) allows us to carry out this attack.
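The determinant relation behind the attack is easy to check directly; below is a self-contained Python sketch over \(\mathsf {GL}(2, \mathbb {F}_p)\) (the dimension, prime, and exponent are tiny illustrative choices).

```python
import random

p, k, n = 101, 64, 2   # tiny illustrative parameters

def mat_mul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) % p for col in zip(*B)] for row in A]

def mat_pow(A, e):
    R = [[int(i == j) for j in range(n)] for i in range(n)]   # identity
    while e:
        if e & 1:
            R = mat_mul(R, A)
        A, e = mat_mul(A, A), e >> 1
    return R

def det2(M):
    return (M[0][0] * M[1][1] - M[0][1] * M[1][0]) % p

def rand_invertible():
    while True:
        A = [[random.randrange(p) for _ in range(n)] for _ in range(n)]
        if det2(A):
            return A

A1, A2, B = rand_invertible(), rand_invertible(), rand_invertible()
lhs = det2(mat_pow(mat_mul(A2, B), k)) * pow(det2(A1), k, p) % p
rhs = det2(mat_pow(mat_mul(A1, B), k)) * pow(det2(A2), k, p) % p
assert lhs == rhs   # holds for the structured samples, but fails w.h.p. for random U_1, U_2
```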

Second Take and the Order Attack. The easy fix for this is to take the group to be \(\mathsf {SL}(n,p)\), the group of n-by-n matrices with determinant 1. It is known that for several choices of n and p, \(\mathsf {SL}(n,p)\) is simple, namely, it has no nontrivial proper normal subgroups. Consequently, it admits no non-trivial group homomorphisms to any Abelian group.

Fact 1

(see, e.g., [32]). The following are true about the special linear group \(\mathsf {SL}(n,p)\).

  1. The projective special linear group \(\mathsf {PSL}(n,p)\), defined as the quotient \(\mathsf {SL}(n,p)/Z(\mathsf {SL}(n,p))\), is simple for any n and p, except when \(n=2\) and \(p=2,3\). Here, Z(G) denotes the center of group G, the set of elements in G that commute with any other element of G.

  2. For n and p where \(\mathsf {gcd}(n,p-1) = 1\), the center of \(\mathsf {SL}(n,p)\) is trivial. Namely, \(Z(\mathsf {SL}(n,p)) = \{I_n\}\).

  3. As a consequence of (1) and (2) above, for \(n\ge 3\) and p such that \(\mathsf {gcd}(n,p-1) = 1\), \(\mathsf {SL}(n,p)\) is simple.

In particular, we will pick \(p=2\) and \(n\ge 3\) to be a large number.

However, we notice that there is a way to break the assumption simply using the group order.

Fact 2

(see, e.g., [32]). The order of \(\mathsf {SL}(n,p)\) is easily computable. It is

$$ r := |\mathsf {SL}(n,p)| = p^{n(n-1)/2}\cdot (p^n-1)\cdot (p^{n-1}-1) \cdot \ldots \cdot (p^2-1)$$

Therefore, when k is relatively prime to r, we can compute \(\mathbf {A}_1 \mathbf {B}\) from \((\mathbf {A}_1 \mathbf {B})^k\) as follows: let \(s = k^{-1}\bmod r\) and compute \(\left( (\mathbf {A}_1 \mathbf {B})^k\right) ^s=\mathbf {A}_1 \mathbf {B}\). Consequently, the analogous assumption for the group \(\mathsf {SL}(n,p)\) is also easily broken.
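A short Python sketch of the order attack (parameters illustrative): compute \(r = |\mathsf {SL}(n,p)|\) from Fact 2, and whenever \(\gcd (k, r) = 1\), raising the challenge to \(s = k^{-1} \bmod r\) recovers \(\mathbf {A}_1 \mathbf {B}\).

```python
from math import gcd

def sl_order(n, p):
    """|SL(n, p)| = p^{n(n-1)/2} * (p^n - 1) * (p^{n-1} - 1) * ... * (p^2 - 1)."""
    r = p ** (n * (n - 1) // 2)
    for i in range(2, n + 1):
        r *= p ** i - 1
    return r

n, p, k = 4, 2, 12347           # illustrative choices; here gcd(k, r) = 1
r = sl_order(n, p)
assert gcd(k, r) == 1
s = pow(k, -1, r)               # s = k^{-1} mod r
# Given C = (A_1 B)^k over SL(n, p), the attacker computes C^s = A_1 B
# (since the order of A_1 B divides r), and then B = A_1^{-1} (A_1 B).
```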

One may hope that the assumption holds for certain subgroups \(\mathbb {G}\subset \mathsf {GL}(n,p)\). To rule out the order attack, however, we would have to either (1) hide the order of the group \(\mathbb {G}\) or (2) fix the order of the group to have many divisors, and neither is trivial. We instead seek another way, as follows.

Summary. From the first two attempts we rule out some choices of the group and parameters. Here is a quick summary.

  • k has to be as large as the security parameter \(\lambda \) to avoid the rank attack.

  • The determinant attack can be generalized to the case when there is an (efficiently computable) homomorphism f from \(\mathbb {G}\) to an abelian group H, since it crucially relies on the fact that \(f((\mathbf {A}_2 \mathbf {B})^k) \cdot f(\mathbf {A}_1)^k = f((\mathbf {A}_1 \mathbf {B})^k) \cdot f(\mathbf {A}_2)^k \) for \(f=\mathsf {det}\). To rule out this class of attacks, we fix \(\mathbb {G}\) to be a non-abelian simple group.

  • The order attack heavily relies on the fact that one can cancel out \(\mathbf {A}_1\) at the left end of the product. We thus use multiple \(\mathbf {A}\)’s, together with a non-abelian group, to prevent this cancellation.

4.2 The First Formal Assumption and Construction

Let \(\mathbb {G}\) be a non-commutative simple group where the group elements can be efficiently represented by matrices (for example, the alternating group \(A_n\) for a polynomially large \(n\ge 5\)). Let k be as large as the security parameter \(\lambda \). Our assumption is

$$\begin{aligned} \bigg ( \{\mathbf {A}_{i,b}\}_{i\in [k], b\in \{0,1\}}, \prod _{i=1}^{k} (\mathbf {A}_{i,0} \mathbf {B}), \prod _{i=1}^{k} (\mathbf {A}_{i,1}\mathbf {B}) \bigg ) \approx _c \bigg ( \{\mathbf {A}_{i,b}\}_{i\in [k], b\in \{0,1\}}, \mathbf {B}_0, \mathbf {B}_1 \bigg ) \end{aligned}$$
(9)

where the matrices \(\{\mathbf {A}_{i,b}\}_{i\in [k], b\in \{0,1\}}\), \(\mathbf {B}\), \(\mathbf {B}_0\) and \(\mathbf {B}_1\) are chosen from \(U(\mathbb {G})\).

The PRF Construction. The family of pseudorandom functions is defined iteratively as follows.

Construction 4.1

The construction is parameterized by matrices \(\mathbf {A}_{1,0}, \mathbf {A}_{1,1}, \ldots , \mathbf {A}_{k,0}, \mathbf {A}_{k,1}\) sampled uniformly at random from \(\mathbb {G}\).

$$\begin{aligned}&\mathsf {PRF}^{(i)}(x_1x_2\ldots x_{i}) = \prod _{j=1}^k (\mathbf {A}_{j,x_i}\cdot \mathsf {PRF}^{(i-1)}(x_1x_2\ldots x_{i-1})) \\&\mathsf {PRF}^{(0)}(\epsilon ) = \mathbf {I}\end{aligned}$$

where \(\epsilon \) is the empty string and \(\mathbf {I}\) is the identity matrix.
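The sketch below evaluates Construction 4.1 in Python; for illustration the key matrices are random matrices modulo a small prime (a placeholder, not the non-commutative simple group required by the assumption), and the intermediate product from the previous level is reused, so evaluation takes \(O(k\ell )\) multiplications even though the expanded product has \(O(k^\ell )\) factors.

```python
import numpy as np

rng = np.random.default_rng(3)
q, w, k, ell = 101, 4, 8, 6    # illustrative parameters only

# Placeholder key: random w x w matrices standing in for elements of G.
A = {(j, b): rng.integers(0, q, size=(w, w)) for j in range(k) for b in (0, 1)}

def prf(x):
    """PRF^(i)(x_1..x_i) = prod_{j=1}^k (A_{j,x_i} * PRF^(i-1)(x_1..x_{i-1}))."""
    B = np.eye(w, dtype=np.int64)            # PRF^(0)(empty string) = I
    for bit in x:                            # one level per input bit
        level = np.eye(w, dtype=np.int64)
        for j in range(k):
            level = level @ A[(j, bit)] % q @ B % q   # reuse previous level's B
        B = level
    return B

print(prf([1, 0, 1, 1, 0, 0]))
```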

The proof follows a Naor-Reingold style argument and proceeds by showing, inductively, that \(\mathsf {PRF}^{(i-1)}(x_1x_2\ldots x_{i-1})\) is pseudorandom. If we now denote this matrix by \(\mathbf {B}\),

$$ \bigg (\mathsf {PRF}^{(i)}(x_1x_2\ldots 0), \mathsf {PRF}^{(i)}(x_1x_2\ldots 1)\bigg ) = \bigg ( \prod _{j=1}^k (\mathbf {A}_{j,0}\cdot \mathbf {B}), \prod _{j=1}^k(\mathbf {A}_{j,1}\cdot \mathbf {B}) \bigg ) $$

which, by Assumption 9, is pseudorandom.

4.3 Another Assumption and the Synthesizer-Based PRF Construction

In the second assumption, we still choose \(\mathbb {G}\) as a non-commutative simple group where the group elements can be efficiently represented by matrices. Let \(m_1, m_2\) be arbitrarily polynomially large integers, \(k = O(\lambda )\). Let \(\{ \mathbf {A}_{i, 1}, ..., \mathbf {A}_{i, k} \leftarrow U(\mathbb {G}^k) \}_{i\in [m_1]}\), \(\{ \mathbf {B}_{j, 1}, ..., \mathbf {B}_{j, k} \leftarrow U(\mathbb {G}^k) \}_{j\in [m_2]}\). Our assumption is

$$\begin{aligned} \bigg ( \prod _{v = 1}^{k}(\mathbf {A}_{i,v} \mathbf {B}_{j,v}) \bigg )_{i\in [m_1], j\in [m_2]} \approx _c \bigg ( \mathbf {U} _{i,j} \leftarrow U(\mathbb {G}) \bigg )_{i\in [m_1], j\in [m_2]} \end{aligned}$$
(10)

The Synthesizer-Based PRF Construction. To assist the construction of a synthesizer-based matrix PRF from Assumption (10), let us first define the lists of indices used in the induction.

Let \(k = O(\lambda )\), \(v = \left\lceil \log k \right\rceil \). Let \(\ell \in \mathop {{\text {poly}}}(\lambda )\) be the input length of the PRF. Let \(\epsilon \) denote the empty string. Let || be the symbol of list concatenation. For any list S of length t, let \(S^L\) denote the sublist of the \(\left\lfloor t/2 \right\rceil \) items from the left, let \(S^R\) denote the sublist of the \(t-\left\lfloor t/2 \right\rceil \) items from the right.

Define the initial index list as \(S_{\epsilon }:= \left\{ i_1, i_2, ..., i_\ell \right\} \). Define the “counter” list as \(C:= \left\{ a_1, ..., a_v \right\} \). Let \(r\in \{0,1\}^*\cup \epsilon \), iteratively define \(S_{r0}\) and \(S_{r1}\) as:

$$\begin{aligned} \text {if } S_r \text { is defined and }|S_r|\ge 4v, ~~~&S_{r0} := S_{r}^L || C, ~S_{r1} := S_{r}^R || C \\ \text {if } S_r \text { is defined and }|S_r|< 4v, ~~~&\bot . \end{aligned}$$

Let \(d\in \mathbb {Z}\) be the depth of the induction, i.e., any defined list \(S_r\) has \(|r|\le d\). We have \(2^d\ge \ell \ge \left( \frac{4-1}{3-1}\right) ^d = 1.5^d\). Since \(\ell \in \mathop {{\text {poly}}}(\lambda )\), we have \(2^d\in \mathop {{\text {poly}}}(\lambda )\).

Construction 4.2

The PRF is keyed by \(2^{4v} \cdot 2^d\in \mathop {{\text {poly}}}(\lambda )\) random matrices \( \left\{ \mathbf {A} _{i,S_r}\leftarrow U(\mathbb {G}) \right\} _{i\in \{0,1\}^{4v}, r\in \{0,1\}^d}\). The evaluation formula \(\mathsf {PRF}(x):=\mathsf {PRF}_{S_\epsilon }(x_1x_2\ldots x_{\ell })\) is defined inductively as

$$\begin{aligned} \text {if }|S_r|\ge 4v ~~~&\mathsf {PRF}_{S_r}(x_1x_2\ldots x_{t}) = \prod _{j=1}^k \left( \mathsf {PRF}_{S_{r0}}(x_1x_2\ldots x_{\left\lfloor t/2 \right\rceil }\tilde{j}) \cdot \mathsf {PRF}_{S_{r1}}(x_{\left\lfloor t/2 \right\rceil +1} \ldots x_{t}\tilde{j}) \right) \\ \text {if }|S_r|< 4v ~~~&\mathsf {PRF}_{S_r}(x_1x_2\ldots x_{t}) = \mathbf {A} _{x_1x_2\ldots x_{t}, S_r}. \end{aligned}$$

where \(\tilde{j}\) denotes the bit-decomposition of j.

4.4 Open Problems

Open Problem 1. In both of our PRF constructions, the numbers of steps in the final branching program (i.e., the number of matrices in each product) are super-polynomial. In Construction 4.1 it takes roughly \(O(k^\ell )\) steps; in Construction 4.2 it takes roughly \(O(k^d)\) steps. Although those PRFs are efficiently computable (the key is to reuse intermediate products), the numbers of steps are enormous. Is there a way to obtain a matrix PRF with polynomial number of steps from inductive assumptions?

Open Problem 2. Any PRF in \(\mathsf {NC}^{1}\) gives rise to a matrix PRF, with a possibly different order of products. Is there a canonical order and a canonical group such that the security of any \(\mathsf {NC}^1\) PRF can be reduced to one construction? This would possibly give us a (nice) universal PRF.

5 Matrix Attacks for the Candidate Block-Local PRG from BBKK18

A pseudorandom generator \(f: \{0,1\}^{bn}\rightarrow \{0,1\}^m\) is called \(\ell \)-block-local if the input can be separated into n blocks, each of size b bits, such that every output bit of f depends on at most \(\ell \) blocks. When roughly \(m\ge \tilde{\varOmega }(n^{\ell /2})\), there is a generic attack on \(\ell \)-block-local PRGs [8]. Specific to 3-block-local PRGs, no generic attack is known for \(m<n^{1.5}\).

In [8], the authors propose a simple candidate \(\ell \)-block-local PRG from group theory, where m can be as large as \(n^{\ell /2 - \epsilon }\). Let us recall their candidate, with \(\ell = 3\) for simplicity of description. Let \(\mathbb {G}\) be a finite group that does not have any abelian quotient group. Choose 3m random indices \( \left\{ i_{j, k}{\mathop {\leftarrow }\limits ^{\$}}[n] \right\} _{j\in [m], k\in [3]}\). The 3-block-local PRG f maps \(\mathbb {G}^n\) to \(\mathbb {G}^m\) as

$$\begin{aligned} f_j(x_1, ..., x_n) = x_{i_{j,1}}\circ x_{i_{j,2}}\circ x_{i_{j,3}}. \end{aligned}$$

In particular, the authors mentioned that \(\mathbb {G}\) can be a non-commutative simple group.

We show that when \(\mathbb {G}\) admits an efficiently computable homomorphism to a matrix group \(M_w(R)\) (e.g. when \(\mathbb {G}\) is the alternating group \(A_w\) with \(w\ge 5\)), then there is an attack that rules out certain choices of combinations of indices in f. In particular, we show that when \(\mathbb {G}\) is chosen as the alternating group, then a non-negligible fraction of the candidates (where the randomness is taken over the choices of the indices) are not PRGs.

The attack uses the fact that for any two matrices \( \mathbf {A} , \mathbf {B} \in R^{w\times w}\), \(\chi ( \mathbf {A} \mathbf {B} ) = \chi ( \mathbf {B} \mathbf {A} )\), where \(\chi \) denotes the characteristic polynomial. For simplicity let us assume the group \(\mathbb {G}\) is super-polynomially large (e.g. \(\mathbb {G}= A_w\) where \(w = O(\lambda )\)). The distinguisher tries to find four output bits whose indices are of the pattern

$$\begin{aligned} (a,b,c), (d,e,f), (b,c,d), (e,f,a) \end{aligned}$$
(11)

where the same letter denotes the same index.

Then for these four output group elements represented by matrices \( \mathbf {M} _1\), \( \mathbf {M} _2\), \( \mathbf {M} _3\), \( \mathbf {M} _4\), we always have \(\chi ( \mathbf {M} _1 \mathbf {M} _2) = \chi ( \mathbf {M} _3 \mathbf {M} _4)\) in the real case. In the random case, since we assume \(\mathbb {G}\) is super-polynomially large, the characteristic polynomials are unlikely to be equal.
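The relation exploited by the distinguisher is easy to confirm numerically: the two products in Pattern (11) are conjugate, \( \mathbf {M} _3 \mathbf {M} _4 = x_a^{-1}( \mathbf {M} _1 \mathbf {M} _2)x_a\), hence share a characteristic polynomial. A NumPy sketch, with random real matrices standing in for the matrix representation of \(\mathbb {G}\):

```python
import numpy as np

rng = np.random.default_rng(4)
w = 5
x = {name: rng.random((w, w)) for name in "abcdef"}   # stand-ins for x_a, ..., x_f

M1, M2 = x["a"] @ x["b"] @ x["c"], x["d"] @ x["e"] @ x["f"]
M3, M4 = x["b"] @ x["c"] @ x["d"], x["e"] @ x["f"] @ x["a"]

# Cyclic shift => conjugate matrices => identical characteristic polynomials.
assert np.allclose(np.poly(M1 @ M2), np.poly(M3 @ M4))
```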

Now we bound the probability for the existence of Pattern (11) if the indices are chosen randomly. The total number N of different layouts of the indices is:

$$\begin{aligned} N = n^{3m} \end{aligned}$$

The total number M of different layouts of the indices such that Pattern (11) occurs can be lower bounded by fixing Pattern (11) over 4 output bits and choosing the rest arbitrarily, i.e.

$$\begin{aligned} M \ge n^{3(m-4)} \end{aligned}$$

So \(M/N \ge n^{-12}\), which means as long as \(m\ge 4\), a non-negligible fraction of all the candidate 3-block-local-PRGs can be attacked when instantiated with \(\mathbb {G}\) as a matrix group.

The attack can be generalized to smaller \(\mathbb {G}\) and larger \(\ell \). On the positive side, the attack also seems to be avoidable by not choosing indices that form Pattern (11).

6 Candidate Indistinguishability Obfuscation

In this section we give a candidate construction of indistinguishability obfuscation \(\mathcal O\), following [11, 18, 27].

Preliminaries. A branching program \(\varGamma \) is a set

$$\begin{aligned} \varGamma = \Bigl \{\mathbf u_L^{\mathbf {P}} \in \{0,1\}^{1 \times w}, \left\{ \mathbf {P}_{i,b} \in \{0,1\}^{w\times w}\right\} _{i \in [h], b \in \{0,1\}},\mathbf u_R,\varpi :\{0,1\}^\ell \rightarrow \{0,1\}^h\Bigr \} \end{aligned}$$

where w is called the width of the branching program and \(\varpi \) the input-to-index function. We write

$$\varGamma (\mathbf x'):= {\left\{ \begin{array}{ll} \mathbf u_L \mathbf {P}_{\mathbf x'} \mathbf u_R&{}\text { if }\mathbf x' \in \{0,1\}^h\\ \mathbf u_L \mathbf {P}_{\mathbf x'} &{}\text { if }\mathbf x'\in \{0,1\}^{<h} \end{array}\right. } $$

We say that a branching program \(\varGamma \) computes a function \(f: \{0,1\}^\ell \rightarrow \{0,1\}\) if

$$\begin{aligned} \forall \mathbf x\in \{0,1\}^\ell :\varGamma (\varpi (\mathbf x)) = 0 \Longleftrightarrow f(\mathbf x) = 1 \end{aligned}$$

We particularly consider a simple input-to-index function \(\varpi :\{0,1\}^\ell \rightarrow \{0,1\}^h\) that outputs \(h/\ell \) copies of \(\mathbf x\), i.e. \(\varpi (\mathbf x) = \mathbf x|\mathbf x|\cdots |\mathbf x\). We denote \(c:=h/\ell \) and call this branching program c-input-repeating. We define an index-to-input function \(\iota :[h] \rightarrow [\ell ]\) so that \(\iota :x \mapsto (x \bmod \ell ) +1.\) For a string \(\mathbf x\in \{0,1\}^*\), we denote the length of \(\mathbf x\) by \(|\mathbf x|.\) We say that \(\mathbf x'\in \varpi (\{0,1\}^\ell )\) is input-consistent, or simply consistent.

Lattice Basics. We briefly describe the basic facts about lattice problems and trapdoor functions. For a more detailed discussion and review we refer readers to [18]. What we need for the construction is, roughly speaking, an algorithm that, given matrices \(\mathbf {A}\) and \(\mathbf {B}\) and a trapdoor \(\tau _\mathbf {A}\), samples a (random) matrix \(\mathbf {D}\) whose entries follow a discrete Gaussian distribution with small parameter such that \( \mathbf {A} \mathbf {D} = \mathbf {B} \bmod q.\) We denote this random small-norm Gaussian \(\mathbf {D}\) by \(\mathbf {A}^{-1}(\mathbf {B})\) following [18]. Readers who are not interested in the details may skip the definitions and lemmas described here, since they are only used for technical matters such as setting parameters.

We denote the discrete Gaussian distribution over \(\mathbb {Z}^n\) with parameter \(\sigma \) by \(D_{\mathbb {Z}^n,\sigma }\). Given matrix \(\mathbf {A}\in \mathbb {Z}_q^{n \times m},\) the kernel lattice of \(\mathbf {A}\) is denoted by

$$ \varLambda ^\bot (\mathbf {A}):= \{\mathbf c\in \mathbb {Z}^m: \mathbf {A}\cdot \mathbf c= \mathbf {0} ^n \bmod q \}. $$

Given \(\mathbf y\in \mathbb {Z}_q^n\) and \(\sigma >0\), we use \(\mathbf {A}^{-1}(\mathbf y,\sigma )\) to denote the distribution of a vector \(\mathbf d\) sampled from \(D_{\mathbb {Z}^m,\sigma }\) conditioned on \(\mathbf {A}\mathbf d= \mathbf y\bmod q.\) We sometimes omit \(\sigma \) when the context is clear.

Definition 6.1

(Decisional learning with errors (LWE) [42]). Let \(n,m\in \mathbb {N}\), let \(q \ge 2\) be a modulus, and let \(\theta ,\pi ,\chi \) be distributions over \(\mathbb {Z}_q\) for the secret vector, public matrices, and error vectors, respectively. An LWE sample w.r.t. these parameters is obtained by sampling \(\mathbf s\leftarrow \theta ^n\), \(\mathbf {A}\leftarrow \pi ^{n \times m},\) \(\mathbf e\leftarrow \chi ^m\) and outputting \((\mathbf {A},\mathbf s^T \mathbf {A}+ \mathbf e^T \bmod q).\)

We say that an algorithm solves \(\mathsf {LWE}_{n,m,q,\theta ,\pi ,\chi }\) if it distinguishes the LWE sample from a random sample distributed as \(\pi ^{n\times m} \times U(\mathbb {Z}_q^{1\times m})\) with non-negligible advantage.
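For concreteness, one LWE sample with \(\theta = \pi = U(\mathbb {Z}_q)\) can be generated as in the NumPy sketch below; the error is drawn by rounding a continuous Gaussian, which is only an illustrative stand-in for \(D_{\mathbb {Z},\sigma }\), and all parameters are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(5)
n, m, q, sigma = 16, 32, 3329, 3.2           # illustrative parameters

s = rng.integers(0, q, size=n)               # secret  s ~ theta^n
A = rng.integers(0, q, size=(n, m))          # public  A ~ pi^{n x m}
e = np.rint(rng.normal(0, sigma, size=m)).astype(np.int64)   # error (Gaussian stand-in)

b = (s @ A + e) % q                          # the sample is (A, s^T A + e^T mod q)
print(A.shape, b.shape)
```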

Lemma 6.2

(Standard form [16, 38, 39, 42]). Let \(n \in \mathbb {N}\), \(m = \mathop {{\text {poly}}}(n)\), and \(q \le 2^{\mathop {{\text {poly}}}(n)}\). Let \(\theta =\pi =U(\mathbb {Z}_q)\) and \(\chi =D_{\mathbb {Z},\sigma }\) where \(\sigma \ge 2\sqrt{n}\). If there exists an efficient (possibly quantum) algorithm that solves \(\mathsf {LWE}_{n,m,q,\theta ,\pi ,\chi }\), then there exists an efficient (possibly quantum) algorithm for approximating \(\mathsf SIVP\) and \(\mathsf GapSVP\) in the \(\ell _2\) norm, in the worst case, within \(\tilde{O}(nq/\sigma )\) factors.

Lemma 6.3

(LWE with small public matrices [14]). If \(n,m,q,\sigma \) are chosen as Lemma 6.2, then \(\mathsf {LWE}_{n',m,q,U(\mathbb {Z}_q),D_{\mathbb {Z},\sigma },D_{\mathbb {Z},\sigma }}\) is as hard as \(\mathsf {LWE}_{n,m,q,\theta ,\pi ,\chi }\) for \(n' \ge 2n \log q\).

Lemma 6.4

([3, 4, 28, 36]). There is a p.p.t. algorithm \(\mathsf{TrapSamp}(1^n,1^m,q)\) that, given a modulus \(q\ge 2\) and dimensions m, n such that \(m\ge 2n \log q\), outputs \(\mathbf {A}\approx _s U(\mathbb {Z}_q^{n\times m})\) with a trapdoor \(\tau .\) Further, if \(\sigma \ge 2\sqrt{n \log q},\) there is a p.p.t. algorithm that, given \((\mathbf {A},\tau )\leftarrow \mathsf{TrapSamp}(1^n,1^m,q)\) and \(\mathbf y\in \mathbb {Z}^n_q\), outputs a sample from \(\mathbf {A}^{-1}(\mathbf y,\sigma ).\) Further, it holds that

$$ \{\mathbf {A},\mathbf x,\mathbf y:\mathbf y\leftarrow U(\mathbb {Z}_q^n),\mathbf x\leftarrow \mathbf {A}^{-1}(\mathbf y,\sigma ) \} \approx _s \{\mathbf {A},\mathbf x,\mathbf y:\mathbf x\leftarrow D_{\mathbb {Z}^m,\sigma }, \mathbf y= \mathbf {A}\mathbf x\}. $$

6.1 Construction

Input. The obfuscation algorithm takes as input a c-input-repeating branching program \(\varGamma =\{\mathbf u_L \in \{0,1\}^{1\times w}, \left\{ \mathbf {P}_{i,b} \in \{0,1\}^{w\times w} \right\} _{i \in [h], b \in \{0,1\}},\mathbf u_R\}\) computing a function \(f : \{0,1\}^\ell \rightarrow \{0,1\}\).

We modify \(\varGamma \) to a new functionally equivalent branching program \(\varGamma '\) so that it satisfies \(\varGamma '(\mathbf x')\ne 0 \) for all \(\mathbf x' \notin \varpi (\{0,1\}^\ell )\) (as well as for all \(\mathbf x'\in \{0,1\}^{<h}\)). This can be done by padding an input-consistency check program into the bottom-right diagonal of \(\mathbf {P}\), which only slightly increases w and the bound on the entries. Concretely we follow the Construction of Subprograms in Sect. 6.1. For brevity, we just assume that the input program is of the form

$$\begin{aligned} \varGamma =\{\mathbf u_L \in \{0,1,\cdots ,T\}^{1\times w}, \left\{ \mathbf {P}_{i,b} \in \{0,1,\cdots ,T\}^{w\times w} \right\} _{i \in [h], b \in \{0,1\}},\mathbf u_R\} \end{aligned}$$

and assume that it satisfies the condition above without loss of generality. In particular, \(|\varGamma (\varpi (\mathbf x))| \le T\) in this construction.

Obfuscation Procedure

  • Set parameters \(n,m,q,\tau ,\nu ,B\in \mathbb {N}\) and \(\sigma \in \mathbb {R}^+\) as in Parameters (Sect. 6.1). Let \(d:=wn+5\tau + 3\ell \) be the dimension of the pre-encodings.

  • Sample a matrix PRF \(\{\mathbf u_L^\mathbf {M}\in \{0,1\}^{1 \times 5\tau },\{\mathbf {M}_{i,b}\in \{0,1\}^{5\tau \times 5\tau }\}_{i\in [h],b \in \{0,1\}} , \mathbf u_R^\mathbf {M}\in \mathbb {Z}^{5\tau \times 1}\}\) with input length \(\ell \) and c repetitions whose range is \([0,2^\tau -1].\) Concretely, we follow the Matrix PRFs construction in Sect. 6.1. By padding the programs, we may assume that the input program and the matrix PRF share the same input-to-index function \(\varpi :\{0,1\}^\ell \rightarrow \{0,1\}^h\).

  • Sample \(\left\{ \mathbf {S}_{i,b}\leftarrow D_{\mathbb {Z},\sigma }^{n\times n}\right\} _{i \in [h] , b \in \{0,1\}}\) and \({\mathbf a}_h\leftarrow U(\mathbb {Z}_q^{n\times 1})\), and compute pre-encodings as follows:

    $$\begin{aligned}&\mathbf {J}:= \begin{pmatrix}\mathbf u_L\otimes \mathbf {1} ^{1\times n}||\mathbf u_L^\mathbf {M}\end{pmatrix}, \quad \mathbf {L}:= \begin{pmatrix}\mathbf u_R\otimes \mathbf a_h\\ \mathbf u_R^\mathbf {M}\end{pmatrix},\\&\hat{ \mathbf {S} }_{i,b} := \begin{pmatrix} \mathbf {P}_{i,b} \otimes \mathbf {S} _{i,b}&{}\\ {} &{}\mathbf {M}_{i,b} \end{pmatrix}\quad \text { for }i\in [h] \end{aligned}$$

    For brevity we write \(\mathbf {S}(\mathbf x'):= \mathbf {1} ^{1 \times n} \cdot \mathbf {S}_{\mathbf x'} \cdot \mathbf a_h.\) In particular, for all \(\mathbf x' \in \{0,1\}^h\),

    $$\begin{aligned} \mathbf {J}\cdot \hat{ \mathbf {S} }_{\mathbf x'} \cdot \mathbf {L} = \bigl (\mathbf u_L \mathbf {P}_{\mathbf x'} \mathbf u_R\bigr ) \otimes \bigl ( \mathbf {1} ^{1 \times n} \mathbf {S}_{\mathbf x'} \mathbf a_h\bigr ) + \mathbf u_L^\mathbf {M}\mathbf {M}_{\mathbf x'} \mathbf u_R^\mathbf {M}= \varGamma (\mathbf x') \otimes \mathbf {S}(\mathbf x') + \mathbf u_L^\mathbf {M}\mathbf {M}_{\mathbf x'} \mathbf u_R^\mathbf {M}. \end{aligned}$$

    Note that \( \varGamma (\mathbf x')\) is a scalar, thus \(\otimes \) is just a multiplication.

  • Sample error matrices \(\mathbf {E}_{i,b}\) from \(D_{\mathbb {Z},\sigma }\) with the corresponding dimensions, sample \((\mathbf {A}_i,\tau _i)\leftarrow \mathsf{TrapSamp}(1^d,1^m,q)\) for \(i\in \{0,1,\ldots ,h-1\}\), set \(\mathbf {A}_h:= \mathbf {L}\), and compute

    $$\begin{aligned} \mathbf {D}_{i,b} \leftarrow \mathbf {A}_{i-1}^{-1}\bigl (\hat{ \mathbf {S} }_{i,b}\mathbf {A}_i + \mathbf {E}_{i,b},\sigma \bigr ) \text { for }i\in [h],\ b\in \{0,1\}, \qquad \mathbf {A}_J := \mathbf {J}\cdot \mathbf {A}_0 . \end{aligned}$$

Output. The obfuscation algorithm outputs \(\{\mathbf {A}_J, \{\mathbf {D}_{i,b}\}_{i \in [h],b\in \{0,1\}}\}\) as the obfuscated program.

Evaluation. For an input \(\mathbf x\in \{0,1\}^\ell \), return 1 if \(|\mathbf {A}_J \cdot \mathbf {D}_{\varpi (\mathbf x)}\bmod q|<B\), and 0 otherwise.

Correctness. For \(\mathbf x\in \{0,1\}^{\le h}\) with length \(h'\),

$$\begin{aligned} \mathbf {A}_J \cdot \mathbf {D}_{\mathbf x} = \mathbf {J}\cdot \hat{ \mathbf {S} }_{\mathbf x} \cdot \mathbf {A}_{h'} + \sum _{i=1}^{h'} \mathbf {J}\cdot \hat{ \mathbf {S} }_{x_1\cdots x_{i-1}} \cdot \mathbf {E}_{i,x_i} \cdot \mathbf {D}_{x_{i+1}\cdots x_{h'}} \bmod q \end{aligned}$$
(12)

where \(\mathbf {A}_h:=\mathbf {L}\). Note that all entries following the discrete Gaussian distribution are bounded by \(\sqrt{m} \sigma \) with overwhelming probability. The latter term, the GGH15 errors, can be bounded, with all but negligible probability, as follows:

$$ \Bigl \Vert \sum _{i=1}^{h'} \mathbf {J}\cdot \hat{ \mathbf {S} }_{x_1\cdots x_{i-1}} \cdot \mathbf {E}_{i,x_i} \cdot \mathbf {D}_{x_{i+1}\cdots x_{h'}} \Bigr \Vert \le (2wd) \cdot h \cdot (m\sqrt{m} \sigma \cdot wT)^{h}. $$

In particular, for \(\mathbf x'= \varpi (\mathbf x)\) and \(f(\mathbf x)=1,\) the first term is \(\mathsf {PRF}_{\mathbf {M}}(\mathbf x),\) which is bounded by \(2^\tau -1.\) We set \(B\ge 2^\tau + (2wd) \cdot h \cdot (m\sqrt{m} \sigma \cdot wT)^{h}\) so that for every \(\mathbf x\) satisfying \(f(\mathbf x)=1\) the obfuscation evaluates correctly.

We also note that, if we set \(q>B \cdot \omega (\mathop {{\text {poly}}}(\lambda ))\),

$$ \varGamma (\mathbf x') = \mathbf {0} \Longleftrightarrow \mathbf x' =\varpi (\mathbf x)\wedge f(\mathbf x)=1 $$

holds for any \(\mathbf x' \in \{0,1\}^{\le h}\) since we pad the input-consistency check program at the beginning. This implies that the random matrix \(\mathbf {A}_{h'}\) in the (partial) evaluation \(\mathbf {A}_J \cdot \mathbf {D}_{\mathbf x'}\) is not canceled. That is, the probability that the evaluation of the obfuscation outputs 1 is negligible for an incomplete or inconsistent input \(\mathbf x'\), or for an input \(\mathbf x'=\varpi (\mathbf x)\) satisfying \(f(\mathbf x)=0\).

Parameters. Our parameter settings follow [11, 18], matching the existing safety mechanisms. Let \(\lambda \) be the security parameter of the construction and \(\lambda _{\mathsf {LWE}}=\mathop {{\text {poly}}}(\lambda )\) the security parameter of the underlying LWE problem. Let \(d:=w n + 5\tau \) be the dimension of the pre-encodings. For the trapdoor functionalities, \(m=\varOmega (d \log q )\) and \(\sigma = \varOmega (\sqrt{d \log q})\) by Lemma 6.4. Set \(n= \varOmega (\lambda _\mathsf {LWE}\log q)\) and \(\sigma = \varOmega (\sqrt{\lambda _\mathsf {LWE}})\) for the security of LWE as in Lemmas 6.2 and 6.3. Set \(q\le (\sigma /\lambda _\mathsf {LWE}) \cdot 2^{\lambda _\mathsf {LWE}^{1-\epsilon }}\) for an \(\epsilon \in (0,1)\). Also, for the security proof in our model, we set \(2^\tau \ge (2wd) \cdot h \cdot (m\sqrt{m} \sigma \cdot wT)^{h} \cdot \omega (\mathop {{\text {poly}}}(\lambda )).\) On the other hand, we set \(B\ge 2^\tau + (2wd) \cdot h \cdot (m\sqrt{m} \sigma \cdot wT)^{h}\) and \(q \ge B \cdot \omega (\mathop {{\text {poly}}}(\lambda ))\) for correctness.

Construction of Subprograms

Input-Consistency Check Program. We describe a read-once branching program for checking whether \(\mathbf x' \in \varpi (\{0,1\}^\ell )\); this plays the role of so-called “bundling scalars” or “bundling matrices” in prior constructions. For \(i \in [h]\) and \(b\in \{0,1\}\), compute \(\mathbf {C}_{i,b} \in \mathbb {Z}^{3\ell \times 3\ell }\) as the \(\mathsf{diag}(\mathbf {C}_{i,b}^{(1)},\cdots ,\mathbf {C}^{(\ell )}_{i,b})\) where

$$ \mathbf {C}_{i,b}^{(k)} = {\left\{ \begin{array}{ll} \mathbf {I}^{3\times 3}&{}\text { if }\iota (i)\ne k\\ \mathsf{diag}(1,0,1)&{}\text { if }\iota (i)=k\text { and }i \le (c-1)\ell \\ \mathsf{diag}(0,1,1)&{}\text { if }\iota (i)=k\text { and }i >(c-1)\ell \end{array}\right. } $$

Let \(\mathbf u_L^\mathbf {C}=T\cdot \mathbf {1} ^{1 \times 3\ell }\) and \(\mathbf u_R^\mathbf {C}= \mathbf {1} ^{\ell \times 1}\otimes (1,1,-1)^T,\) where T is an integer satisfying \(\Vert \mathbf {P}(\mathbf x')\Vert _\infty < T\) for all \(\mathbf x' \in \{0,1\}^{\le h}\).

Then \(\{\mathbf u_L^\mathbf {C},\{\mathbf {C}_{i,b}\}_{i\in [h], b\in \{0,1\}},\mathbf u_R^\mathbf {C}\}\) is an input-consistency check program; moreover, \(\mathbf {C}(\mathbf x') + \mathbf {P}(\mathbf x') \ne \mathbf {0} \) for all \(\mathbf x' \in \{0,1\}^{h}\setminus \varpi (\{0,1\}^\ell )\) as well as all \(\mathbf x' \in \{0,1\}^{<h}.\) That is, we concretely consider

$$ \varGamma ' = \left\{ {\mathbf u'}_L =(\mathbf u_L\Vert \mathbf u_L^\mathbf {C}), \left\{ \mathbf {P}'_{i,b}=\mathsf{diag}(\mathbf {P}_{i,b},\mathbf {C}_{i,b}) \right\} _{i \in [h], b \in \{0,1\}},{\mathbf u'}_R=\begin{pmatrix}\mathbf u_R\\ \mathbf u_R^\mathbf {C}\end{pmatrix}\right\} . $$

In particular, this gives \(w_\mathsf{new} = w+3\ell \) and entry bound \(T=2w\). We also note that \(\varGamma ' (\varpi (\mathbf x)) = \varGamma (\varpi (\mathbf x)),\) so this value is likewise bounded by T.
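
The assembly of \(\varGamma '\) is mechanical; the following sketch (with hypothetical dictionaries `P` and `C` holding the two programs' matrices) builds the concatenated bookends and block-diagonal matrices exactly as displayed, so that \(\varGamma '(\mathbf x') = \mathbf {P}(\mathbf x') + \mathbf {C}(\mathbf x')\):

```python
import numpy as np

def direct_sum(A, B):
    """Block-diagonal matrix diag(A, B)."""
    out = np.zeros((A.shape[0] + B.shape[0], A.shape[1] + B.shape[1]), dtype=A.dtype)
    out[:A.shape[0], :A.shape[1]] = A
    out[A.shape[0]:, A.shape[1]:] = B
    return out

def combine_with_consistency_check(uL, P, uR, uL_C, C, uR_C, h):
    """Form Gamma' = { (uL || uL_C), diag(P_{i,b}, C_{i,b}), (uR ; uR_C) }.
    Because the two blocks never interact, evaluating Gamma' on any x'
    yields P(x') + C(x')."""
    uL_new = np.concatenate([uL, uL_C])
    uR_new = np.concatenate([uR, uR_C])
    P_new = {(i, b): direct_sum(P[(i, b)], C[(i, b)])
             for i in range(1, h + 1) for b in (0, 1)}
    return uL_new, P_new, uR_new
```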

Remark 6.5

Usual constructions of branching programs have the property that \(\mathbf u_L\cdot \mathbf {P}_{\mathbf x'} \in \{0,1\}^{1 \times w}\) for all \(\mathbf x' \in \{0,1\}^{< h}\) and \(|\mathbf {P}(\mathbf x')|\le w\); thus we can set \(T:=2w\), or set \(T=w^h\) to be safe. In our parameter setting we use \(T=2w.\)

Matrix PRFs. For concreteness, we provide the construction of the matrix PRFs used in the obfuscation given in [27, Section 4.2]. By Barrington’s theorem [10], there exist matrix PRFs that output a pseudorandom bit. WLOG, we assume each is a c-input-repetition branching program. We take \(\tau \) such programs \(\{\mathbf u_L^{(j)},\{\mathbf {M}_{i,b}^{(j)}\}_{i\in [h], b \in \{0,1\}}, \mathbf u_R^{(j)}\}_{j \in [\tau ]}\) with independent keys; note that all entries are binary. We concatenate them as

$$ \mathbf u^\mathbf {M}_L = (\mathbf u_L^{(1)}\Vert \cdots \Vert \mathbf u_L^{(\tau )}),\quad \mathbf {M}_{i,b} = \mathsf{diag} (\mathbf {M}_{i,b}^{(1)},\cdots ,\mathbf {M}_{i,b}^{(\tau )}),\quad \mathbf u^\mathbf {M}_R = \begin{pmatrix} \mathbf u_R^{(1)}\\ 2\cdot \mathbf u_R^{(2)}\\ \vdots \\ 2^{\tau -1}\cdot \mathbf u_R^{(\tau )} \end{pmatrix} $$

then \(\mathsf {PRF}_\mathbf {M}:\mathbf x\mapsto \mathbf u_L^\mathbf {M}\cdot \mathbf {M}_{\varpi (\mathbf x)} \cdot \mathbf u_R^\mathbf {M}\in [0,2^\tau -1]\) is a pseudorandom function, which is the desired construction. Note that the width of this program is \(5\tau .\)
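
A sketch of this concatenation, assuming the \(\tau \) bit-output programs are given as lists `uLs`, `Ms`, `uRs` (hypothetical names); the j-th right bookend is scaled by \(2^{j-1}\) so that the j-th bit lands at bit position \(j-1\) of the output:

```python
import numpy as np

def block_diag_all(blocks):
    """Block-diagonal matrix diag(blocks[0], ..., blocks[-1])."""
    rows = sum(B.shape[0] for B in blocks)
    cols = sum(B.shape[1] for B in blocks)
    out = np.zeros((rows, cols), dtype=blocks[0].dtype)
    r = c = 0
    for B in blocks:
        out[r:r + B.shape[0], c:c + B.shape[1]] = B
        r, c = r + B.shape[0], c + B.shape[1]
    return out

def concat_bit_prfs(uLs, Ms, uRs, h):
    """Combine tau independent {0,1}-valued matrix PRFs into a single one
    with output in [0, 2^tau - 1]."""
    tau = len(uLs)
    uL_M = np.concatenate(uLs)
    uR_M = np.concatenate([(2 ** j) * uRs[j] for j in range(tau)])  # weight 2^(j-1) in 1-indexed terms
    M = {(i, b): block_diag_all([Ms[j][(i, b)] for j in range(tau)])
         for i in range(1, h + 1) for b in (0, 1)}
    return uL_M, M, uR_M

def prf_M(uL_M, M, uR_M, x, ell, h):
    """PRF_M(x) = uL_M * M_{pi(x)} * uR_M, an integer in [0, 2^tau - 1]."""
    acc = uL_M
    for i in range(1, h + 1):
        acc = acc @ M[(i, int(x[(i - 1) % ell]))]
    return int(acc @ uR_M)
```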

6.2 Security

Security Model. We note that almost all known attacks, including the recently reported statistical zeroizing attack [19] and the rank and subtraction attacks [18], only exploit evaluations on \(\mathbf x' \in \varpi (\{0,1\}^\ell ).\) Mixed-input attacks have been considered in the literature (e.g. [26]), but so far there is only one actual attack in this class against GGH15-based obfuscation [17], and even that attack exploits several input-consistent evaluations in its first phase to extract the information needed to mount the mixed-input step. Some attacks on other multilinear maps do use mixed inputs [24, 25], but their first step either uses valid inputs [40] or decodes the multilinear map using a known weakness of the NTRU problem [20].

Motivated by this, we consider a restricted class of adversaries that only get oracle access to an input-consistent evaluation oracle

$$\begin{aligned} O_r : \mathbf x\mapsto \mathbf {A}_J \mathbf {D} _{\varpi (\mathbf x)} \bmod q, \; \forall \mathbf x\in \{0,1\}^{\ell } \end{aligned}$$

In our model, which we call the input-consistent evaluation model, the adversary's goal is to obtain any meaningful information about the implementation of \(\varGamma \) beyond its input-output behavior. More concretely, we say that the obfuscation procedure is VBB-secure in the input-consistent evaluation model if no p.p.t. adversary can distinguish the oracle \(O_r\) from the following oracle

$$\begin{aligned} F_r(\mathbf x) = {\left\{ \begin{array}{ll} U([0,2^\tau -1]) &{} \text{ if } f(\mathbf x)=1\\ U(\mathbb {Z}_q) &{} \text{ otherwise } \end{array}\right. } \end{aligned}$$
(13)

with non-negligible advantage, i.e. \(O_r(\cdot ) \approx _c F_r(\cdot ).\)
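
For reference, here is a sketch of the ideal oracle of Eq. (13), assuming access to the plaintext functionality via a hypothetical callable `f`; in an actual distinguishing game the oracle would additionally be memoized so that repeated queries on the same \(\mathbf x\) return the same value:

```python
import secrets

def ideal_oracle_F(x, f, tau, q):
    """Ideal oracle F_r: a small uniform value when f(x) = 1 (the case where
    the real evaluation is PRF_M(x) plus small noise), and a uniform element
    of Z_q otherwise (the case where the LWE-like term is not canceled)."""
    if f(x) == 1:
        return secrets.randbelow(2 ** tau)   # U([0, 2^tau - 1])
    return secrets.randbelow(q)              # U(Z_q)
```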

Theorem 6.6

The obfuscation construction \(\mathcal O\) is VBB-secure in the input-consistent evaluation model.

The main strategy is to hide the lower-order bits using the embedded matrix PRF, and the higher-order bits using the lattice-based PRF [7, 14] stated as follows.

Lemma 6.7

([18, Lemma 7.4]). Let \(h,n,q, b \in \mathbb {N}\) and \(\sigma ,\sigma ^* \in \mathbb {R}\) be such that \(n = \varOmega (\lambda \log q),\) \(\sigma = \varOmega (\sqrt{\lambda \log q})\), \(b\ge h \cdot (\sqrt{n} \sigma )^h,\) \(\sigma ^* > \omega (\mathop {{\text {poly}}}(\lambda ))\cdot b,\) and \(q \ge \sigma ^* \cdot \omega (\mathop {{\text {poly}}}(\lambda )).\) Define a function family \(\mathcal F = \{f_{\mathbf a}:\{0,1\}^{h} \rightarrow \mathbb {Z}_{q}^{n}\},\) for which the key generation algorithm samples \(\mathbf a\leftarrow U(\mathbb {Z}_q^n)\) as the private key and \(\left\{ \mathbf {S}_{i,b}\leftarrow D_{\mathbb {Z},\sigma }^{n\times n}\right\} _{i \in [h], b \in \{0,1\}}\) as the public parameters. The evaluation algorithm takes an input \(\mathbf x' \in \{0,1\}^h\) and computes

$$ f_\mathbf a(\mathbf x' ) = \left( \prod _{i=1}^h \mathbf {S}_{i,x'_i} \right) \cdot \mathbf a+\mathbf e_{\mathbf x'} = \mathbf {S}_{\mathbf x'}\cdot \mathbf a+ \mathbf e_{\mathbf x'} (\bmod q) $$

where \(\mathbf e_{\mathbf x'}\leftarrow D_{\mathbb {Z},\sigma ^*}^{n}\) is freshly sampled for every \(\mathbf x' \in \{0,1\}^h.\) Then, for \(d=\mathop {{\text {poly}}}(\lambda ),\) the distribution of evaluations \(\{f_\mathbf a(\mathbf x'_1),\cdots , f_\mathbf a(\mathbf x'_d)\}\) over the choice of \(\mathbf a\) and errors is computationally indistinguishable from d independent uniform random vectors from \(\mathbb {Z}_q^n\), assuming the hardness of \(\mathsf {LWE}_{n,\mathop {{\text {poly}}},q,U(\mathbb {Z}_q),D_{\mathbb {Z},\sigma },D_{\mathbb {Z},\sigma }}.\)
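
A direct transcription of this function family for toy parameters is given below (a sketch only: the discrete Gaussian \(D_{\mathbb {Z},\sigma }\) is approximated by rounding a continuous Gaussian, and 64-bit integers limit how large n, q and \(\sigma ^*\) can be taken):

```python
import numpy as np

rng = np.random.default_rng()

def keygen(h, n, q, sigma):
    """Secret a <- U(Z_q^n); public S_{i,b} <- (approximate) D_{Z,sigma}^{n x n}."""
    a = rng.integers(0, q, size=n, dtype=np.int64)
    S = {(i, b): np.rint(rng.normal(0, sigma, size=(n, n))).astype(np.int64)
         for i in range(1, h + 1) for b in (0, 1)}
    return a, S

def f_a(a, S, x_prime, q, sigma_star):
    """f_a(x') = S_{1,x'_1} ... S_{h,x'_h} a + e_{x'} mod q, with a fresh error per query."""
    acc = a % q
    for i in range(len(x_prime), 0, -1):     # apply S_{h,.} first, S_{1,.} last
        acc = (S[(i, int(x_prime[i - 1]))] @ acc) % q
    e = np.rint(rng.normal(0, sigma_star, size=a.shape[0])).astype(np.int64)
    return (acc + e) % q
```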

The proof of the main theorem is as follows.

Proof

(Proof of Theorem 6.6). We show that the sequence of \(d=\mathop {{\text {poly}}}(\lambda )\) queries to \(O_r\) is indistinguishable from the corresponding queries to \(F_r\), as follows.

$$\begin{aligned} \left\{ O_r(\cdot )\right\}&=\left\{ \mathbf x\mapsto \varGamma (\varpi (\mathbf x)) \cdot \mathbf {S}(\varpi (\mathbf x)) + \mathsf {PRF}_\mathbf {M}(\varpi (\mathbf x))+\text {(GGH15 errors)}\right\} _{k \in [d]}\\&\approx _c \left\{ \mathbf x\mapsto \varGamma (\varpi (\mathbf x)) \cdot \mathbf {S}(\varpi (\mathbf x)) + U([0,2^\tau -1])+\text {(GGH15 errors)} \right\} _{k \in [d]}\\&\approx _s \left\{ \mathbf x\mapsto \varGamma (\varpi (\mathbf x)) \cdot (\mathbf {S}(\varpi (\mathbf x)) + \mathbf e_{\varpi (\mathbf x)}) + U([0,2^\tau -1]) \right\} _{k \in [d]}\\&\approx _s \left\{ \mathbf x\mapsto \varGamma (\varpi (\mathbf x)) \cdot U(\mathbb {Z}_q) + U([0,2^\tau -1]) \right\} _{k \in [d]}\\&\approx _s \{ F_r(\cdot ) \} \end{aligned}$$

Here, noise flooding is applied to \(\varGamma (\varpi (\mathbf x)) \mathbf e_{\varpi (\mathbf x)} + \) (GGH15 errors). More precisely, to invoke Lemma 6.7 it should hold that \(2^{\tau }\ge h \cdot (\sqrt{n} \sigma )^h \cdot \omega (\mathop {{\text {poly}}}(\lambda ))\), and that \(2^\tau \ge (2wd) \cdot h \cdot (m\sqrt{m} \sigma \cdot wT)^{h} \cdot \omega (\mathop {{\text {poly}}}(\lambda )) \) so that the GGH15 errors can be neglected.

Remark 6.8

(weakening PRF requirements). We note that we only use the matrix PRF for noise flooding, and therefore it suffices to relax the pseudorandomness of \(F : \{0,1\}^\ell \rightarrow [0,2^\tau -1]\) to the following: for any efficiently computable B-bounded function \(g : \{0,1\}^\ell \rightarrow [-B,B]\) where \(B \ll 2^\tau \), we have

$$\{ \mathbf x\mapsto F(\mathbf x) \} \approx _c \{ \mathbf x\mapsto F(\mathbf x) + g(\mathbf x)\}$$

where \(+\) is computed over \(\mathbb {Z}\). A similar relaxation has been considered in the context of weaker pseudorandom generators for building IO [5]. For this notion, one could potentially have candidates where each \(\mathbf {M}_{i,b}\) is drawn independently from a Gaussian distribution but where \(\mathbf u^\mathbf {M}_R\) is the same as in Sect. 6.1.
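
This relaxed notion is exactly a smudging (noise-flooding) requirement. For intuition, the statistical distance between \(U([0,2^\tau -1])\) and its shift by any fixed \(g \in [-B,B]\) is \(|g|/2^\tau \le B/2^\tau \), as the small helper below computes:

```python
from fractions import Fraction

def smudging_distance(tau, g):
    """Statistical distance between U([0, 2^tau - 1]) and the same distribution
    shifted by a fixed integer g with |g| <= 2^tau: the supports overlap except
    on |g| points on each side, so the distance is |g| / 2^tau."""
    return Fraction(abs(g), 2 ** tau)

# Example: a 2^20-bounded shift smudged by a uniform value of width 2^80
# leaves statistical distance 2^-60, i.e. negligible.
print(smudging_distance(tau=80, g=2 ** 20))   # 1/1152921504606846976
```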

6.3 Comparison

In this section we compare our model to the previous security model in [11].

First, we briefly review the security model in [11]. This model gives the adversary a stronger oracle, allowing it to query a polynomial (or circuit) rather than a single input \(\mathbf x\). More precisely, the adversary chooses a circuit C described by \( \left\{ \beta _{i,b}^{(k)} \right\} _{i \in [h], b \in \{0,1\},k\in K}\) and queries

$$\begin{aligned} T=\mathbf {A}_J \sum _{k\in K} \prod _{i=1}^h (\beta ^{(k)}_{i,0} \mathbf {D} _{i,0} + \beta ^{(k)}_{i,1} \mathbf {D} _{i,1}) \bmod q \end{aligned}$$

to a zero-testing oracle, and learns the value T only if it is sufficiently small compared to q. We index the zero-testing values obtained by the adversary by u, so \(T_u\) is the adversary’s u-th successful zero-testing value. The adversary’s goal is to find a non-trivial algebraic relation between the \(T_u\)’s and the pre-encodings \(\hat{\mathbf {S}}\). Despite the generality of the oracle inputs, the statistical zeroizing attacks in [19] do not fall into this class; an adversary mounting a statistical zeroizing attack instead checks whether an inequality holds.
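
For comparison with \(O_r\), here is a sketch of this stronger oracle's query (the dictionaries `D` and `betas` are hypothetical names; the oracle releases T only when it is small relative to q):

```python
import numpy as np

def zero_test_query(A_J, D, betas, K, h, q, bound):
    """Stronger oracle of [11]: the adversary submits coefficients
    betas[(k, i, b)] and learns
    T = A_J * sum_k prod_i (beta_{i,0}^{(k)} D_{i,0} + beta_{i,1}^{(k)} D_{i,1}) mod q
    only if T is sufficiently small; otherwise nothing is returned."""
    total = 0
    for k in K:
        acc = None
        for i in range(1, h + 1):
            term = (betas[(k, i, 0)] * D[(i, 0)] + betas[(k, i, 1)] * D[(i, 1)]) % q
            acc = term if acc is None else (acc @ term) % q
        total = (total + acc) % q
    T = (A_J @ total) % q
    centered = ((T + q // 2) % q) - q // 2     # centered representatives
    return centered if np.max(np.abs(centered)) < bound else None
```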

On the other hand, our model gives the adversary a much weaker, input-consistent oracle. Instead, the adversary’s goal is to find any information beyond the input-output behavior of the program; that is, we do not restrict the adversary to computing non-trivial algebraic relations. This freedom allows us to capture almost all existing attacks.

An interesting question is to design a model that embraces both models, and to construct an obfuscation procedure that is secure in it. A candidate is to allow the adversary access to both oracles described above. Note that [11, Lemma 8] states that the adversary’s successful zero-tests are essentially polynomially many linear combinations of input-consistent evaluations. With this lemma in mind, an obfuscation procedure satisfying the corresponding lemma as well as VBB security in the input-consistent evaluation model may achieve a meaningful notion of security in this combined model.