1 Introduction

Program obfuscation is the problem of modifying a computer program so as to hide sensitive details of its code without changing its input/output behavior. While this problem has been known for several years in computer science, only in the last 15 years have researchers considered provable program obfuscation; that is, program obfuscation where sensitive code details are proved to remain hidden under a widely accepted intractability assumption (such as those often used in applied cryptography). Early results in the area implied the likely impossibility of constructing a single program capable of obfuscating an arbitrary polynomial-time program into a virtual black box [3]. More recent results show the possibility of constructing obfuscators for restricted families of functions, such as point functions (and extensions of them), under more or less accepted hardness assumptions (see, e.g., [2, 5, 10, 12, 16, 19]). Point functions can be seen as functions that return 1 if the input value is equal to a secret value stored in the program, and 0 otherwise. Although conceptually simple, point functions are surprisingly applicable to real-life problems. As often suggested in the literature (see, e.g., [16, 19]), point function obfuscation might be applicable to the password verification function in very commonly used login/password authentication methods.

In this paper we carry out an exploration of this suggestion. Our main result is a practical method for cryptographic password obfuscation under standard cryptographic assumptions, such as the existence of a one-way permutation or of a block cipher for which there are no theoretical attacks faster than exhaustive key search (and, specifically, without using a random oracle assumption or a heuristic construction for a multilinear map). Known practical methods include the well-known password hashing method (i.e., at registration, the server stores the cryptographic hash of the user’s password; at authentication, the server checks that the hash of the provided password is equal to the stored hash). This method was analyzed in a cryptographic program obfuscation context by [16], but is only proved secure under the random oracle assumption. (Note that this assumption, although often accepted by practitioners, has been declared as almost certainly false in its generality [9], and is especially troublesome in light of the older and more recent attacks on widely considered or used hash functions such as MD4, MD5 and SHA1.) By now, there are several known obfuscators for point functions that do not make the random oracle assumption, but they all assume secrets much longer than typical passwords. The only exception, and the closest result in the literature to ours, is an elegant construction of a perfectly one-way function from [10], which could be used to construct a point function obfuscator under the assumption of claw-free permutations. We note that this construction is not practical, as it is estimated to be 4–5 orders of magnitude less efficient than the one we propose.

As all point function obfuscators in the literature use secrets of length about equal to the factoring-type security parameter (e.g., 2048), to extend applicability to commonly used passwords and passphrases, as well as to secrets of arbitrary lengths, we have designed two new methods: (a) a new hash function that transforms these point function obfuscators in the literature so that they can work with arbitrarily longer secrets; (b) a new (multi-bit-output) point function obfuscator, which can work with secrets as short as the symmetric cryptography security parameter (e.g., 128). Our obfuscators satisfy a computational notion of functional correctness (i.e., no efficient adversary can find an input on which the obfuscated program differs from the original program), and a rather strong notion of obfuscation security (i.e., the obfuscated program is efficiently simulatable). An underlying technical contribution is the construction of an efficient second-preimage-resistant extractor that is simultaneously a second-preimage-resistant hash function and a pairwise almost-independent hash function, and has efficient instantiations from a single efficient cryptographic primitive. Our efficiency claims on this extractor and its resulting obfuscators are substantiated by implementations and performance results on commodity computing resources. Finally, we demonstrate that program obfuscators for point functions are usable in the following real-life applications: password verification (obfuscating a server’s algorithm verifying if a client’s input string is equal to the client’s previously registered password); passphrase verification (as for password verification, but with the variant that the client has registered a passphrase containing only structured text); and password manager (obfuscating a server’s verification and retrieval algorithms that verify the client’s master password or passphrase, and retrieve a client’s previously registered password for a specific server).

In particular, we have modified the code of an open-source password manager (Pass, based on gpg2) to accommodate our (multi-bit-output) point function obfuscator instead of its current cryptographic solution (whose obfuscation properties can at best be proved using a random oracle assumption). The overall resulting runtime of a specific password retrieval on the modified application is less than \(3\%\) slower than the same operation on unmodified Pass. Solving these password and passphrase obfuscation problems without using a random oracle assumption is a natural problem that has remained unsolved for decades. Details of our real-life application results are discussed in Appendix A.

2 Definitions and Preliminaries

In this section we first recall basic definitions and slightly modify the theory-oriented definition of program obfuscators into a practice-oriented definition that better fits a large class of obfuscator implementations (including ours for point functions) and where the correctness property only holds in a computational sense (i.e., even against a possibly malicious polynomial-time adversary). Finally, we discuss security notions for point function obfuscators.

Security Parameters. In our constructions and concrete security analysis, we will use two types of security parameters, described below:

  1. the ‘factoring-based’ security parameter, a global parameter \(\lambda _f\), currently set to 2048, that is typically used to determine key lengths in asymmetric cryptographic primitives (e.g., public-key encryption) proved secure under the hardness of number theoretic problems related to factoring and/or discrete logarithm problem; and

  2. the ‘symmetric-cryptography’ security parameter, a global parameter \(\lambda _s\), currently set to 128, that is most typically used to determine key/output lengths in several symmetric cryptographic primitives (e.g., block ciphers).

Point Functions. We consider families of functions as families of maps from a domain to a range, where maps are parameterized by some values chosen according to some distribution on a parameter set. Let pF be a family of functions \(f_{par}:Dom\rightarrow \{0,1\}\), where \(Dom=\{0,1\}^n\), and each function is parameterized by value par from a parameter set \(Par=\{0,1\}^n\), for some length parameter n. We say that pF is the family of point functions if on input \(x\in Dom\), and secret value \(y\in Par\), the point function \(f_{y}\) returns 1 if \(x=y\) and 0 otherwise. When we assume an efficiently samplable distribution of secret values \(y\in Par\), we define the (rounded) min entropy parameter of pF as the largest integer t such that each \(y\in Par\) is sampled with probability \(\le 2^{-t}\). The family mbpF of multi-bit-output point functions is defined as follows: on input \(x\in Dom\), secret value \(y\in Par\), and output value z, the function \(f_{y}\) returns z if \(x=y\) and 0 otherwise.
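The definitions above can be made concrete with a minimal Python sketch. The function names (`point_function`, `multibit_point_function`, `rounded_min_entropy`) and the dictionary representation of the secret distribution are illustrative assumptions, not notation from this paper.

```python
import math
from typing import Callable, Dict, Union


def point_function(y: bytes) -> Callable[[bytes], int]:
    """Return f_y, which outputs 1 if the input equals the secret y and 0 otherwise."""
    def f(x: bytes) -> int:
        return 1 if x == y else 0
    return f


def multibit_point_function(y: bytes, z: bytes) -> Callable[[bytes], Union[bytes, int]]:
    """Return the multi-bit-output variant: outputs z on input y and 0 otherwise."""
    def f(x: bytes) -> Union[bytes, int]:
        return z if x == y else 0
    return f


def rounded_min_entropy(dist: Dict[bytes, float]) -> int:
    """Largest integer t such that every secret is sampled with probability <= 2^-t.

    `dist` maps each secret to its probability (hypothetical representation).
    """
    p_max = max(dist.values())
    return math.floor(-math.log2(p_max))
```

For instance, a distribution placing probability 1/2 on one secret has rounded min-entropy 1 regardless of how the remaining probability is spread.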

Program Obfuscators: Formal Definitions. To define a cryptographic program obfuscator for a class of functions F, we consider a pair of efficient algorithms with the following syntax. On input function parameters fpar, including a description desc(f) of function \(f\in F\), the obfuscation generator genO returns generator output gpar. On input a description desc(f) of function \(f\in F\), generator output gpar, and evaluator input x, the obfuscation evaluator evalO returns evaluator output y.

Informally, we would like to define an obfuscator for the class pF of point functions as any such pair of algorithms satisfying some functionality correctness property (i.e., the obfuscated program computes the same function as the original program), some efficiency property (i.e., the obfuscated program is not much slower than the original program), and some obfuscation security property (i.e., the obfuscated program hides any sensitive information about the original program which is not computable by program evaluation). Here, we actually consider a slightly relaxed notion of the functionality correctness property, according to which the obfuscated program can return an output different from the original program for some of the inputs; however, these inputs are hard to find, even to an efficient algorithm that has access to the program’s secret value. Furthermore, we discuss some of the security notions in the literature, and eventually formally define the strongest known notion (implicit in [3] and saying, informally speaking, that any efficient adversary’s view of the obfuscated program can be efficiently simulated and thus the adversary learns nothing more than an upper bound on the program size), specialized to the class of point functions pF with secret distributions having high min-entropy. We now proceed more formally.

We say that the pair (genO, evalO) is a cryptographic program obfuscator for the class pF of point functions if it satisfies the following:

  1. (Computational correctness): For any \(f_s\) in pF, with function parameters \(fpar=(s, desc(pF))\), and any efficient algorithm A, the event \(f_s(x')\ne y\) holds with probability at most \(\delta \), for some negligible (or very small) \(\delta \), where \(x',y\) are generated by the following probabilistic steps:

    • \(\,\,gpar\leftarrow genO(fpar)\),

    • \(\,\,x'\leftarrow A(gpar,fpar)\),

    • \(\,\,y\leftarrow evalO(gpar,x')\).

  2. (Polynomial Blowup Efficiency): There exists a polynomial p such that for all \(f_s\) in pF, the running time of \(evalO(gpar, \cdot )\) is \(\le p(|f_s|)\), where \(|f_s|\) denotes the size of the (smallest) boolean circuit computing \(f_s\).

  3. (Adversary view simulation security): Given any high min-entropy distribution D returning an n-bit secret, there exists a polynomial-time algorithm \(\text{ Sim }\), with black-box access to \(f_s\), such that for all \(f_s\) in the class pF of point functions with \(|s|=n\) and parameters fpar, the distributions \(D_{view}\) and \(D_{sim}\) are computationally indistinguishable, where

    • \(D_{view}=\{s\leftarrow D;gout\leftarrow genO(s,desc(pF)): gout\}\),

    • \(D_{sim}=\{s\leftarrow D; gout\leftarrow \text{ Sim }(1^{|s|},desc(pF)): gout\}\).

Other security notions considered in the literature include adversary output black-box simulation (where the simulator has also access to a black-box computing the program [3] and targets simulating the adversary’s output bit), real-vs-random indistinguishability (where no efficient adversary can distinguish the obfuscation of the function f from an obfuscation of a random function in the class F) [5], and indistinguishability obfuscation (where no efficient adversary can distinguish the obfuscation of any two circuits computing the same function f) [3]. We note that an obfuscator satisfying the adversary view black-box simulation security notion also satisfies these latter 3 security notions.

Known Point Function Obfuscators. The obfuscator in [16] for the family of point functions satisfies adversary view black-box simulation under the random oracle assumption. This obfuscator essentially consists of computing a cryptographic hash of the secret, similarly to what is typically done for passwords in real-life systems. A previous result of [7], although formulated as an oracle hashing scheme, can be restated as an obfuscator satisfying a strong variant of real-vs-random indistinguishability under the Decisional Diffie-Hellman assumption. The obfuscator in [19] satisfies adversary output black-box simulation under the existence of a strong type of one-way permutations. Moreover, one of the obfuscators in [5], based on any deterministic encryption scheme, satisfies real-vs-random indistinguishability, and has several instantiations. This follows as deterministic encryption schemes can be built using the hardness of the learning with rounding problem [20] or the existence of lossy trapdoor functions [6], and the latter have been built using any one of many group-theoretic assumptions (see, e.g., [13]). Some of the resulting obfuscators have efficient implementations [12]. Finally, an obfuscator was given in [2] using the hardness of the learning with errors problem.
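As a point of comparison, the hash-based approach described above can be sketched as follows. This is only an illustration of the password hashing idea: SHA-256 stands in for the random oracle, and the per-secret salt is an assumption borrowed from common practice rather than a detail taken from [16]. The names `gen_o_hash` and `eval_o_hash` are hypothetical.

```python
import hashlib
import hmac
import os
from typing import Tuple


def gen_o_hash(secret: bytes) -> Tuple[bytes, bytes]:
    """Obfuscation generator: store a salted hash of the secret (assumed layout)."""
    salt = os.urandom(16)  # assumption: a 16-byte salt, not specified in the text
    digest = hashlib.sha256(salt + secret).digest()
    return salt, digest


def eval_o_hash(gpar: Tuple[bytes, bytes], x: bytes) -> int:
    """Obfuscation evaluator: return 1 if x hashes to the stored digest, else 0."""
    salt, digest = gpar
    candidate = hashlib.sha256(salt + x).digest()
    # compare_digest avoids leaking the position of the first mismatching byte
    return 1 if hmac.compare_digest(candidate, digest) else 0
```

Any security argument for a scheme of this shape treats the hash as a random oracle, which is the assumption the constructions in this paper aim to avoid.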

All results mentioned so far either make the random oracle assumption or work for secret distributions not significantly different than uniform. The only obfuscator working for arbitrary secret distributions of high min-entropy can be obtained using a result from [10] on perfectly one-way functions, constructed assuming the existence of claw-free permutations. This result is far from having an efficient implementation.

Our goal in the rest of this paper is to show an obfuscator for point functions that works for arbitrary secret distributions of high min-entropy, without making the random oracle assumption, and resulting in an efficient implementation.

Families of One-Way \(\alpha \)-Permutations. The term efficient is used for running time in both a practical and theoretical sense, as needed. We say that a family of functions \(\{F\}\) is efficiently samplable if there exists an efficient algorithm randomly choosing a function F from the family, and is efficiently computable if there exists an efficient algorithm that evaluates any function F from the family. We say that a family of functions \(\{F\}\) is a family of \(\alpha \)-permutations if the probability that, for a randomly chosen x, F(x) has \(>1\) preimages, is \(<\alpha \).

Families of Pairwise-Independent Hash Functions. We say that a family of hash functions \(\{H_{m,n}\}\), where \(H_{m,n}:\{0,1\}^m\rightarrow \{0,1\}^n\), is pairwise \(\delta \)-independent if for any \(x_0\ne x_1\in \{0,1\}^m\), and any \(y_0,y_1\in \{0,1\}^n\), it holds that \(\mathrm{Prob}[\,H(x_0)=y_0\,\wedge \,H(x_1)=y_1\,]\le \delta +2^{-2n}\), where the probability is over the random choice of H from the family. We say that family \(\{H_{m,n}\}\) is pairwise independent if it is pairwise \(\delta \)-independent, for \(\delta =0\). Constructions for pairwise-independent hash functions include a random degree-one polynomial in a Galois field or a random degree-one polynomial modulo a prime [11], where by a random polynomial we denote a polynomial with coefficients randomly chosen in their domain set. All these constructions are efficiently samplable and efficiently computable.
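The degree-one polynomial modulo a prime mentioned above can be sketched in a few lines of Python. The sketch works over \(\mathbb{Z}_p\) rather than over bit strings of different lengths, the prime \(2^{61}-1\) is an arbitrary illustrative choice, and the helper `count_matching_keys` is a hypothetical name used only to check pairwise independence exhaustively for a tiny prime.

```python
import secrets
from typing import Tuple

P = 2**61 - 1  # a Mersenne prime, chosen here only for illustration


def sample_pi_hash(p: int = P) -> Tuple[int, int]:
    """Sample the coefficients (A, B) of a random degree-one polynomial over Z_p."""
    return secrets.randbelow(p), secrets.randbelow(p)


def eval_pi_hash(desc: Tuple[int, int], x: int, p: int = P) -> int:
    """Evaluate h_{A,B}(x) = A*x + B mod p."""
    a, b = desc
    return (a * x + b) % p


def count_matching_keys(p: int, x0: int, x1: int, y0: int, y1: int) -> int:
    """Count keys (A, B) with h(x0) = y0 and h(x1) = y1, by exhaustive search.

    For x0 != x1 exactly one key matches, so the joint probability is 1/p^2.
    """
    return sum(
        1
        for a in range(p)
        for b in range(p)
        if (a * x0 + b) % p == y0 and (a * x1 + b) % p == y1
    )
```

The exhaustive count is feasible only for toy primes; it is included to make the "exactly one key per pair of targets" argument visible.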

Families of Second-Preimage-Resistant Hash Functions. This cryptographic primitive was introduced in [17], under the name of universal one-way hash functions, and has also been called target-collision-resistant hash functions since [4], or second-preimage-resistant hash functions. We say that a family of functions \(\{h\,|\,h:\{0,1\}^a\rightarrow \{0,1\}^b\}\) is second-preimage-resistant over \(\{0,1\}^a\) if it satisfies the following three properties: (1) h is efficiently samplable from its family; (2) every h in the family is efficiently computable; and (3) no efficient adversary can win, except with very small probability, in the following game: first, the adversary picks an input z; then, a random function h is sampled from its family; finally, the adversary, given h, wins the game if it finds an input \(x\ne z\) such that \(h(x)=h(z)\). The first constructions for such families of functions were proposed in [17], based on families of one-way permutations with varying domain sizes and any family of pairwise-independent hash functions. Later, more practical constructions were proposed in [4, 18], based on collision-intractable hash functions. Generally speaking, second-preimage-resistant hash functions may or may not satisfy pairwise-independence properties.

Randomness Extractors. The statistical distance between two distributions \(D_1,D_2\) over the same space S is defined as \(sd(D_1,D_2)=\frac{1}{2}\,\Sigma _{x\in S}\,|\,\mathrm{Prob}[\,x\leftarrow D_1\,]-\mathrm{Prob}[\,x\leftarrow D_2\,]\,|\). We say that distributions \(D_1,D_2\) are \(\delta \)-close if it holds that \(sd(D_1,D_2)\le \delta \). We say that a distribution D is \(\delta \)-close to uniform, or, briefly, \(\delta \)-uniform, if it holds that \(sd(D,U)\le \delta \), where U denotes the uniform distribution over the same space S. The min-entropy of a distribution D is defined as \(H_{\infty }(D)=\min _x\{-\log _2(\mathrm{Prob}[\,x\leftarrow D\,])\}\). A function Ext\(:\{0,1\}^a\times \{0,1\}^b\rightarrow \{0,1\}^c\) is called a \((k,\epsilon )\)-extractor if for any distribution D on \(\{0,1\}^a\) with min-entropy at least k, the distribution N(D) is \(\epsilon \)-uniform, where \(N(D)=\{x\leftarrow D;e\leftarrow \{0,1\}^b;y=\) Ext\((x,e)\,:\,(e,y)\}\). The leftover hash lemma [14] says that if \(\{H_{m,n}\}\) is a family of pairwise-independent hash functions, value x is drawn according to a distribution D such that \(H_{\infty }(D)\ge k\), and \(n\le k-2\log (1/\epsilon )\), then the function Ext\((x,H_{m,n})\) defined as \(y=\) Ext\((x,H_{m,n})=H_{m,n}(x)\) is a \((k,\epsilon )\)-extractor. By inspection of the proof in [15], we see that it can be directly extended to families of pairwise \(\delta \)-independent hash functions, as follows.

Lemma 1

For any \(\delta >0\), if \(\{H_{m,n}\}\) is a family of pairwise \(\delta \)-independent hash functions, value x is drawn according to a distribution D such that \(H_{\infty }(D)\ge k\), and \(n\le k-2\log (1/\epsilon )\), for some \(\epsilon \le (1/2)\log (1/\delta )\), then the function Ext\((x,H_{m,n})\) defined as \(y=\) Ext\((x,H_{m,n})=H_{m,n}(x)\) is a \((k,\epsilon )\)-extractor.

We say that a function Ext\(:\{0,1\}^a\times \{0,1\}^b\rightarrow \{0,1\}^c\) is a second-preimage-resistant \((k,\epsilon )\)-extractor if it is both a second-preimage-resistant hash function over \(\{0,1\}^a\) and a \((k,\epsilon )\)-extractor.
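The quantities used in this subsection are easy to compute for small explicit distributions. The following Python sketch is illustrative only: distributions are represented as dictionaries from outcomes to probabilities, and the helper `max_extractable_bits` simply evaluates the length bound \(n\le k-2\log (1/\epsilon )\) stated in Lemma 1. All function names are assumptions made for this example.

```python
import math
from typing import Dict, Hashable


def statistical_distance(d1: Dict[Hashable, float], d2: Dict[Hashable, float]) -> float:
    """sd(D1, D2) = 1/2 * sum over x of |Pr[x <- D1] - Pr[x <- D2]|."""
    support = set(d1) | set(d2)
    return 0.5 * sum(abs(d1.get(x, 0.0) - d2.get(x, 0.0)) for x in support)


def distance_to_uniform(d: Dict[Hashable, float], space_size: int) -> float:
    """sd(D, U) where U is uniform over a space of `space_size` points containing D's support."""
    u = 1.0 / space_size
    inside = sum(abs(p - u) for p in d.values())
    outside = (space_size - len(d)) * u  # points that D never outputs
    return 0.5 * (inside + outside)


def min_entropy(d: Dict[Hashable, float]) -> float:
    """H_inf(D) = min over x of -log2 Pr[x <- D]."""
    return -math.log2(max(d.values()))


def max_extractable_bits(k: float, eps: float) -> int:
    """Largest output length n with n <= k - 2*log2(1/eps)."""
    return math.floor(k - 2 * math.log2(1 / eps))
```

As a worked instance of the bound, a source with min-entropy 256 and target closeness \(2^{-64}\) allows up to \(256-2\cdot 64=128\) output bits.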

3 An Efficient Second-Preimage-Resistant Extractor

In this section we construct an efficient second-preimage-resistant extractor, or actually a family of hash functions which satisfies the following desirable combination of functionality, efficiency and security properties:

  1. it achieves arbitrarily large compression, in that it maps an arbitrarily-long input string to a fixed-length output string;

  2. it is an almost pairwise-independent hash function;

  3. it is a one-way function with second-preimage resistance;

  4. in addition to elementary operations, it only uses, as a black-box, a hash function satisfying above properties 2 and 3, and achieving small and fixed compression (specifically: it maps a fixed-length input string to a fixed-length output string, where the difference between the input string’s length and the output string’s length can be any small constant \(\ge 1\)).

Properties 1 and 4 (resp., 2 and 3) are used to satisfy functionality correctness and efficiency (resp., security) requirements. The closest constructions to ours from the literature only satisfy 3 out of 4 of the listed properties, as follows: two constructions in [17] missed properties 1 or 4, and a construction from [4, 18] missed property 2.

Formally, we achieve the following

Theorem 1

Let \(t_{F,sample}\) (resp., \(t_{F,eval}\)) denote the running time to sample (resp., evaluate) a function F. Let \(\{aP\,|\,aP:\{0,1\}^a\rightarrow \{0,1\}^a\}\) be a family of one-way \(\alpha \)-permutations, and let \(\{piH\,|\,piH:\{0,1\}^a\rightarrow \{0,1\}^b\}\) be a family of pairwise \(\delta \)-independent hash functions. There exists (constructively) a family \(\{sprH\,|\,sprH:\{0,1\}^{\ell (a-b)}\rightarrow \{0,1\}^b\}\) of second-preimage-resistant \((k,\epsilon )\)-extractors such that

  • \(b\le k-2\log (1/\epsilon )\) and \(\epsilon \le (1/2)\log (1/\delta ')\), for \(\delta '=\ell (\delta +\alpha )\),

  • \(t_{sprH,sample}=O(\ell (t_{piH,sample})+t_{aP,sample})\), and

  • \(t_{sprH,eval}=O(\ell (t_{aP,eval}+t_{piH,eval}+t_{piH,sample})+t_{aP,sample})\).

The function sprH obtained in the proof of Theorem 1 will be applied to obtain the following two important new results: (1) in Sect. 4, it will be used in combination with the obfuscators from [5, 6, 7, 13], and [19], to design efficient obfuscators for point functions with secret length higher than the factoring-type security parameter (e.g., 2048); (2) in Sect. 5, it will be used to design an efficient obfuscator for multi-bit-output point functions with secret length greater than or equal to the symmetric-cryptography security parameter (e.g., 128). The rest of this section is devoted to proving Theorem 1.

Informal Description of Function sprH: Our goal is to define a family of functions, denoted as sprH, that satisfies the above properties 1–4. One higher-level view of our construction looks similar to the linear hash construction in [4, 18], and its lower-level component looks similar to a function from [17]. However, some technical differences with these papers actually allow us to achieve all 4 desired properties; most importantly:

  1. sprH processes an arbitrarily long input by repeatedly applying an inner function with the same domain and codomain sizes (instead, in [17] domain and codomain sizes vary). This approach is important to achieve properties 1 and 4.

  2. in sprH the inner function used at each iteration is both a second-preimage-resistant function and a pairwise almost-independent hash function (as opposed to only a collision-intractable hash function, as in [4, 18]). This approach is important to achieve properties 2 and 3.

Formal Description: Let desc(F) denote a conventional encoding of function F, and let a, b denote positive integers such that \(a> b\) and \(a-b\ge 1\) is a small constant. The construction for sprH uses a pairwise \(\delta \)-independent hash function \(piH:\{0,1\}^a\rightarrow \{0,1\}^b\), and a one-way \(\alpha \)-permutation \(aP:\{0,1\}^a\rightarrow \{0,1\}^a\), which is applied to the a-bit string \(u_{i-1}|x_i\). We define function \(sprH:\{0,1\}^*\rightarrow \{0,1\}^b\), as follows.

Input to sprH: string \(x=x_1|\cdots |x_\ell \), where \(x_i\in \{0,1\}^{a-b}\), for \(i=1,\ldots ,\ell \).

Instructions for sprH:

  1. Set \(u_0=0^{b}\)

  2. Randomly sample a one-way \(\alpha \)-permutation aP

  3. For \(i=1,\ldots ,\ell \):

    • randomly sample a pairwise \(\delta \)-independent hash function \(piH_i\),

    • compute \(v_i=aP(u_{i-1}|x_i)\) and \(u_i=piH_i(v_i)\)

  4. Return: \((u_\ell ,\) desc(aP), desc\((piH_1),\ldots ,\) desc\((piH_\ell ))\).
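The iteration above can be sketched in Python as follows. This is a structural illustration under loudly stated assumptions, not the implementation evaluated in this paper: a salted, truncated SHA-256 call stands in for aP (it is not claimed to be a one-way \(\alpha \)-permutation), each \(piH_i\) is a random degree-one polynomial modulo the Mersenne prime \(2^{521}-1\) reduced to b bits (only approximately pairwise independent), and the toy sizes a = 136 and b = 128 are arbitrary. Sampling and evaluation are split into two functions so that the same sampled description can be reused.

```python
import hashlib
import secrets
from typing import List, Tuple

A_BITS = 136                   # input length a of the inner function (toy choice)
B_BITS = 128                   # output length b (toy choice)
BLOCK_BITS = A_BITS - B_BITS   # each block x_i has a - b = 8 bits
P = 2**521 - 1                 # Mersenne prime larger than 2^a, used by the piH stand-in

SprhDesc = Tuple[bytes, List[Tuple[int, int]]]


def ap_standin(salt: bytes, v: int) -> int:
    """Stand-in for aP on a-bit strings. Illustration only, NOT a proven alpha-permutation."""
    data = salt + v.to_bytes(A_BITS // 8, "big")
    return int.from_bytes(hashlib.sha256(data).digest()[: A_BITS // 8], "big")


def eval_pih(desc: Tuple[int, int], v: int) -> int:
    """Stand-in for piH_i: (A*v + B mod P) reduced to b bits."""
    a, b = desc
    return ((a * v + b) % P) % (1 << B_BITS)


def sample_sprh(ell: int) -> SprhDesc:
    """Sample desc(aP) (here a salt) and desc(piH_1), ..., desc(piH_ell)."""
    salt = secrets.token_bytes(16)
    pihs = [(secrets.randbelow(P), secrets.randbelow(P)) for _ in range(ell)]
    return salt, pihs


def eval_sprh(desc: SprhDesc, blocks: List[int]) -> int:
    """Compute u_ell from blocks x_1, ..., x_ell, each an integer of a - b bits."""
    salt, pihs = desc
    if len(blocks) != len(pihs):
        raise ValueError("need one piH description per input block")
    u = 0  # u_0 = 0^b
    for x_i, pih in zip(blocks, pihs):
        v = ap_standin(salt, (u << BLOCK_BITS) | x_i)  # v_i = aP(u_{i-1} | x_i)
        u = eval_pih(pih, v)                           # u_i = piH_i(v_i)
    return u
```

With 8-bit blocks, a byte string can be fed directly as `list(data)`, for example `eval_sprh(sample_sprh(len(data)), list(data))`.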

The running times \(t_{sprH,sample}\) and \(t_{sprH,eval}\) claimed in Theorem 1 are verified by algorithm inspection, observing that sprH can compress arbitrarily long inputs into b-bit outputs, and that it invokes \(\ell \) times a single function \(piH(aP(\cdot ))\) compressing a-bit inputs to b-bit outputs. In what follows, we show that sprH is a second-preimage-resistant \((k,\epsilon )\)-extractor with the parameters in Theorem 1, by showing that it is both a second-preimage-resistant hash function over \(\{0,1\}^a\) and a pairwise \(\delta '\)-independent hash function.

The proof that sprH is a second-preimage-resistant hash function directly follows by applying results in [4, 17, 18], as follows. First, we observe that the function obtained by cascading a one-way \(\alpha \)-permutation aP with a pairwise \(\delta \)-independent hash function piH, is a second-preimage-resistant hash function. This follows directly by Lemma 2.2 in [17], which proves the exact same result when \(\alpha =0,\delta =0\) and \(a-b=1\). We observe that no technical difficulty is encountered in extending this proof to values of \(\alpha ,\delta \) that are negligible or very small and a value of \(a-b\) that is a small constant (or even logarithmic in the security parameter). Because of this observation, we note that sprH can be considered as the linear hash iterated application of a second-preimage-resistant hash function, as in the linear hash construction from [4, 18]. In particular, we can apply Theorem 5.3 from [4] which proves our desired statement; i.e., the linear hash construction transforms a second-preimage-resistant hash function from a-bit strings to b-bit strings into a second-preimage-resistant hash function from arbitrary-length strings to b-bit strings.

The proof that sprH is a pairwise \(\delta '\)-independent hash function is obtained by induction over \(\ell \). The base case directly follows by observing that the assumptions that function \(piH_1\) is pairwise \(\delta \)-independent and that function aP is an \(\alpha \)-permutation imply that the composed function \(piH_1(aP(\cdot ))\) is a pairwise \(\delta '\)-independent hash function, for \(\delta '=\alpha +\delta \). The inductive case follows by combining the induction hypothesis with the fact that at the \(\ell \)-th iteration, function sprH computes \(u_\ell \) using function \(piH_\ell (aP(\cdot ))\) for an independently chosen pairwise \(\delta \)-independent hash function \(piH_\ell \).

Implementation: Primitive Setting. Families of pairwise-independent hash functions \(piH_i\) can be implemented as in Sect. 2. Function aP can be instantiated in 3 ways:

  1. setting \(n=2048\), and using exponentiation modulo a prime; that is, \(aP_{g,p}(x)=g^x\,\mathrm{mod}\,p\), where publicly available parameters p, g are as follows: p is an \((n+1)\)-bit prime and g is a generator of \(\mathbb {Z}^*_p\);

  2. using a length-preserving collision-intractable hash function \(cih_k:\{0,1\}^n\rightarrow \{0,1\}^n\) for which no theoretical attacks (faster than birthday attacks) are known, and assuming such a function is a one-way \(\alpha \)-permutation, for a value \(\alpha \) negligible in n or very small; that is, \(aP_{cih,k}(x)=cih_k(x)\);

  3. as a block cipher \(bc:\{0,1\}^\kappa \times \{0,1\}^n\rightarrow \{0,1\}^n\) for which no theoretical attacks (faster than exhaustive search attacks) are known, to be run on a fixed, but randomly chosen, input block r, and assuming that the resulting function \(bc(\cdot ,r)\) is a one-way \(\alpha \)-permutation over the set of block cipher keys, for \(\alpha \) negligible in n or very small; that is, \(aP_{bc,r}(x)=bc(x,r)\).

In our implementation, we used the 3rd option for efficiency reasons, and based on the observation that function \(aP_{bc,r}(x)\), mapping the set of keys of the block cipher to the cipher’s output, is indeed expected to be a one-way \(\alpha \)-permutation. This observation is based on the fact that if function \(aP_{bc,r}(x)\) were not close to a one-way \(\alpha \)-permutation, a theoretical attack exhaustively searching for any of the colliding keys would be possible. Note that such an attack would be faster than exhaustive key-search, thus giving a theoretical break of the block cipher.
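For intuition about the first option listed above (exponentiation modulo a prime), the following Python sketch checks on toy parameters that \(x\mapsto g^x\,\mathrm{mod}\,p\) permutes the nonzero residues when g is a generator. The values p = 23 and g = 5 and the helper name `is_permutation_on_exponents` are illustrative assumptions. A real instantiation would use a 2049-bit prime, and nothing about one-wayness can be read off such a small example.

```python
def ap_modexp(x: int, g: int, p: int) -> int:
    """aP_{g,p}(x) = g^x mod p, computed with Python's built-in modular pow."""
    return pow(g, x, p)


def is_permutation_on_exponents(g: int, p: int) -> bool:
    """Check by exhaustive enumeration that x -> g^x mod p maps {1..p-1} onto {1..p-1}."""
    images = {ap_modexp(x, g, p) for x in range(1, p)}
    return images == set(range(1, p))


P_TOY, G_TOY = 23, 5  # toy prime and a generator of Z_23^*, for illustration only
```

With these toy values, `is_permutation_on_exponents(G_TOY, P_TOY)` holds, whereas the same check with g = 2 fails because 2 has order 11 modulo 23.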

4 Obfuscators for Point Functions with Larger Secrets

In this section we show how to obtain point function obfuscators where the obfuscated secret value can have length and min entropy parameters arbitrarily greater than the factoring-type security parameter, starting from a point function obfuscator where the obfuscated secret value has fixed length and min-entropy parameter, which we already know how to build. Formally, we obtain the following

Theorem 2

Let \(\ell _a,e_a,\ell _u,\epsilon \) be integers such that \(\ell _u+2\epsilon \le e_a\le \ell _a\) and \(\epsilon \ge \lambda _s\), let sprH be a second-preimage-resistant \((\ell _a,\epsilon )\)-extractor, and let (\(genO_u\),\(evalO_u\)) be a cryptographic program obfuscator for the family of point functions with \(\epsilon \)-uniformly distributed \(\ell _u\)-bit secret values. Then there exists (constructively) a cryptographic program obfuscator (\(genO_a\),\(evalO_a\)) for the family of point functions with respect to \(\ell _a\)-bit secrets drawn from any distribution of min-entropy \(e_a\).

An important consequence of Theorem 2 is that any one of the point function obfuscators in [5, 6, 7, 13], or [19] can be extended to obtain a point function obfuscator that works for secret values with arbitrarily larger length and drawn from arbitrary distributions of min entropy larger than the factoring-type security parameter.

Informal and Formal Descriptions: The basic idea of the transformation underlying Theorem 2 follows a ‘hash-and-obfuscate’ paradigm, analogously to the much studied ‘hash-and-sign’ paradigm used for the design of digital signature schemes for large messages. This paradigm goes through two steps: first, the input is hashed using a second-preimage-resistant extractor, which we will implement using the construction sprH from Sect. 3; then, the extractor’s output is processed through the obfuscator with fixed length parameter, which can be instantiated using any one of the schemes from the literature (e.g., [5, 6, 7, 10, 13, 20] or [19]). The resulting scheme satisfies computational functionality correctness, and the same adversary view simulation obfuscation notion as the used obfuscator for fixed-length secrets. We now proceed more formally. The construction for (genO\(_a\),evalO\(_a\)) uses the family of efficiently samplable functions \(sprH:\{0,1\}^*\rightarrow \{0,1\}^b\) from Sect. 3, which are simultaneously second-preimage-resistant hash functions and pairwise \(\delta '\)-independent extractors, and an obfuscator (genO\(_u\),evalO\(_u\)) for the family of point functions with length parameter \(\ell _u\) and secret values with almost uniform distribution.

Input to genO\(_a\): parameters \(1^{e_u},1^{\ell _u},\epsilon \), secret value \(s\in \{0,1\}^{\ell _a}\)

Instructions for genO\(_a\):

  1. Randomly sample function \(sprH:\{0,1\}^{\ell _a}\rightarrow \{0,1\}^{\ell _u}\)

  2. Compute \(v=sprH(s)\)

  3. Compute \(out_u=genO_u(v)\)

  4. Return: \(out_a=(desc(sprH),out_u)\).

Input to evalO\(_a\): input value \(x\in \{0,1\}^{\ell _a}\) and the output from \(genO_a\), containing the description desc(sprH) of function sprH and the output \(out_u\) from \(genO_u\).

Instructions for evalO\(_a\):

  1. Compute \(v'=sprH(x)\)

  2. Return: \(evalO_u(out_u,v')\).
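The two algorithms above compose an extractor with an inner obfuscator, and the composition itself can be sketched independently of how those components are instantiated. In the Python sketch below, `make_hash_and_obfuscate` is a hypothetical helper, and the stand-ins passed to it (a salted SHA-256 call in place of sprH, and a salted-hash check in place of (\(genO_u\),\(evalO_u\))) are placeholders chosen only so that the example runs; they do not have the security properties required by Theorem 2. The inner evaluator is passed \(out_u\) explicitly together with \(v'\).

```python
import hashlib
import hmac
import secrets
from typing import Callable, Tuple


def make_hash_and_obfuscate(
    sample_sprh: Callable[[], bytes],
    eval_sprh: Callable[[bytes, bytes], bytes],
    gen_o_u: Callable[[bytes], Tuple[bytes, bytes]],
    eval_o_u: Callable[[Tuple[bytes, bytes], bytes], int],
):
    """Build (genO_a, evalO_a) from an extractor and an obfuscator for short secrets."""

    def gen_o_a(s: bytes):
        desc = sample_sprh()        # step 1: sample sprH
        v = eval_sprh(desc, s)      # step 2: v = sprH(s)
        out_u = gen_o_u(v)          # step 3: out_u = genO_u(v)
        return desc, out_u          # step 4: out_a = (desc(sprH), out_u)

    def eval_o_a(out_a, x: bytes) -> int:
        desc, out_u = out_a
        v_prime = eval_sprh(desc, x)     # v' = sprH(x)
        return eval_o_u(out_u, v_prime)  # run the inner evaluator on v'

    return gen_o_a, eval_o_a


# Placeholder components (assumptions for this example only).
def _sample_sprh() -> bytes:
    return secrets.token_bytes(16)


def _eval_sprh(desc: bytes, x: bytes) -> bytes:
    return hashlib.sha256(desc + x).digest()


def _gen_o_u(v: bytes) -> Tuple[bytes, bytes]:
    salt = secrets.token_bytes(16)
    return salt, hashlib.sha256(salt + v).digest()


def _eval_o_u(out_u: Tuple[bytes, bytes], v: bytes) -> int:
    salt, tag = out_u
    return 1 if hmac.compare_digest(hashlib.sha256(salt + v).digest(), tag) else 0


gen_o_a, eval_o_a = make_hash_and_obfuscate(_sample_sprh, _eval_sprh, _gen_o_u, _eval_o_u)
```

Passing the components as parameters mirrors the statement of Theorem 2, where any suitable extractor and any obfuscator for short, almost uniform secrets can be plugged in.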

Proofs are omitted due to space restrictions.

5 Obfuscators for Multi-bit-output Point Functions With Shorter Secrets

In this section we describe an obfuscator, denoted as \((genO_{mb},evalO_{mb})\), for the family of multi-bit-output point functions, where secrets can have a shorter length parameter than in our previous implementations, which implies applicability to the obfuscation of passphrases and even passwords. More specifically, this obfuscator differs from analogous results in the literature and in previous sections, in the following properties:

  1. it works for a generalized type of point functions: multi-bit-output point functions, whose output can be a long string, instead of a bit;

  2. it works for a length parameter that can be arbitrarily chosen as \(\ge \) the symmetric-cryptography security parameter (e.g., 128), instead of the factoring-type security parameter (e.g., 2048);

  3. its obfuscation property can be based on the security of a symmetric cryptography primitive (i.e., a block cipher or a cryptographic hash function), instead of a number theory problem typically applied to construct an asymmetric cryptography primitive.

Formally, we achieve the following

Theorem 3

Let \(\ell _o,\ell _s,k,\epsilon \) be integers such that \(k\le \ell _s\). Also, let sprH be a second-preimage-resistant \((k,\epsilon )\)-extractor and let (KeyGen, Enc, Dec) be a secure symmetric encryption scheme. Then there exists (constructively) a cryptographic program obfuscator \((genO_{mb},evalO_{mb})\) for the family of multi-bit output point functions mbpF with \(\ell _o\)-bit outputs and \(\ell _s\)-bit secrets drawn from any distribution of min-entropy k.

We note that in the above theorem we are trading off some slightly, but not significantly, reduced confidence in the security assumptions (as indicated in item 3 of the above list), to achieve increased functionality power (as indicated in item 1 and especially item 2 of the above list). Indeed, the property in item 1 can be obtained without resorting to symmetric cryptography primitives (see, e.g., [8]), but this comes with decreased obfuscator efficiency. The (most interesting) property in item 2 was unknown and is the one that allows applications to passphrase and password obfuscation, as further detailed in Appendix A.

Formal Description: Let | denote string concatenation, and let sprH denote a second-preimage-resistant \((k,\epsilon )\)-extractor (such as the one constructed in Sect. 3). Also, let (KeyGen, Enc, Dec) be a symmetric encryption scheme with the following syntax: on input a unary string \(1^n\) denoting the symmetric encryption security parameter, KeyGen returns an n-bit random key; on input key and message m, encryption algorithm Enc returns ciphertext c; on input key and ciphertext c, decryption algorithm Dec returns message m. Our construction of \((genO_{mb},evalO_{mb})\) combines the extractor sprH with the encryption scheme (KeyGen, Enc, Dec), as follows.

Input to genO\(_{mb}\): security parameters \(1^n,1^{n_0},1^\epsilon \), entropy parameter k, secret value \(s\in \{0,1\}^{\ell _s}\), output value \(w\in \{0,1\}^{\ell _o}\)

Instructions for genO\(_{mb}\):

  1. uniformly and independently choose \(r\in \{0,1\}^{n_0}\)

  2. compute \(key=sprH(r|s)\), where \(key\in \{0,1\}^n\)

  3. compute \(v=\) Enc\((key,w|0^{n_0})\)

  4. set \(gpar=(r,v)\) and return: gpar.

Input to evalO\(_{mb}\): security parameters \(1^n,1^{n_0},1^\epsilon \), entropy parameter k, the pair (r, v) returned by \(genO_{mb}\), and input value \(x\in \{0,1\}^{\ell _s}\)

Instructions for evalO\(_{mb}\):

  1. compute \(key'=sprH(r|x)\), where \(key'\in \{0,1\}^n\)

  2. compute \((w'|w'')= \) Dec\((key',v)\)

  3. if \(w''=0^{n_0}\) return \(w'\) else return 0
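The generator and evaluator above can be sketched in Python as follows, under strong simplifying assumptions that are not part of this paper: a truncated SHA-256 call stands in for sprH, and a toy XOR stream cipher keyed through SHAKE-256 stands in for (KeyGen, Enc, Dec). Both stand-ins, the names `gen_o_mb` and `eval_o_mb`, and the choice n = \(n_0\) = 128 bits are illustrative only. The value `None` plays the role of the output 0.

```python
import hashlib
import secrets
from typing import Optional, Tuple

N_BYTES = 16    # key length n = 128 bits (toy choice)
N0_BYTES = 16   # length n_0 of r and of the zero padding, 128 bits (toy choice)


def sprh_standin(data: bytes) -> bytes:
    """Placeholder for sprH; the construction of Sect. 3 would be used instead."""
    return hashlib.sha256(data).digest()[:N_BYTES]


def toy_stream_xor(key: bytes, data: bytes) -> bytes:
    """Toy Enc and Dec: XOR with a SHAKE-256 keystream. Illustration only."""
    stream = hashlib.shake_256(key).digest(len(data))
    return bytes(d ^ k for d, k in zip(data, stream))


def gen_o_mb(s: bytes, w: bytes) -> Tuple[bytes, bytes]:
    r = secrets.token_bytes(N0_BYTES)              # step 1: random r
    key = sprh_standin(r + s)                      # step 2: key = sprH(r | s)
    v = toy_stream_xor(key, w + bytes(N0_BYTES))   # step 3: v = Enc(key, w | 0^{n_0})
    return r, v                                    # step 4: gpar = (r, v)


def eval_o_mb(gpar: Tuple[bytes, bytes], x: bytes) -> Optional[bytes]:
    r, v = gpar
    key = sprh_standin(r + x)                      # step 1: key' = sprH(r | x)
    plain = toy_stream_xor(key, v)                 # step 2: (w' | w'') = Dec(key', v)
    w1, w2 = plain[:-N0_BYTES], plain[-N0_BYTES:]
    return w1 if w2 == bytes(N0_BYTES) else None   # step 3: check the zero padding
```

A wrong input yields an unrelated key, so the trailing \(n_0\) bits decrypt to zeros only with probability about \(2^{-n_0}\); this is where the computational (rather than perfect) notion of correctness enters.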

Proofs and performance analysis are omitted due to space restrictions.

6 Conclusions

We showed for the first time how to efficiently obfuscate passwords, passphrases and password managers, without a random oracle assumption. Our obfuscator can work with passwords and passphrases of practical lengths. Even if we expect practitioners to continue using the simpler-to-implement construction based on cryptographic hashing of a password, our construction gives confidence that the impact of any future attacks on cryptographic hash functions can be significantly limited by a simple protocol design change.