1 Introduction

A watermarking scheme allows one to embed information into a program while preserving its functionality. Moreover, it should be difficult for an adversary to remove the embedded information without destroying the marked program. Watermarking schemes are widely employed in many applications, including ownership protection and traitor tracing.

The theoretical study of watermarking schemes was initiated by Barak et al. [BGI+01] and Hopper et al. [HMW07]. However, neither work provides a concrete construction. Constructing provably secure watermarking schemes is extremely difficult, and early works in this area [NSS99, YF11, Nis13] only consider restricted adversaries that are not allowed to change the format of the watermarked object.

The first watermarking scheme with provable security against arbitrary removal strategies was presented by Cohen et al. in [CHN+16]. Specifically, they construct a watermarkable pseudorandom function (PRF) from indistinguishability obfuscation. In subsequent works [BLW17, KW17, QWZ18, YAL+18, KW19, YAL+19], watermarkable PRFs are constructed from either indistinguishability obfuscation or standard (lattice) assumptions. However, there remains a significant gap in security between the schemes constructed from indistinguishability obfuscation and those from standard assumptions.

In [CHN+16], Cohen et al. also construct watermarking schemes for public key encryption (PKE) and signatures from their watermarkable PRFs. Subsequently, (stateful) watermarking schemes for PKE were constructed from any PKE scheme [BKS17]. Recently, in [GKM+19], Goyal et al. construct watermarking schemes for various public key cryptographic primitives with nearly all desired security properties from simple assumptions, such as the existence of one-way functions, standard lattice assumptions, etc. This is achieved by slightly relaxing the correctness of the watermarking scheme. More precisely, in their definition, a marked program is not required to approximately preserve the input/output behavior of the original program; instead, it is only required to preserve the “functionality” of the original program.Footnote 1 Unfortunately, such a relaxation is not applicable to watermarkable PRFs, whose functionality is exactly specified by their input/output behavior.

Watermarking PRFs. A watermarking scheme for a PRF family \(\mathsf {F}\) consists of two main algorithms, namely, the marking algorithm and the extraction algorithm. The marking algorithm takes as input the mark key, a message, and a PRF key k, and outputs a watermarked circuit, which evaluates \(\mathsf {F}_{k}(\cdot )\) correctly on almost all inputs. The extraction algorithm takes as input the extraction key and a circuit, and outputs either a message or a symbol \(\perp \), which indicates that the circuit is unmarked.

The main security property of a watermarking scheme is unremovability, which requires that given a marked circuit \({\mathtt {C}}^*\) for a random PRF key (namely, the challenge key), the adversary is not able to remove or modify the embedded messageFootnote 2 without altering the outputs of \({\mathtt {C}}^*\) on a significant fraction of inputs. An additional security property is unforgeability, which prevents anyone without the mark key from generating a new watermarked circuit. Besides, for watermarkable PRF, it is usually required to have pseudorandomness against the watermarking authority, i.e., the pseudorandomness holds against an adversary who possesses the mark key and the extraction key.

When defining security (either unremovability or unforgeability) for watermarking schemes, adversaries with different capabilities are considered. For example, if the adversary is allowed to access more than one marked circuit of the challenge key, the scheme is collusion resistant. Moreover, we say that the scheme has security with marking (oracle) queries if the security is defined against an adversary who can obtain marked circuits of its generated keys and we say that the scheme has security with public marking if the adversary can obtain the mark key. Besides, we say that the scheme has security with extraction (oracle) queries if the security is defined against an adversary who can obtain extraction results of its generated circuits and we say that the scheme has security with public extraction if the adversary can obtain the extraction key.

Prior works on watermarkable PRFs. Watermarkable PRFs were first constructed by Cohen et al. in [CHN+16]. Their scheme is constructed from indistinguishability obfuscation and has unremovability with public extraction. Later, in [YAL+19], Yang et al. improve Cohen et al.’s scheme to achieve collusion resistance. Both constructions rely on the full power of indistinguishability obfuscation, and it seems infeasible to instantiate them from standard assumptions.

Towards constructing watermarkable PRFs from standard assumptions, Boneh et al. [BLW17] propose a new approach that builds watermarkable PRFs from variants of constrained PRFs [BW13, KPTZ13, BGI14]. The schemes provided in [BLW17] still rely on the existence of indistinguishability obfuscation. Then, building on Boneh et al.’s framework, watermarkable PRFs from standard assumptions were developed. In [KW17], Kim and Wu present the first watermarkable PRF from standard assumptions; the scheme only achieves security with marking queries. Subsequently, in [QWZ18, KW19], watermarkable PRFs that have security with public marking and extraction queries are constructed. However, all of these constructions (from standard assumptions) fail to provide desirable security properties such as security with public extraction and collusion resistance.

The goal of this work is to narrow the gap in security between indistinguishability obfuscation based watermarkable PRFs and standard assumptions based ones. We note that security with extraction queries, which is a natural stepping stone towards security with public extraction, is already achieved by previous watermarkable PRFs from standard assumptions [QWZ18, KW19]. In contrast, no positive result on collusion resistant watermarkable PRF from standard assumptions is known. Therefore, our main objective is to design collusion resistant watermarkable PRF that can be instantiated from standard assumptions.

1.1 Our Results

In this work, we explore the possibility of building collusion resistant watermarkable PRFs from standard assumptions and show that:

Theorem 1.1

(Informal). Assuming the existence of a secure watermarkable PRF, there exist collusion resistant watermarkable PRFs. In particular, collusion resistant watermarkable PRFs exist assuming the worst-case hardness of appropriately parameterized GapSVP problems.

We prove Theorem 1.1 by presenting a generic transformation from watermarkable PRFs without collusion resistance to collusion resistant watermarkable PRFs. Our transformation approximately preserves the security of the original scheme. For example, if the original scheme has security with public marking, then so does the new scheme. Besides, via our transformation, the new scheme achieves very strong unforgeability even if the original scheme is not unforgeable. This is achieved by a novel technique that adds unforgeability to a large class of watermarkable PRFs, which may be of independent interest.

By applying our transformation to existing watermarkable PRFs from standard assumptions [KW17, QWZ18, KW19], we obtain lattice based collusion resistant watermarkable PRFs with various features. The results are summarized in Table 1.

Table 1. Security properties achieved by watermarkable PRFs from standard assumptions. The default setting for unremovability, which is achieved by all constructions, is unremovability with marking queries. We use unremovability with PM to denote unremovability with public marking and use unremovability with EO to denote unremovability with extraction (oracle) queries. The default setting for unforgeability is unforgeability with marking queries. We use unforgeability with EO to denote unforgeability with extraction (oracle) queries and use unforgeability with PE to denote unforgeability with public extraction. We refer the reader to Sect. 4.1 for a more detailed discussion on different levels of unremovability and unforgeability.

The key component of our transformation is a fingerprinting code with enhanced security, where the adversary can query an extraction oracle that outputs the decoding of its submitted word. Surprisingly, this natural security requirement has not been considered in previous works. In this work, we change this situation by defining and constructing fingerprinting code that has security with extraction queries. The new primitive is also potentially useful for copyright protection in practical applications.

One caveat is that our constructions of fingerprinting codes (and thus collusion resistant watermarking schemes) are only secure against an adversary that makes at most q queries to the extraction oracle, where q is an a priori bounded polynomial. Also, the message spaces of our fingerprinting codes and watermarkable PRFs are of polynomial size.Footnote 3 It is an interesting open problem to design fingerprinting codes and standard assumption based collusion resistant watermarkable PRFs without these restrictions.

1.2 Technical Overview

In this section, we provide an overview of our techniques. We first recall the current approach for constructing (single key secure) watermarkable PRFs from standard assumptions and identify the difficulty in achieving collusion resistance via this approach. Then we show our ideas for overcoming this difficulty.

The difficulty. Existing constructions of watermarkable PRFs from standard assumptions [KW17, QWZ18, KW19] are all built on (variants of) constrained PRFs, following the blueprint proposed by Boneh et al. in [BLW17]. A constrained PRF \(\mathsf {F}\) is a PRF family that allows one to derive a constrained key ck from a PRF key k, where \(\mathsf {F}_{ck}(\cdot )\) and \(\mathsf {F}_k(\cdot )\) evaluate identically on all inputs except at some “punctured” pointsFootnote 4. Its security requires that given the constrained key ck, \(\mathsf {F}_k(x)\) is still pseudorandom if x is a punctured point. A constrained PRF is constraint-hiding if the constrained key does not reveal the punctured points. A (constraint-hiding) constrained PRF is collusion resistant if its security holds even if the adversary can obtain multiple constrained keys derived from a PRF key. Next, we briefly review how to watermark a constrained PRF family \(\mathsf {F}\).
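As a concrete illustration, a puncturable PRF in the style of the classical GGM tree construction can be sketched as follows. This is a toy sketch, not one of the lattice-based constructions used in the cited works: SHA-256 stands in for the length-doubling PRG, and where the paper's notion has \(\mathsf {F}_{ck}(x^*)\not =\mathsf {F}_k(x^*)\), this sketch simply returns None at the punctured point.

```python
# Toy GGM-style puncturable PRF: SHA-256 stands in for a length-doubling PRG.
import hashlib

def prg(seed: bytes, bit: int) -> bytes:
    # One half of the PRG output G(seed) = (G0(seed), G1(seed)).
    return hashlib.sha256(seed + bytes([bit])).digest()

def eval_prf(key: bytes, x: str) -> bytes:
    """Evaluate F_k(x) by walking the GGM tree along the bits of x."""
    s = key
    for b in x:
        s = prg(s, int(b))
    return s

def puncture(key: bytes, x_star: str):
    """Constrained key: the sibling seed at each level of the path to x_star."""
    ck, s = [], key
    for i, b in enumerate(x_star):
        ck.append((x_star[:i] + str(1 - int(b)), prg(s, 1 - int(b))))
        s = prg(s, int(b))
    return ck

def eval_punctured(ck, x: str):
    """Evaluate F_ck(x); agrees with F_k everywhere except at x_star."""
    for prefix, seed in ck:
        if x.startswith(prefix):
            s = seed
            for b in x[len(prefix):]:
                s = prg(s, int(b))
            return s
    return None  # x is the punctured point
```

Given ck, the value at the punctured point stays pseudorandom because no seed on its root-to-leaf path is revealed.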

To watermark a PRF key k of \(\mathsf {F}\), the marking algorithm first generates an input \(x^*\) and produces a constrained key ck that is punctured on \(x^*\) (i.e., \(\mathsf {F}_{ck}(x^*)\not =\mathsf {F}_{k}(x^*)\) and \(\mathsf {F}_{ck}(x)=\mathsf {F}_{k}(x)\) for all \(x\not =x^*\)). The marked version of k is just a circuit that evaluates \(\mathsf {F}_{ck}(\cdot )\). To test if a circuit \({\mathtt {C}}\) is marked, the extraction algorithm recovers \(x^*\) by using the extraction key and checks if \({\mathtt {C}}\) is a constrained key punctured on \(x^*\). This is accomplished via either checking if \({\mathtt {C}}(x^*)\) is in a specific set [KW17] or checking if \({\mathtt {C}}(x^*)\not =\mathsf {F}_{k}(x^*)\) [QWZ18, KW19]. The variants of constrained PRF used in these works support such checks. Security of the watermarking schemes relies on the fact that the punctured point \(x^*\) (or the output \(\mathsf {F}_{k}(x^*)\)) is hidden from the adversary. Based on this, to embed a message \(msg \in \{0,1\}^{l}\) (instead of a mark) into a PRF key, the marking algorithm will encode the message into the punctured points. One simple method is to generate 2l inputs \((x_{1,0}^*, x_{1,1}^*, \ldots , x_{l,0}^*, x_{l,1}^*)\) and puncture the PRF key on \(\{x_{i,msg[i]}^*\}_{i\in [l]}\). Then, the extraction algorithm can recover the i-th bit of the embedded message via checking if the circuit is punctured on \(x_{i,0}^*\) or if it is punctured on \(x_{i,1}^*\).
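The point-encoding idea above can be simulated in a toy model where a random truth table stands in for the PRF and the constrained key (all names, the tiny domain, and the fixed points \(x^*_{i,b}\) are illustrative). The extraction check follows the \({\mathtt {C}}(x^*)\not =\mathsf {F}_{k}(x^*)\) style of [QWZ18, KW19].

```python
# Toy simulation of message embedding via punctured points.
import secrets

DOMAIN = range(64)

def keygen():
    # A random truth table stands in for the PRF key k.
    return {x: secrets.randbits(32) for x in DOMAIN}

def mark(k, msg, points):
    """points[i] = (x*_{i,0}, x*_{i,1}); flip the output at x*_{i,msg[i]}."""
    c = dict(k)
    for i, bit in enumerate(msg):
        x = points[i][bit]
        c[x] = k[x] ^ 1  # any value != F_k(x) models the puncturing
    return c

def extract(k, circuit, points):
    """Recover bit i by testing which of the two points looks punctured."""
    msg = []
    for (x0, x1) in points:
        if circuit[x0] != k[x0]:
            msg.append(0)
        elif circuit[x1] != k[x1]:
            msg.append(1)
        else:
            msg.append(None)  # '?': neither point looks punctured
    return msg
```

Running `extract` on an unmarked copy of the key yields only `None` entries, matching the unmarked symbol \(\perp \).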

The main obstacle to achieving collusion resistance via the above approach is that the underlying (variants of) constrained PRFs are not collusion resistant. Specifically, for the instantiations provided in [KW17, KW19], one can recover the PRF key k, and thus compromise the security of the watermarking scheme, given two different constrained keys derived from k. The scheme constructed in [QWZ18] can be instantiated from any constraint-hiding constrained PRF for general constraints. However, to the best of our knowledge, no constraint-hiding constrained PRF from standard assumptions [BKM17, CC17, BTVW17, PS18, CVW18, AMN+18, DKN+20] is known to achieve collusion resistance. Moreover, as proved in [CC17], a collusion resistant constraint-hiding constrained PRF for general constraints implies indistinguishability obfuscation.

Our solution. To get around this obstacle, our key idea is to encode bits of a message into different “keys” instead of encoding them into different inputs. Next, we first illustrate how the idea works with a failed attempt, then we show how to correct it. We also discuss some barriers to achieving other desirable security properties and explain how to solve them.

The first attempt. The watermarking object of our initial attempt is a new PRF family \(\tilde{\mathsf {F}}\) that is a “repetition” of l constrained PRFs, i.e., \(\tilde{\mathsf {F}}_{k_1,\ldots ,k_l}(x)=(\mathsf {F}_{k_1}(x), \ldots , \mathsf {F}_{k_l}(x))\). To embed a message \(msg \in \{0,1\}^{l}\) into a key \(\varvec{k}=(k_1,\ldots ,k_l)\) of \(\tilde{\mathsf {F}}\), the marking algorithm first generates 2l inputs \((x_{1,0}^*, x_{1,1}^*, \ldots , x_{l,0}^*, x_{l,1}^*)\), then it punctures \(k_i\) on \(x_{i,msg[i]}^*\) to obtain a constrained key \(ck_i\). The marked version of \(\varvec{k}\) is a circuit that computes \((\mathsf {F}_{ck_1}(\cdot ), \ldots , \mathsf {F}_{ck_l}(\cdot ))\). Then, on input a circuit, the extraction algorithm can recover the i-th bit of the embedded message via checking if the i-th part of the circuit is punctured on \(x_{i,0}^*\) or if it is punctured on \(x_{i,1}^*\).

Now, we examine what is guaranteed from this construction. For simplicity, we consider the simplified case that \(l=3\) and that the adversary only obtains two marked circuits \({\mathtt {C}}^{(1)}=(\mathsf {F}_{ck^{(1)}_1}(\cdot ), \mathsf {F}_{ck^{(1)}_2}(\cdot ),\mathsf {F}_{ck^{(1)}_3}(\cdot ))\) and \({\mathtt {C}}^{(2)}=(\mathsf {F}_{ck^{(2)}_1}(\cdot ), \mathsf {F}_{ck^{(2)}_2}(\cdot ), \mathsf {F}_{ck^{(2)}_3}(\cdot ))\) of \(\varvec{k}=(k_1,k_2,k_3)\), embedded with messages

$$msg^{(1)}=101 \quad \text {and} \quad msg^{(2)}=110$$

respectively. First, we have \(ck^{(1)}_1=ck^{(2)}_1\) since both of them are generated by puncturing \(k_1\) on \(x^*_{1,1}\) (we derive the randomness for the puncturing algorithm from \(\varvec{k}\)). Thus, by the single key security of the underlying (constraint-hiding) constrained PRF, the adversary is not able to modify the mark 1 in \(k_1\). However, as \(ck^{(1)}_2\) and \(ck^{(2)}_2\) (also, \(ck^{(1)}_3\) and \(ck^{(2)}_3\)) are generated by puncturing \(k_2\) (resp. \(k_3\)) on different points, we have \(ck^{(1)}_2\not =ck^{(2)}_2\) (resp. \(ck^{(1)}_3\not =ck^{(2)}_3\)). So, the adversary is able to obtain different constrained versions of \(k_2\) and \(k_3\), and thus it has the capability to remove or modify the marks in them. As a result, when extracting a circuit produced by the adversary, the extraction algorithm may obtain a message in \(\{1\} \times \{?,0,1\} \times \{?,0,1\}\),Footnote 5 which contains new messages such as 111 and 100. That is, the adversary still has the ability to modify the embedded messages even if it fails at some position.
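The repetition construction and the collusion observation above can be reproduced in a toy truth-table model (purely illustrative; the real constructions use lattice-based constrained PRFs, and here marking is made deterministic per key, mirroring that the puncturing randomness is derived from \(\varvec{k}\)).

```python
# Toy repetition construction: bit i of the message is embedded into key i.
import secrets

DOMAIN = range(16)

def keygen(l):
    # l independent truth tables stand in for (k_1, ..., k_l).
    return [{x: secrets.randbits(32) for x in DOMAIN} for _ in range(l)]

def mark(keys, msg, points):
    """Circuit = tuple of constrained keys; key i is punctured on x*_{i,msg[i]}."""
    circuit = []
    for k_i, bit, (x0, x1) in zip(keys, msg, points):
        ck = dict(k_i)
        x = (x0, x1)[bit]
        ck[x] = k_i[x] ^ 1  # models puncturing k_i on this point
        circuit.append(ck)
    return circuit

def extract(keys, circuit, points):
    msg = []
    for k_i, ck, (x0, x1) in zip(keys, circuit, points):
        if ck[x0] != k_i[x0]:
            msg.append(0)
        elif ck[x1] != k_i[x1]:
            msg.append(1)
        else:
            msg.append(None)
    return msg
```

Marking the same key with 101 and 110 yields identical first components but differing second and third components, which is exactly the leakage exploited in the attack described above.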

A secure solution using fingerprinting code. To solve this problem, we employ a fingerprinting code to amplify the robustness of our initial construction, from extracting some bits of the embedded messages to extracting one of the embedded messages, when dealing with adversarially-generated circuits. A fingerprinting code scheme consists of two algorithms, namely, the generation algorithm and the decoding algorithm. The generation algorithm generates a codebook and a trapdoor, where the codebook assigns a unique codeword to each message and the trapdoor is used for decoding. The decoding algorithm decodes a word (not necessarily in the codebook) using the trapdoor. Its security ensures that given a few codewords for messages in a specific set \(\mathcal {C}\), no one could produce a word that is decoded to a message outside \(\mathcal {C}\). This security is defined assuming the “marking assumption”, where the adversary is not allowed to modify the bit at a position if all given codewords agree at this position. For example, if the given codewords are 101 and 110, then a word w such that \(w[1]=0\) is invalid.

Now, we integrate the fingerprinting code into our construction. More precisely, the marking algorithm first gets the codeword for the given message from the codebook. Then it embeds the codeword into the PRF keys via invoking the marking algorithm provided in our initial construction. Then, on input a circuit, the extraction algorithm first invokes the extraction algorithm of our initial construction. It replaces all “?” in the returned string with “0”, and decodes the string using the decoding algorithm of the fingerprinting code. The marking assumption is guaranteed by security of our initial construction and thus the new extraction algorithm can succeed in extracting one of the embedded messages.

It is worth noting that our construction does not rely on concrete properties of the underlying constrained PRF, thus it is safe to replace it with any secure watermarkable PRF. In other words, our idea can be seen as a compiler that compiles a single key secure watermarkable PRF into a collusion resistant one.

Achieving security with marking queries/public marking. We have shown how to achieve collusion resistant watermarkable PRFs in a setting where the adversary is only allowed to obtain some “challenge circuits”, which are produced by embedding messages in a set \(\mathcal {C}\) into a random PRF key. However, in previous works, the adversary is always allowed to further learn marked circuits for keys of its choice. The above solution is not secure under such marking queries, because a marking query can provide codewords of messages outside \(\mathcal {C}\), which helps the adversary alter the embedded messages in the challenge circuits.

We fix this issue by forcing the marking algorithm to use different codebooks for different keys. In particular, the marking algorithm will first generate a codebook and the associated trapdoor (using randomness derived from the input PRF key), and then produce the marked circuit with this fresh codebook. As the codewords acquired from the marking queries are from different codebooks, they will not help the adversary in modifying the embedded messages in the challenge circuits.

The next issue is how to send the trapdoor to the extraction algorithm. Here we need to guarantee that the extraction algorithm always receives the correct trapdoor and that the trapdoor remains hidden from the adversaryFootnote 6. Note, however, that the only communication channel between the marking algorithm and the extraction algorithm is the watermarked circuit itself, which can be arbitrarily modified by the adversary.

We complete this task by embedding an encryption of the trapdoor into a new watermarkable PRF. More precisely, the watermarking object now consists of \(l+1\) single key secure watermarkable PRFs, where each of the first l parts is embedded with one bit of the codeword and the last part is embedded with the ciphertext. For the same PRF key, we use the same randomness to generate the trapdoor and its encryption, so single key security of a watermarkable PRF is sufficient to guarantee a reliable transmission of the trapdoor. Also, confidentiality of the trapdoor is guaranteed by the security of the encryption scheme.

By applying the above tweaks, the security with marking queries of the new construction can be based on the security with marking queries of the underlying (single key secure) watermarkable PRF. Besides, we can also show that if the underlying watermarkable PRF is secure with public marking (i.e., the mark key is public), the new scheme also supports public marking.

Achieving security with extraction queries. Another desirable security property for watermarking schemes is security with extraction queries, which allows the adversary to learn what can be extracted from its generated circuits. Note that the extraction algorithm of our scheme consists of three steps. First, it extracts a word and a ciphertext from the marked circuit; then it decrypts the ciphertext to get the trapdoor; finally, it uses the trapdoor to retrieve the message from the word. Therefore, security with extraction queries of our scheme can be guaranteed if the underlying watermarkable PRF has security with extraction queries (achieved in [QWZ18, KW19]), the underlying encryption scheme has security with decryption queries (i.e., CCA-security, which is achieved by numerous previous works), and the underlying fingerprinting code has security with extraction queries. However, no fingerprinting code that is provably secure with extraction queries is known. To solve this problem, in this work, we construct the first fingerprinting code that is secure with extraction queries. We provide an overview of this construction later in this section.

Achieving unforgeability. One drawback of the current construction is that it cannot achieve unforgeability even if the underlying watermarking scheme is unforgeable. To see this, recall that given a marked circuit, which is a combination of \(l+1\) marked circuits, the extraction algorithm extracts one bit from each of the first l circuits. The bit is set to 1 if it gets 1 from the circuit, and it is set to 0 if it gets either 0 or the unmarked symbol \(\perp \) from the circuit. That is to say, the extraction algorithm could still output some message even if part of the circuit is unmarked. Thus, an adversary may break the unforgeability of our construction by replacing part of a marked circuit with a random circuit. The new circuit and the original marked circuit behave differently on nearly all inputs, yet the extraction algorithm will probably still extract some message from the new circuit.

We solve this problem by presenting a general approach to adding strong unforgeability to watermarkable PRFs.Footnote 7 Let \(\mathsf {G}\) be a secure watermarkable PRF (without unforgeability). We show how to construct a secure watermarkable PRF with unforgeability from \(\mathsf {G}\). The construction employs a signature scheme and an encryption scheme. In more detail, the revised marking algorithm first signs the PRF key and encrypts the PRF key together with the signature. Then, it embeds the ciphertext as well as the message into the PRF key using the original marking algorithm of \(\mathsf {G}\). The new extraction algorithm first extracts the ciphertext and the message from the circuit. Then, it decrypts the ciphertext to obtain the PRF key k and the signature. Next, the extraction algorithm checks if the signature is valid and if the circuit behaves almost identically to \(\mathsf {G}_{k}\). It outputs the message only if both checks pass, and it outputs \(\perp \) otherwise.
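The extraction-side checks of this transform can be sketched as follows. Everything here is a stand-in chosen for a self-contained illustration, not the paper's instantiation: HMAC-SHA256 plays the role of the signature scheme, the encryption layer is omitted (the extracted key and signature are passed in directly), and closeness to \(\mathsf {G}_k\) is estimated by sampling random inputs.

```python
# Sketch of the two extraction checks: signature validity + closeness to G_k.
import hashlib
import hmac
import random

def sign(sk: bytes, prf_key: bytes) -> bytes:
    # Stand-in "signature": an HMAC tag over the PRF key.
    return hmac.new(sk, prf_key, hashlib.sha256).digest()

def verify(sk: bytes, prf_key: bytes, sig: bytes) -> bool:
    return hmac.compare_digest(sign(sk, prf_key), sig)

def close_enough(circuit, prf, n_samples=200, max_diff_frac=0.1):
    """Accept only circuits agreeing with G_k on almost all sampled inputs."""
    rng = random.Random(0)
    diffs = sum(circuit(x) != prf(x)
                for x in (rng.getrandbits(32) for _ in range(n_samples)))
    return diffs / n_samples <= max_diff_frac

def extract(sk, circuit, extracted_msg, extracted_key, extracted_sig, prf_for):
    """Output the embedded message only if both checks pass, else None (⊥)."""
    if not verify(sk, extracted_key, extracted_sig):
        return None
    if not close_enough(circuit, prf_for(extracted_key)):
        return None
    return extracted_msg
```

A circuit carrying a valid signature but behaving like an unrelated random circuit fails the closeness check, which is exactly how the transform defeats the part-replacement forgery described above.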

The unforgeability of the watermarking scheme comes from the unforgeability of the signature scheme. In particular, the adversary is not able to generate valid signatures for a new PRF key. Therefore, if the adversary wishes to create a circuit that passes the extraction algorithm, the circuit must be close to one of the previously marked PRF keys, which is exactly what unforgeability requires. We stress that this claim holds even if the extraction key of the scheme is revealed. Thus, we obtain the first watermarkable PRF achieving unforgeability with public extraction from standard assumptions (yet, it does not have unremovability with public extraction).

Next, we argue why the new construction still has unremovability. There are two main concerns. Firstly, the original PRF key is included in the marked circuit, but as only an encryption of the key is embedded, this will not leak additional information to the adversary.Footnote 8 Secondly, an additional check is performed in the extraction algorithm to test if the circuit preserves the functionality of the original key. Since the adversary (for unremovability) is not allowed to significantly change the functionality of the challenge circuit(s), its submitted circuit should pass the check.

Putting it all together. Piecing together all ideas and techniques proposed above, we obtain a generic construction of collusion resistant watermarkable PRF from any single key secure watermarkable PRF. The construction preserves the security with marking queries/public marking/extraction queries of the underlying single key secure watermarking scheme. Also, it achieves unforgeability for free. We provide a detailed description of the construction in Sect. 4.

Fingerprinting code secure with extraction queries. It remains to show how to construct a fingerprinting code that is secure with extraction queries. We start by briefly reviewing the well-known Boneh-Shaw code [BS95], which is widely used in cryptography.

Let N be the size of the message space and let L be a polynomial in the security parameter and N. The code generation algorithm first samples N disjoint subsets \(\mathcal {P}_1, \ldots , \mathcal {P}_N\) of [NL], where \(|\mathcal {P}_i |=L\), and sets them as the trapdoor. Then it sets the codeword for a message \(\mathtt {m}\in [N]\) to be the NL-bit binary string \(\bar{w}_{\mathtt {m}}\) with \(\bar{w}_{\mathtt {m}}[j]=1\) iff \(j\in \mathcal {P}_i\) for some \(i\le \mathtt {m}\). To decode a word w, the decoding algorithm sets \(A_0=1\) and \(A_{N+1}=0\), then computes \(A_i=(\sum _{j\in \mathcal {P}_i}w[j])/L\) for \(i\in [N]\) and outputs the first i s.t. \(A_i-A_{i+1}\) is large.
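A minimal version of this codebook and decoder might look as follows (a sketch: the gap threshold and parameter choices are illustrative, and 0 is reported as \(\perp \)).

```python
# Minimal Boneh-Shaw-style fingerprinting code: generation and decoding.
import random

def gen(N, L, seed=None):
    """Partition [N*L] into N random blocks P_1..P_N; codeword m is 1 on P_1..P_m."""
    rng = random.Random(seed)
    positions = list(range(N * L))
    rng.shuffle(positions)
    blocks = [positions[i * L:(i + 1) * L] for i in range(N)]  # the trapdoor
    codewords = []
    for m in range(1, N + 1):
        w = [0] * (N * L)
        for i in range(m):
            for j in blocks[i]:
                w[j] = 1
        codewords.append(w)
    return blocks, codewords

def decode(blocks, w, gap=0.5):
    """Output the first i with a large drop A_i - A_{i+1} (None models ⊥)."""
    N, L = len(blocks), len(blocks[0])
    A = [1.0] + [sum(w[j] for j in P) / L for P in blocks] + [0.0]  # A_0..A_{N+1}
    for i in range(N + 1):
        if A[i] - A[i + 1] >= gap:
            return i if i >= 1 else None
    return None
```

On the unmodified codeword for message m, the fractions satisfy \(A_1=\cdots =A_m=1\) and \(A_{m+1}=\cdots =A_N=0\), so the first large drop is exactly at i = m.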

To see the security of the Boneh-Shaw code, consider a simple example where \(N=4\) and the adversary is given two codewords \(\bar{w}_{1}\) and \(\bar{w}_{3}\). Let w be the word output by the adversary. Then, the decoding algorithm will not output a message outside \(\{1,3\}\) on input w, because:

  1. For any \(j\in \mathcal {P}_1\), \(\bar{w}_{1}[j]=\bar{w}_{3}[j]=1\), and for any \(j\in \mathcal {P}_4\), \(\bar{w}_{1}[j]=\bar{w}_{3}[j]=0\). Then, by the marking assumption, the adversary is not allowed to modify the bit at any position in \(\mathcal {P}_1\) or \(\mathcal {P}_4\). Thus, we still have \(A_1=1\) and \(A_4=0\), and therefore the decoding algorithm will not output 0 or 4.

  2. For bits at positions in \(\mathcal {P}_2\) and \(\mathcal {P}_3\), the adversary can modify them arbitrarily. However, since the trapdoor is kept hidden from the adversary, it cannot distinguish positions in \(\mathcal {P}_2\) from those in \(\mathcal {P}_3\). So, the adversary cannot make \(A_2-A_{3}\) large, and thus the decoding algorithm will not output 2.

However, if the adversary is allowed to make queries to an extraction oracle, it can learn some information about the trapdoor from each query. Thus, the second claim above will be invalidated in this case.Footnote 9

We deal with this issue by using only part of the trapdoor in each invocation of the decoding algorithm. In particular, the decoding algorithm randomly picks a fixed-size subset \(\mathcal {S}_i \subseteq \mathcal {P}_i\) for each \(i\in [N]\). Then it computes \(A'_i=(\sum _{j\in \mathcal {S}_i}w[j])/|\mathcal {S}_i |\) and looks for a large gap between \(A'_i\) and \(A'_{i+1}\). The fraction \(A'_i\) can be viewed as an estimate of \(A_i\), and the two numbers are close, so this modification does not compromise the security of the Boneh-Shaw code.
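The revised decoder is a small change to the Boneh-Shaw decoder: sample a fresh subset \(\mathcal {S}_i \subseteq \mathcal {P}_i\) on every call and compute the estimates \(A'_i\) over the subsets only (a sketch; subset size and gap threshold are illustrative). Here `blocks` is the trapdoor of a Boneh-Shaw-style code, i.e., `blocks[i-1]` lists the positions of \(\mathcal {P}_i\).

```python
# Decoding with a fresh random subset S_i of each trapdoor block P_i per call.
import random

def decode_subset(blocks, w, subset_size, gap=0.5, rng=None):
    """Like Boneh-Shaw decoding, but each A_i is estimated from a random S_i."""
    rng = rng or random.Random()
    A = [1.0]  # A'_0 = 1
    for P in blocks:
        S = rng.sample(P, subset_size)                 # fresh S_i for this call
        A.append(sum(w[j] for j in S) / subset_size)   # A'_i estimates A_i
    A.append(0.0)  # A'_{N+1} = 0
    for i in range(len(blocks) + 1):
        if A[i] - A[i + 1] >= gap:
            return i if i >= 1 else None
    return None
```

Each extraction query now consumes only the sampled subsets, which is what the Rényi-divergence argument in Sect. 3 exploits to tolerate a bounded number of queries.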

To see why the revised decoding algorithm can provide security with extraction queries, let \(\mathcal {S}^*_1,\ldots ,\mathcal {S}^*_N\) be the partial trapdoor used when decoding a word w from the adversary, who has seen codewords for messages in a set \(\mathcal {C}\). Due to the security of the original Boneh-Shaw code, the decoding algorithm should output a message in \(\mathcal {C}\), if all \(\mathcal {S}_i\) used in previous extraction queries are sampled from \(\mathcal {P}_i-\mathcal {S}^*_i\). Thus, it is sufficient to show that the output of the extraction oracle will not change (significantly) if the decoding algorithm uses a random subset of \(\mathcal {P}_i-\mathcal {S}^*_i\) instead of a random subset of \(\mathcal {P}_i\). Unfortunately, it seems that there is a non-negligible gap between the oracle outputs in these two cases and the conventional statistical distance is not applicable here to bound the adversary’s advantage. To overcome this hurdle, we use the Rényi divergence to measure the distribution closeness and limit the number of extraction queries from the adversary. See Sect. 3 for a more detailed description of our construction.
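For reference, the (multiplicative) Rényi divergence between two finite distributions can be computed as follows; this is the generic formula only, and the way it is applied to the two subset-sampling distributions is part of the analysis in Sect. 3, not reproduced here.

```python
# Multiplicative Rényi divergence of order a between finite distributions,
# given as dicts mapping outcomes to probabilities.
def renyi_divergence(p, q, a=2.0):
    """RD_a(P||Q) = (sum_x P(x)^a / Q(x)^(a-1))^(1/(a-1)); needs supp(P) ⊆ supp(Q)."""
    s = sum(px ** a / q[x] ** (a - 1) for x, px in p.items() if px > 0)
    return s ** (1.0 / (a - 1))
```

Unlike statistical distance, this measure supports a multiplicative probability-preservation bound, which is why it can handle a non-negligible gap between the two oracle-output distributions as long as the number of extraction queries is bounded.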

1.3 Related Works

The notion of fingerprinting codes was first studied in [Wag83, BMP85]. Depending on the adversary’s ability to alter the codewords, many different models for fingerprinting codes have been studied. In this work, we consider the model presented in [BS98]. Boneh and Shaw [BS98] construct the first fingerprinting code that is secure in this model. Then, in [Tar03], Tardos presents a shorter code and shows that its length is asymptotically optimal. Subsequent works (see e.g., [NFH+09, AT09, LdW14] and references therein) aim at improving the concrete efficiency of the scheme. However, to the best of our knowledge, no previous work has considered an adversary that can ask for the decoding of words it creates.

One important application of fingerprinting codes is to build traitor tracing schemes [CFN94], which aim at tracing secret-key leakers in a broadcast encryption setting. The notion of traitor tracing is somewhat similar to that of collusion resistant watermarking, but our construction differs in several ways from previous fingerprinting code based traitor tracing schemes [BN08]. First, we embed each bit of the codeword into the underlying single key watermarkable PRF directly, while in [BN08], codewords are used to select secret keys for users. Besides, we need to additionally send the trapdoor from the marking algorithm to the extraction algorithm. In addition, we require a stronger fingerprinting code that has adaptive security with extraction queries, and we provide an instantiation.

2 Notations

Let s be a string; we use \(|s |\) to denote the length of s. For an integer \(a \le |s |\), we use s[a] to denote the a-th character of s, and for integers \(a \le b \le |s |\), we use s[a : b] to denote the substring \((s[a],s[a+1], \ldots ,s[b])\). Let \(\mathcal {S}\) be a finite set; we use \(|\mathcal {S} |\) to denote the size of \(\mathcal {S}\), and use \(s \overset{\$}{\leftarrow }\mathcal {S}\) to denote sampling an element s uniformly from the set \(\mathcal {S}\). Let \(\mathcal {D}\) be a distribution; we use \(d \leftarrow \mathcal {D}\) to denote sampling d according to \(\mathcal {D}\) and use \( Supp (\mathcal {D})\) to denote the support of \(\mathcal {D}\).

We write \(negl(\cdot )\) to denote a negligible function, and write \(poly(\cdot )\) to denote a polynomial. For integers \(a \le b\), we write [a, b] to denote all integers from a to b and use [b] to denote all integers from 1 to b. For natural numbers \(a \le b\), we use \(\left( {\begin{array}{c}b\\ a\end{array}}\right) \) to denote the binomial coefficient, i.e., \(\left( {\begin{array}{c}b\\ a\end{array}}\right) =\frac{b \cdot (b-1) \cdot \ldots \cdot (b-a+1)}{a \cdot (a-1) \cdot \ldots \cdot 1}\).

For more background knowledge and definitions of cryptographic primitives employed, we refer the readers to the full version of this paper.

3 Fingerprinting Code with Enhanced Security

3.1 The Definition

In this section, we provide the definition of fingerprinting code. Compared to previous definitions [BS95, Tar03, BN08], we require a stronger security notion, where the adversary is allowed to 1) make queries to an extraction oracle that outputs the decoding of a given word and 2) make challenge oracle queries adaptively.

Definition 3.1

(Fingerprinting Code). A fingerprinting code \(\mathsf {FC}=({\mathtt {Gen}},{\mathtt {Dec}})\) with message space [1, N] and code length l consists of the following algorithms:

  • \({\mathtt {Gen}}(1^{\lambda }) \rightarrow (td,\varGamma =(\bar{w}_{\mathtt {m}})_{\mathtt {m}\in [N]}):\) On input the security parameter \(\lambda \), the code generation algorithm outputs the trapdoor td and N codewords \(\bar{w}_1, \ldots \bar{w}_N\) (for messages \(1, \ldots , N\) respectively) in \(\{0,1\}^{l}\).

  • \({\mathtt {Dec}}(td,w) \rightarrow \mathtt {m}:\) On input the trapdoor td and a word \(w \in \{0,1\}^l\) (w is not necessarily in \(\varGamma \)), the decoding algorithm outputs a message \(\mathtt {m} \in [1,N] \cup \{\perp \}\).

The correctness property requires that the decoding algorithm decodes codewords in \(\varGamma \) correctly.

Definition 3.2

(Correctness). Let \((td,(\bar{w}_\mathtt {m})_{\mathtt {m}\in [N]}) \leftarrow {\mathtt {Gen}}(1^{\lambda })\), then for any \(\mathtt {m}\), we have:

$$\Pr [{\mathtt {Dec}}(td,\bar{w}_\mathtt {m}) \not = \mathtt {m}]=0$$

The security property requires that given a few codewords \(\{\bar{w}_\mathtt {m}\}_{\mathtt {m}\in \mathcal {C}^*} \subseteq \varGamma \) for messages in a set \(\mathcal {C}^*\), no adversary can generate a “feasible” word that decodes to a new message outside \(\mathcal {C}^*\). Here, we say that a word w is feasible if

$$\forall j\in [l], \exists \,\mathtt {m}\in \mathcal {C}^*, \bar{w}_\mathtt {m}[j]=w[j]$$
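The feasibility condition is a simple position-wise predicate; the following Python sketch (the helper `is_feasible` and the toy codewords are ours, with words written as bit strings) checks it directly:

```python
def is_feasible(w, codewords):
    """w is feasible w.r.t. a set of codewords if, at every position j,
    at least one codeword agrees with w at j."""
    return all(any(cw[j] == w[j] for cw in codewords) for j in range(len(w)))

# With codewords 0011 and 0101, any position-wise mix of the two is feasible...
assert is_feasible("0111", ["0011", "0101"])
# ...but a word that disagrees with both codewords at some position is not.
assert not is_feasible("1011", ["0011", "0101"])
```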

In this work, we consider a stronger security notion, where the adversary is allowed to learn the decodings of q feasible words for an a priori bounded q. Also, we allow the adversary to make challenge oracle queries adaptively, i.e., it can request codewords for messages of its choice after viewing some codewords and the decodings of some words.

Definition 3.3

(Security with q Extraction Queries). A fingerprinting code is secure with q extraction queries if for all probabilistic polynomial-time (PPT) adversaries \(\mathcal {A}\), we have \(\Pr [\mathtt {Expt}_{\mathcal {A},q}(\lambda )=1] \le negl(\lambda )\), where we define the experiment \(\mathtt {Expt}\) as follows:

  1. The challenger samples \((td,(\bar{w}_\mathtt {m})_{\mathtt {m}\in [N]}) \leftarrow {\mathtt {Gen}}(1^{\lambda })\) and initializes the set \(\mathcal {C}^*=\emptyset \).

  2. Then, the adversary is allowed to make an a priori unbounded number of queries to the challenge oracle and up to q queries to the extraction oracle, which are defined as follows:

     • Challenge Oracle. On input a message \(\mathtt {m} \in [1,N]\), the challenger returns \(\bar{w}_\mathtt {m}\) to the adversary and sets \(\mathcal {C}^*=\mathcal {C}^* \cup \{\mathtt {m}\}\).

     • Extraction Oracle. On input a word w, the challenger returns nothing to \(\mathcal {A}\) if w is not feasible (according to the current \(\mathcal {C}^*\)). Otherwise, it computes \(\mathtt {m} \leftarrow {\mathtt {Dec}}(td,w)\). The challenger returns \(\mathtt {m}\) to \(\mathcal {A}\) if \(\mathtt {m} \in \mathcal {C}^*\). Otherwise, the experiment aborts and outputs 1.

  3. The experiment outputs 0 if it does not abort in Step 2.

3.2 The Construction

In this section, we present our construction of fingerprinting code that has adaptive security with extraction queries.

Let \(\lambda \) be the security parameter. Let N, L, l, q be positive integers that are polynomial in \(\lambda \) and satisfy \(l=8\lambda (N+1)^2\), \(L=8l-4+4l^2Nq\). Let \(\theta =1/(2(N+1))\). Let \(\mathfrak {S}=\{\mathcal {S} \subseteq [1,L]: |\mathcal {S} |=l\}\) be the set of all subsets of [1, L] that contain l elements.
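For concreteness, these parameter relations can be evaluated directly; the toy values of \(\lambda \), N, and q below are ours, chosen only for illustration:

```python
# Parameter relations from the construction, for a toy security parameter.
lam, N, q = 4, 3, 2                  # security parameter, #messages, #extraction queries
l = 8 * lam * (N + 1) ** 2           # l = 8*lambda*(N+1)^2
L = 8 * l - 4 + 4 * l ** 2 * N * q   # L = 8l - 4 + 4*l^2*N*q
theta = 1 / (2 * (N + 1))            # decoding threshold step; code length is N*L

assert l == 512
```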

We construct the fingerprinting code \(\mathsf {FC}=({\mathtt {Gen}},{\mathtt {Dec}})\) with message space [1, N] and code length NL as follows:

  • \({\mathtt {Gen}}\). On input a security parameter \(\lambda \), the code generation algorithm first samples a random permutation \({\mathtt {P}}\) over [NL]. Then for \(\mathtt {m}\in [1,N]\), and \(h\in [NL]\), it sets

    $$ \bar{w}_\mathtt {m}[h]={\left\{ \begin{array}{ll} 1 \quad &{} \text {if } \lceil {\mathtt {P}}(h)/L \rceil \le \mathtt {m} \\ 0 &{} \text {otherwise} \end{array}\right. } $$

    Finally, it outputs the trapdoor \(td={\mathtt {P}}\) and the codewords \((\bar{w}_{\mathtt {m}})_{\mathtt {m}\in [N]}\).

  • \({\mathtt {Dec}}\). On input the trapdoor \(td={\mathtt {P}}\) and a word \(w \in \{0,1\}^{NL}\), the decoding algorithm proceeds as follows:

    1. For \(i\in [N]\):

       (a) Sample \(\mathcal {S}_i \overset{\$}{\leftarrow }\mathfrak {S}\).

       (b) Set \(A_i=0\).

       (c) For \(j\in \mathcal {S}_i\): set \(A_i=A_i+w[{\mathtt {P}}^{-1}((i-1)L+j)]\).

       (d) If \(\frac{A_i}{l} \le \frac{3}{4}-i \theta \):

           i. If \(i=1\): output \(\perp \).

           ii. Otherwise: output \(i-1\).

    2. Output N.
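To make the construction concrete, the following is an illustrative Python sketch of \({\mathtt {Gen}}\) and \({\mathtt {Dec}}\) (the names `gen`/`dec` and the small N, L, l are ours and do not follow the parameter relation above; since each honest codeword is constant on every length-L block under the permutation, decoding is exact for honest codewords at these toy sizes):

```python
import math
import random

def gen(N, L, l):
    """FC.Gen: the trapdoor is a random permutation P over [1, N*L]."""
    P = list(range(1, N * L + 1))
    random.shuffle(P)                         # P[h-1] = P(h)
    Pinv = {v: h for h, v in enumerate(P, start=1)}
    words = {m: [1 if math.ceil(P[h - 1] / L) <= m else 0
                 for h in range(1, N * L + 1)]
             for m in range(1, N + 1)}
    return (P, Pinv), words

def dec(td, w, N, L, l):
    """FC.Dec: estimate, block by block, where the codeword steps from 1 to 0."""
    P, Pinv = td
    theta = 1 / (2 * (N + 1))
    for i in range(1, N + 1):
        S = random.sample(range(1, L + 1), l)  # a random l-subset of [1, L]
        A = sum(w[Pinv[(i - 1) * L + j] - 1] for j in S)
        if A / l <= 3 / 4 - i * theta:
            return None if i == 1 else i - 1   # None stands for ⊥
    return N

random.seed(1)
N, l, L = 2, 16, 64
td, words = gen(N, L, l)
assert dec(td, words[1], N, L, l) == 1
assert dec(td, words[2], N, L, l) == 2
```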

Theorem 3.1

\(\mathsf {FC}\) is a secure fingerprinting code that has correctness and adaptive security with q extraction queries.

We give the proof of Theorem 3.1 in the full version.

4 Collusion Resistant Watermarkable PRF

4.1 The Definition

In this section, we provide the definition of watermarkable PRF, which is adapted and generalized from definitions in previous works [CHN+16, BLW17, KW17, QWZ18, KW19, YAL+19].

Definition 4.1

(Watermarkable PRFs). A watermarkable PRF \(\mathsf {WPRF}=({\mathtt {Setup}},{\mathtt {KeyGen}},{\mathtt {Eval}},{\mathtt {Mark}},{\mathtt {Extract}})\) with key space \(\mathcal {K}\), input space \(\{0,1\}^n\), output space \(\{0,1\}^m\), and message space \(\mathcal {M}\) consists of the following algorithms:

  • \({\mathtt {Setup}}(1^{\lambda }) \rightarrow (PP,MK,EK):\) On input the security parameter \(\lambda \), the setup algorithm outputs the public parameter PP, the mark key MK and the extraction key EK.

  • \({\mathtt {KeyGen}}(PP) \rightarrow k:\) On input the public parameter PP, the key generation algorithm outputs a PRF key \(k\in \mathcal {K}\).

  • \({\mathtt {Eval}}(PP,k,x) \rightarrow y:\) On input the public parameter PP, a PRF key \(k\in \mathcal {K}\), and an input \(x \in \{0,1\}^n\), the evaluation algorithm outputs an output \(y \in \{0,1\}^m\).

  • \({\mathtt {Mark}}(PP,MK,k,msg) \rightarrow {\mathtt {C}}:\) On input the public parameter PP, the mark key MK, a PRF key \(k\in \mathcal {K}\), and a message \(msg \in \mathcal {M}\), the marking algorithm outputs a marked circuit \({\mathtt {C}}: \{0,1\}^n \rightarrow \{0,1\}^m\).

  • \({\mathtt {Extract}}(PP,EK,{\mathtt {C}}) \rightarrow msg:\) On input the public parameter PP, the extraction key EK, and a circuit \({\mathtt {C}}\), the extraction algorithm outputs a message \(msg \in \mathcal {M} \cup \{\perp \}\), where \(\perp \) denotes that the circuit is unmarked.

Correctness. The correctness of a watermarking scheme includes three properties. The functionality preserving property requires that the watermarked key can roughly preserve the functionality of the original key.

Definition 4.2

(Functionality Preserving). For any \(msg \in \mathcal {M}\), let \((PP,MK,EK) \leftarrow {\mathtt {Setup}}(1^{\lambda })\), \(k \leftarrow {\mathtt {KeyGen}}(PP)\), \({\mathtt {C}} \leftarrow {\mathtt {Mark}}(PP,MK,k,msg)\), \(x \overset{\$}{\leftarrow }\{0,1\}^n\), then we have \(\Pr [{\mathtt {C}}(x)\not ={\mathtt {Eval}}(PP,k,x)]\le negl(\lambda )\).

The extraction correctness requires that the extraction algorithm can extract the correct message from an honestly-watermarked key.

Definition 4.3

(Extraction Correctness). For any \(msg \in \mathcal {M}\), let \((PP,MK,EK) \leftarrow {\mathtt {Setup}}(1^{\lambda })\), \(k \leftarrow {\mathtt {KeyGen}}(PP)\), \({\mathtt {C}} \leftarrow {\mathtt {Mark}}(PP,MK,k,msg)\), then we have \(\Pr [{\mathtt {Extract}}(PP,EK,{\mathtt {C}})\not =msg] \le negl(\lambda )\).

The meaningfulness property requires that most circuits are unmarked, which rules out the trivial construction that regards all circuits as marked.

Definition 4.4

(Watermarking Meaningfulness). For any circuit \({\mathtt {C}}: \{0,1\}^n \rightarrow \{0,1\}^m\), let \((PP,MK,EK) \leftarrow {\mathtt {Setup}}(1^{\lambda })\), then we have:

$$ \Pr [{\mathtt {Extract}}(PP,EK,{\mathtt {C}})\not =\perp ] \le negl(\lambda ) $$

Remark 4.1

In Definition 4.2 and Definition 4.3, correctness properties are defined for honestly-generated PRF keys only. A stronger notion of correctness considers adversarially-chosen keys, where k is chosen by the adversary. See [KW17, KW19] for more detailed discussions on different notions of correctness.

Pseudorandomness. The pseudorandomness property of a watermarkable PRF is twofold. First, it requires that the watermarkable PRF should be pseudorandom against an external adversary.

Definition 4.5

(Pseudorandomness). Let \((PP,MK,EK) \leftarrow {\mathtt {Setup}}(1^{\lambda })\), \(k \leftarrow {\mathtt {KeyGen}}(PP)\), and f be a random function from \(\{0,1\}^n\) to \(\{0,1\}^m\). Also, let \(\mathcal {O}_1(\cdot )\) be an oracle that takes as input a string \(x\in \{0,1\}^n\) and returns \({\mathtt {Eval}}(PP,k,x)\), and let \(\mathcal {O}_2(\cdot )\) be an oracle that takes as input a string \(x\in \{0,1\}^n\) and returns f(x). Then for all PPT adversaries \(\mathcal {A}\), we have:

$$ \mid \Pr [\mathcal {A}^{\mathcal {O}_1(\cdot )}(PP)=1] - \Pr [\mathcal {A}^{\mathcal {O}_2(\cdot )}(PP)=1] \mid \le negl(\lambda ) $$

Moreover, the watermarkable PRF should be (weak) pseudorandom against the watermarking authority, who holds the mark key and the extraction key.

Definition 4.6

(Pseudorandomness against the Watermarking Authority). Let \((PP,MK,EK) \leftarrow {\mathtt {Setup}}(1^{\lambda })\), \(k \leftarrow {\mathtt {KeyGen}}(PP)\), and f be a random function from \(\{0,1\}^n\) to \(\{0,1\}^m\). Also, let \(\mathcal {O}_1(\cdot )\) be an oracle that takes as input a string \(x\in \{0,1\}^n\) and returns \({\mathtt {Eval}}(PP,k,x)\), and let \(\mathcal {O}_2(\cdot )\) be an oracle that takes as input a string \(x\in \{0,1\}^n\) and returns f(x). Then for all PPT adversaries \(\mathcal {A}\), we have:

$$ \mid \Pr [\mathcal {A}^{\mathcal {O}_1(\cdot )}(PP,MK,EK)=1] - \Pr [\mathcal {A}^{\mathcal {O}_2(\cdot )}(PP,MK,EK)=1] \mid \le negl(\lambda ) $$

Definition 4.7

(Weak Pseudorandomness against the Watermarking Authority). Let \((PP,MK,EK) \leftarrow {\mathtt {Setup}}(1^{\lambda })\), \(k \leftarrow {\mathtt {KeyGen}}(PP)\), and f be a random function from \(\{0,1\}^n\) to \(\{0,1\}^m\). Also, let \(\mathcal {O}_1\) be an oracle that samples \(x\overset{\$}{\leftarrow }\{0,1\}^n\) and returns \((x,{\mathtt {Eval}}(PP,k,x))\) on each query, and let \(\mathcal {O}_2\) be an oracle that samples \(x\overset{\$}{\leftarrow }\{0,1\}^n\) and returns (x, f(x)) on each query. Then for all PPT adversaries \(\mathcal {A}\), we have:

$$ \mid \Pr [\mathcal {A}^{\mathcal {O}_1}(PP,MK,EK)=1] - \Pr [\mathcal {A}^{\mathcal {O}_2}(PP,MK,EK)=1] \mid \le negl(\lambda ) $$

Unremovability. This is the main security requirement for a watermarking scheme. Roughly, it requires that the adversary is not able to remove or modify the messages embedded in a random PRF key without significantly changing the functionality.

Definition 4.8

(\(\epsilon \)-Unremovability). A watermarkable PRF is \(\epsilon \)-unremovable if for all PPT and \(\epsilon \)-unremoving-admissible adversaries \(\mathcal {A}\), we have \(\Pr [\mathtt {ExptUR}_{\mathcal {A}}(\lambda )=1] \le negl(\lambda )\), where we define the experiment \(\mathtt {ExptUR}\) as follows:

  1. The challenger samples \((PP,MK,EK) \leftarrow {\mathtt {Setup}}(1^\lambda )\) and returns PP to \(\mathcal {A}\). Also, it samples a challenge key \(k^* \leftarrow {\mathtt {KeyGen}}(PP)\), which is used in answering the adversary’s challenge oracle queries.

  2. Then, \(\mathcal {A}\) is given access to the following oracles (but it may be restricted in querying them as discussed below):

     • Mark Key Oracle. The mark key oracle returns MK to the adversary.

     • Extraction Key Oracle. The extraction key oracle returns EK to the adversary.

     • Marking Oracle. On input a PRF key \(k \in \mathcal {K}\) and a message \(msg \in \mathcal {M}\), the marking oracle returns a circuit \({\mathtt {C}} \leftarrow {\mathtt {Mark}}(PP,MK,k,msg)\).

     • Extraction Oracle. On input a circuit \({\mathtt {C}}\), the extraction oracle returns a message \(msg \leftarrow {\mathtt {Extract}}(PP,EK,{\mathtt {C}})\).

     • Challenge Oracle. On input a message msg, the challenge oracle returns a circuit \({\mathtt {C}}^* \leftarrow {\mathtt {Mark}}(PP,MK,k^*,msg)\) to the adversary.

  3. Finally, \(\mathcal {A}\) submits a circuit \(\tilde{{\mathtt {C}}}\) and the experiment outputs 1 iff \({\mathtt {Extract}}(PP,EK,\tilde{{\mathtt {C}}}) \not \in \mathsf {M}^*\). Here, we use \(\mathsf {M}^*\) to denote the set of all messages submitted to the challenge oracle and use \(\mathsf {C}^*\) to denote the set of all circuits returned by the challenge oracle.

We say that an adversary \(\mathcal {A}\) is \(\epsilon \)-unremoving-admissible if there exists a circuit \({\mathtt {C}}^* \in \mathsf {C}^*\) such that \(|\{x \in \{0,1\}^n:{\mathtt {C}}^*(x) \not = \tilde{{\mathtt {C}}}(x) \} | \le \epsilon \cdot 2^n\).

We can get different levels of unremovability by restricting the adversary’s ability in querying oracles. In a nutshell, we write unremovability as T-\((X,Y)\)-\(\epsilon \)-unremovability, where T \(\in \{\)single key, bounded collusion resistant, fully collusion resistant\(\}\), \(X \in \{-,\text {MO},\text {PM}\}\), and \(Y \in \{-,q\text {-EO},\text {EO},\text {PE}\}\). In more detail, the security notions are organized along the following three dimensions:

  • Ability to Query the Challenge Oracle. The unremovability can be defined against an adversary that can:

    • make only one query to the challenge oracle (single key).

    • make an a priori bounded number of queries to the challenge oracle (bounded collusion resistant).

    • make an a priori unbounded number of queries to the challenge oracle (fully collusion resistant).

  • Ability in Obtaining Information about MK. The unremovability can be defined against an adversary that can:

    • query neither the mark key oracle nor the marking oracle (−).

    • make an a priori unbounded number of queries to the marking oracle but no query to the mark key oracle (MO).

    • query the mark key oracle (PM).

  • Ability in Obtaining Information about EK. The unremovability can be defined against an adversary that can:

    • query neither the extraction key oracle nor the extraction oracle (−).

    • make at most q queries to the extraction oracle but no query to the extraction key oracle, where q is a priori bounded (bounded EO or q-EO).

    • make an a priori unbounded number of queries to the extraction oracle but no query to the extraction key oracle (EO).

    • query the extraction key oracle (PE).

Remark 4.2

In our definition of collusion resistant unremovability, the adversary is allowed to make challenge oracle queries adaptively. Such adaptive security is neither defined nor achieved in previous works on collusion resistant watermarkable PRFs [YAL+19].

Unforgeability. This property is dual to unremovability. Roughly, it prevents one from embedding messages into PRF keys without the mark key.

Definition 4.9

(\(\delta \)-Unforgeability). A watermarkable PRF is \(\delta \)-unforgeable if for all PPT and \(\delta \)-unforging-admissible adversaries \(\mathcal {A}\), we have \(\Pr [\mathtt {ExptUF}_{\mathcal {A}}(\lambda )=1] \le negl(\lambda )\), where we define the experiment \(\mathtt {ExptUF}\) as follows:

  1. The challenger samples \((PP,MK,EK) \leftarrow {\mathtt {Setup}}(1^\lambda )\) and returns PP to \(\mathcal {A}\).

  2. Then, \(\mathcal {A}\) is given access to the following oracles (but it may be restricted in querying them as discussed below):

     • Extraction Key Oracle. The extraction key oracle returns EK to the adversary.

     • Marking Oracle. On input a PRF key \(k \in \mathcal {K}\) and a message \(msg \in \mathcal {M}\), the marking oracle returns a circuit \({\mathtt {C}} \leftarrow {\mathtt {Mark}}(PP,MK,k,msg)\).

     • Extraction Oracle. On input a circuit \({\mathtt {C}}\), the extraction oracle returns a message \(msg \leftarrow {\mathtt {Extract}}(PP,EK,{\mathtt {C}})\).

  3. Finally, \(\mathcal {A}\) submits a circuit \(\tilde{{\mathtt {C}}}\) and the experiment outputs 1 iff \({\mathtt {Extract}}(PP,EK,\tilde{{\mathtt {C}}}) \not = \perp \).

Here, an adversary \(\mathcal {A}\) is \(\delta \)-unforging-admissible if for every circuit \({\mathtt {C}}_i\) returned by the marking oracle, \(|\{x \in \{0,1\}^n:{\mathtt {C}}_i(x) \not = \tilde{{\mathtt {C}}}(x) \} | \ge \delta \cdot 2^n\).

We can get different levels of unforgeability by restricting the adversary’s ability in querying oracles. In a nutshell, we write unforgeability as \((X,Y)\)-\(\delta \)-unforgeability, where \(X \in \{-,\text {MO}\}\) and \(Y \in \{-,\text {EO},\text {PE}\}\). In more detail, the security notions are organized along the following two dimensions:

  • Ability in Obtaining Information about MK. The unforgeability can be defined against an adversary that can:

    • make no query to the marking oracle (−).

    • make an a priori unbounded number of queries to the marking oracle (MO).

  • Ability in Obtaining Information about EK. The unforgeability can be defined against an adversary that can:

    • query neither the extraction key oracle nor the extraction oracle (−).

    • make an a priori unbounded number of queries to the extraction oracle but no query to the extraction key oracle (EO).

    • query the extraction key oracle (PE).

4.2 The Construction

In this section, we show our main construction, which upgrades single key secure watermarkable PRF families into fully collusion resistant ones.

Let \(\lambda \) be the security parameter. Let \(n,m,N,l,s,\kappa ,q\) be positive integers that are polynomial in \(\lambda \). Let \(\epsilon , \epsilon ', \bar{\epsilon }\) be positive real values such that \(1/\bar{\epsilon }\) is polynomial in \(\lambda \), \(\bar{\epsilon }=(1+1/\lambda )\cdot \epsilon \), and \(\epsilon '=(1+2/\lambda )\cdot \epsilon \). Also, let \(t=\lambda ^3/\epsilon \).

Our construction is built on the following building blocks:

  • A watermarkable PRF family \(\mathsf {WPRF}_0=(\mathsf {WPRF}_0.{\mathtt {Setup}},\mathsf {WPRF}_0.{\mathtt {KeyGen}},\mathsf {WPRF}_0.{\mathtt {Eval}},\mathsf {WPRF}_0.{\mathtt {Mark}},\mathsf {WPRF}_0.{\mathtt {Extract}})\) with input space \(\{0,1\}^n\), output space \(\{0,1\}^{m}\), and message space \(\{0,1\}^{\kappa }\). Also, we use \(\mathcal {R}_0\) and \(\mathcal {R}'_0\) to denote the randomness space for the algorithm \(\mathsf {WPRF}_0.{\mathtt {KeyGen}}\) and the algorithm \(\mathsf {WPRF}_0.{\mathtt {Mark}}\) respectively.

  • A fingerprinting code \(\mathsf {FC}=(\mathsf {FC}.{\mathtt {Gen}},\mathsf {FC}.{\mathtt {Dec}})\) with message space [1, N] and code length l. Also, we use \(\mathcal {T}\) and \(\mathcal {R}_{\mathsf {FC}}\) to denote the key space (i.e., the set of all trapdoors for \(\mathsf {FC}\)) and the randomness space for the algorithm \(\mathsf {FC}.{\mathtt {Gen}}\) respectively.

  • A signature scheme \(\mathsf {SIG}=(\mathsf {SIG}.{\mathtt {KeyGen}},\mathsf {SIG}.{\mathtt {Sign}},\mathsf {SIG}.{\mathtt {Verify}})\) with message space \(\{0,1\}^{\lambda }\), signature space \(\{0,1\}^{s}\) and signing randomness space \(\mathcal {R}_{\mathsf {SIG}}\).

  • A PKE scheme \(\mathsf {PKE}=(\mathsf {PKE}.{\mathtt {KeyGen}},\mathsf {PKE}.{\mathtt {Enc}},\mathsf {PKE}.{\mathtt {Dec}})\) with message space \(\mathcal {T} \times \{0,1\}^{\lambda } \times \{0,1\}^{s}\), ciphertext space \(\{0,1\}^{\kappa }\), and encryption randomness space \(\mathcal {R}_{\mathsf {PKE}}\).

  • Pseudorandom generators:

    $$\begin{aligned}&{\mathtt {G}}:\{0,1\}^{\lambda } \rightarrow \{0,1\}^{\lambda } \times \{0,1\}^{\lambda } \times \mathcal {R}_{\mathsf {FC}} \times \mathcal {R}_{\mathsf {SIG}} \times \mathcal {R}_{\mathsf {PKE}} \\&{\mathtt {G}}':\{0,1\}^{\lambda } \rightarrow \mathcal {R}_0^{l+1} \qquad \qquad \quad {\mathtt {G}}'':\{0,1\}^{\lambda } \rightarrow \mathcal {R}^{\prime 2l+1}_0 \end{aligned}$$
  • A pseudorandom function family \(\mathsf {F}=(\mathsf {F}.{\mathtt {KeyGen}},\mathsf {F}.{\mathtt {Eval}})\) with input space \(\mathcal {R}^{\prime 2l+1}_0\) and output space \(\mathcal {R}^{\prime 2l+1}_0\).

We construct \(\mathsf {WPRF}=({\mathtt {Setup}},{\mathtt {KeyGen}},{\mathtt {Eval}},{\mathtt {Mark}},{\mathtt {Extract}})\), which has input space \(\{0,1\}^{n}\), output space \(\{0,1\}^{(l+1)m}\), and message space [1, N], as follows:

  • \({\mathtt {Setup}}\). On input a security parameter \(\lambda \), the setup algorithm first generates \((PP_0,MK_0,EK_0) \leftarrow \mathsf {WPRF}_0.{\mathtt {Setup}}(1^{\lambda })\), \((VK,SK) \leftarrow \mathsf {SIG}.{\mathtt {KeyGen}}(1^{\lambda })\), \((PK,DK) \leftarrow \mathsf {PKE}.{\mathtt {KeyGen}}(1^{\lambda })\), and \(K \leftarrow \mathsf {F}.{\mathtt {KeyGen}}(1^{\lambda })\). Then, it outputs the public parameter \(PP=(PP_0,VK,PK)\), the mark key \(MK=(MK_0,SK,K)\), and the extraction key \(EK=(EK_0,DK)\).

  • \({\mathtt {KeyGen}}\). On input the public parameter PP, the key generation algorithm outputs the PRF key \(s \overset{\$}{\leftarrow }\{0,1\}^{\lambda }\).

  • \({\mathtt {Eval}}\). On input the public parameter \(PP=(PP_0,VK,PK)\), a PRF key \(s \in \{0,1\}^{\lambda }\) and an input \(x \in \{0,1\}^{n}\), the evaluation algorithm proceeds as follows:

    1. \((\check{r}, \hat{r},R_{\mathsf {FC}},R_{\mathsf {SIG}},R_{\mathsf {PKE}}) = {\mathtt {G}}(s)\).

    2. \((r_0,r_1, \ldots , r_l) = {\mathtt {G}}'(\check{r})\).

    3. For \(i\in [0,l]\), \(k_{i} = \mathsf {WPRF}_0.{\mathtt {KeyGen}}(PP_0;r_i)\).

    4. Output \((\mathsf {WPRF}_0.{\mathtt {Eval}}(PP_0,k_{i},x))_{i\in [0,l]}\).

  • \({\mathtt {Mark}}\). On input the public parameter \(PP=(PP_0,VK,PK)\), the mark key \(MK=(MK_0,SK,K)\), a PRF key \(s \in \{0,1\}^{\lambda }\) and a message \(msg \in [1,N]\), the marking algorithm proceeds as follows:

    1. \((\check{r}, \hat{r},R_{\mathsf {FC}},R_{\mathsf {SIG}},R_{\mathsf {PKE}}) = {\mathtt {G}}(s)\).

    2. \((r_0,r_1, \ldots , r_l) = {\mathtt {G}}'(\check{r})\).

    3. For \(i\in [0,l]\), \(k_{i} = \mathsf {WPRF}_0.{\mathtt {KeyGen}}(PP_0;r_i)\).

    4. \((td,(\bar{w}_{i})_{i\in [N]})=\mathsf {FC}.{\mathtt {Gen}}(1^{\lambda };R_{\mathsf {FC}})\).

    5. \(\sigma = \mathsf {SIG}.{\mathtt {Sign}}(SK,\check{r};R_{\mathsf {SIG}})\).

    6. \(ct=\mathsf {PKE}.{\mathtt {Enc}}(PK,td \Vert \check{r} \Vert \sigma ;R_{\mathsf {PKE}})\).

    7. \((r'_0,(r'_{i,\iota })_{i\in [l],\iota \in \{0,1\}})=\mathsf {F}.{\mathtt {Eval}}(K,{\mathtt {G}}''(\hat{r}))\).

    8. \({\mathtt {W}}_0 = \mathsf {WPRF}_0.{\mathtt {Mark}}(PP_0,MK_0,k_0,ct;r'_0)\).

    9. For \(i\in [l]\):

       (a) \(b_i=\bar{w}_{msg}[i]\).

       (b) \({\mathtt {W}}_{i} = \mathsf {WPRF}_0.{\mathtt {Mark}}(PP_0,MK_0,k_{i},b_i;r'_{i,b_i})\).

    10. Output a circuit \({\mathtt {C}}: \{0,1\}^{n} \rightarrow \{0,1\}^{(l+1)m}\) s.t. \({\mathtt {C}}(x)=({\mathtt {W}}_{i}(x))_{i\in [0,l]} \).

  • \({\mathtt {Extract}}\). On input the public parameter \(PP=(PP_0,VK,PK)\), the extraction key \(EK=(EK_0,DK)\), and a circuit \({\mathtt {C}}\), the extraction algorithm proceeds as follows:

    1. Set the circuit \({\mathtt {W}}_0: \{0,1\}^{n} \rightarrow \{0,1\}^{m}\) as \({\mathtt {W}}_0(x)={\mathtt {C}}(x)[1:m]\).

    2. \(ct=\mathsf {WPRF}_0.{\mathtt {Extract}}(PP_0,EK_0,{\mathtt {W}}_0)\).

    3. If \(ct=\perp \), output \(\perp \).

    4. \((td\Vert \check{r}\Vert \sigma ) = \mathsf {PKE}.{\mathtt {Dec}}(DK,ct)\).

    5. If \((td\Vert \check{r}\Vert \sigma )=\perp \), output \(\perp \).

    6. If \(\mathsf {SIG}.{\mathtt {Verify}}(VK,\check{r},\sigma )=0\), output \(\perp \).

    7. \((r_0,r_1, \ldots , r_l) = {\mathtt {G}}'(\check{r})\).

    8. For \(i\in [0,l]\), \(k_{i} = \mathsf {WPRF}_0.{\mathtt {KeyGen}}(PP_0;r_i)\).

    9. \(A=0\).

    10. For \(j\in [t]\):

        (a) Sample \(x \overset{\$}{\leftarrow }\{0,1\}^n\).

        (b) If \({\mathtt {C}}(x)\not =(\mathsf {WPRF}_0.{\mathtt {Eval}}(PP_0,k_i,x))_{i\in [0,l]}\), set \(A=A+1\).

    11. If \(A>t \cdot \bar{\epsilon }\), output \(\perp \).

    12. For \(i\in [l]\):

        (a) \(a=im+1\), \(b=(i+1)m\).

        (b) Set the circuit \({\mathtt {W}}_i: \{0,1\}^{n} \rightarrow \{0,1\}^{m}\) as \({\mathtt {W}}_i(x)={\mathtt {C}}(x)[a:b]\).

        (c) \(w[i]=\mathsf {WPRF}_0.{\mathtt {Extract}}(PP_0,EK_0,{\mathtt {W}}_i)\).

        (d) If \(w[i]\not \in \{0,1\}\), set \(w[i]=0\).

    13. \(msg \leftarrow \mathsf {FC}.{\mathtt {Dec}}(td,w)\).

    14. Output msg.
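The index bookkeeping in Steps 1 and 12 slices the \((l+1)m\)-bit output of \({\mathtt {C}}\) into \(l+1\) consecutive m-bit blocks (1-indexed, inclusive); the following quick sanity check, with toy values of l and m chosen by us, verifies that the blocks tile \([1,(l+1)m]\):

```python
# Block 0 is C(x)[1:m]; block i (i >= 1) is C(x)[i*m+1:(i+1)*m], 1-indexed inclusive.
l, m = 4, 3
blocks = [(1, m)] + [(i * m + 1, (i + 1) * m) for i in range(1, l + 1)]

assert blocks[0] == (1, 3)
assert blocks[-1] == (l * m + 1, (l + 1) * m)
# consecutive blocks meet with no gaps or overlaps
assert all(blocks[i + 1][0] == blocks[i][1] + 1 for i in range(l))
```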

Theorem 4.1

If \(\mathsf {WPRF}_0\) is a single key secure watermarkable PRF family, \(\mathsf {FC}\) is a secure fingerprinting code that is adaptively secure with \(q+1\) extraction queries as defined in Sect. 3, \(\mathsf {PKE}\) is a CCA secure PKE scheme, \(\mathsf {SIG}\) is a secure signature scheme, \({\mathtt {G}}, {\mathtt {G}}', {\mathtt {G}}''\) are secure pseudorandom generators, and \(\mathsf {F}\) is a secure pseudorandom function family, then \(\mathsf {WPRF}\) is a secure watermarkable PRF family with collusion resistant security. In particular:

  • If \(\mathsf {WPRF}_0\) has (weak) pseudorandomness against the watermarking authority, then \(\mathsf {WPRF}\) also has (weak) pseudorandomness against the watermarking authority.

  • If \(\mathsf {WPRF}_0\) is single key-\((X,Y)\)-\(\epsilon '\)-unremovable, where \(X \in \{-,\text {MO},\text {PM}\}\) and \(Y \in \{-,\text {EO}\}\), then \(\mathsf {WPRF}\) is fully collusion resistant-\((X,Y')\)-\(\epsilon \)-unremovable, where \(Y'=q\text {-EO}\) if \(Y=\text {EO}\) and \(Y'=-\) otherwise.

  • \(\mathsf {WPRF}\) is \((\text {MO},\text {PE})\)-\(\epsilon '\)-unforgeable.

We present the proof of Theorem 4.1 later in this section, which includes proofs of the correctness and pseudorandomness (Sect. 4.4), the unremovability (Sect. 4.5), and the unforgeability (Sect. 4.6) of \(\mathsf {WPRF}\).

4.3 The Instantiations

In this section, we show how to instantiate our construction by employing existing watermarkable PRFs from standard assumptions [KW17, QWZ18, KW19]. Note that all of them can be instantiated from standard lattice assumptions, which can be further reduced to the worst-case hardness of the appropriately parameterized GapSVP problem. Therefore, the watermarking schemes provided in this work also rely on the worst-case hardness of the GapSVP problem.

Instantiating from [KW17]. The scheme in [KW17] can achieve a single key-\((\text {MO},-)\)-\(\epsilon '\)-unremovability and a \((\text {MO},-)\)-\(\delta '\)-unforgeability, where \(\epsilon '\) is negligible in \(\lambda \) and \(\delta '=1/poly(\lambda )\). Besides, the scheme has pseudorandomness against the watermarking authority.

Unfortunately, the scheme cannot be used in our general construction directly. This is because in our construction, \(\epsilon '\) is required to be significantly larger than \(\bar{\epsilon }\), where \(1/\bar{\epsilon }=poly(\lambda )\). Nonetheless, this requirement (i.e., that \(\epsilon ' - \bar{\epsilon }\) is large) is only needed when proving unremovability against an adversary that can query the extraction oracle. Since the scheme in [KW17] does not achieve security with extraction queries, we do not need to argue this during the upgrading, so we can still instantiate \(\mathsf {WPRF}_0\) with the scheme. Formally, we have:

Corollary 4.1

Assuming the worst-case hardness of appropriately parameterized GapSVP problem, there exist watermarkable PRF families with fully collusion resistant-\((\text {MO},-)\)-\({\epsilon }\)-unremovability, \((\text {MO},\text {PE})\)-\({\delta }\)-unforgeability, and pseudorandomness against the watermarking authority, where \(\epsilon =negl(\lambda )\) and \(\delta =1/poly(\lambda )\).

Instantiating from [QWZ18]. The scheme in [QWZ18] can achieve a single key-\((\text {PM},\text {EO})\)-\(\epsilon '\)-unremovability, where \(\epsilon '=1/2-1/poly(\lambda )\). When instantiating our construction with this scheme, we have:

Corollary 4.2

Assuming the worst-case hardness of appropriately parameterized GapSVP problem, there exist watermarkable PRF families with fully collusion resistant-\((\text {PM},\text {bounded EO})\)-\({\epsilon }\)-unremovability and \((\text {MO},\text {PE})\)-\({\delta }\)-unforgeability, where \(\epsilon =\delta -1/poly(\lambda )\) and \(\delta =1/2-1/poly(\lambda )\).

Instantiating from [KW19]. The scheme provided in [KW19] has single key-\((\text {PM},\text {EO})\)-\(\epsilon '\)-unremovability and weak pseudorandomness against the watermarking authority, where \(\epsilon '=1/2-1/poly(\lambda )\). When instantiating our construction with this scheme, we have:

Corollary 4.3

Assuming the worst-case hardness of appropriately parameterized GapSVP problem, there exist watermarkable PRF families with fully collusion resistant-\((\text {PM},\text {bounded EO})\)-\({\epsilon }\)-unremovability, \((\text {MO},\text {PE})\)-\({\delta }\)-unforgeability, and weak pseudorandomness against the watermarking authority, where \(\epsilon =\delta -1/poly(\lambda )\) and \(\delta =1/2-1/poly(\lambda )\).

4.4 Correctness and Pseudorandomness of \(\mathsf {WPRF}\)

Functionality Preserving. The functionality preserving property comes from the functionality preserving property of \(\mathsf {WPRF}_0\) and the pseudorandomness of \({\mathtt {G}},{\mathtt {G}}',{\mathtt {G}}'',\mathsf {F}\) directly.

Note that if \(\mathsf {WPRF}_0\) has functionality preserving against adversarially-chosen keys (achieved in [QWZ18, KW19]), \(\mathsf {WPRF}\) also has this stronger correctness property. Besides, even if \(\mathsf {WPRF}_0\) does not satisfy it, \(\mathsf {WPRF}\) can still achieve functionality preserving against adversarially-chosen keys if the outputs of \({\mathtt {G}}\) are “random” enough (e.g., if \({\mathtt {G}}\) is modeled as a random oracle).

Extraction Correctness. The extraction correctness comes from the extraction correctness of \(\mathsf {WPRF}_0\), the correctness of \(\mathsf {PKE}\), the correctness of \(\mathsf {SIG}\), the functionality preserving property of \(\mathsf {WPRF}_0\), the correctness of \(\mathsf {FC}\), and the pseudorandomness of \({\mathtt {G}},{\mathtt {G}}',{\mathtt {G}}'',\mathsf {F}\) directly.

Watermarking Meaningfulness. The watermarking meaningfulness comes from the watermarking meaningfulness of \(\mathsf {WPRF}_0\) directly.

Pseudorandomness. The pseudorandomness comes from the pseudorandomness of \(\mathsf {WPRF}_0\) and the pseudorandomness of \({\mathtt {G}},{\mathtt {G}}'\) by a direct reduction.

(Weak) Pseudorandomness Against the Watermarking Authority. The (weak) pseudorandomness against the watermarking authority comes from the (weak) pseudorandomness against the watermarking authority of \(\mathsf {WPRF}_0\) and the pseudorandomness of \({\mathtt {G}},{\mathtt {G}}'\) by a direct reduction.

4.5 Unremovability of \(\mathsf {WPRF}\)

In this section, we prove the fully collusion resistant-\((\text {PM},\text {bounded EO})\)-\(\epsilon \)-unremovability of \(\mathsf {WPRF}\), assuming that \(\mathsf {WPRF}_0\) is single key-\((\text {PM},(l+1)q\text {-EO})\)-\(\epsilon '\)-unremovable. For simplicity, here we only provide the detailed proof for this setting. The proofs are similar in the remaining cases, and at the end of this section, we also discuss how to deal with a few subtle issues in the proofs when the adversary is given access to the marking oracle instead of the mark key.

First, we define the following games between a challenger and a PPT \(\epsilon \)-unremoving-admissible adversary \(\mathcal {A}\):

  • Game 0. This is the real experiment \(\mathtt {ExptUR}\) with some purely conceptual changes. More precisely, the challenger proceeds as follows.

    I.:

    First, the challenger generates \((PP_0,MK_0,EK_0) \leftarrow \mathsf {WPRF}_0.{\mathtt {Setup}}(1^{\lambda })\), \((VK,SK) \leftarrow \mathsf {SIG}.{\mathtt {KeyGen}}(1^{\lambda })\), \((PK,DK) \leftarrow \mathsf {PKE}.{\mathtt {KeyGen}}(1^{\lambda })\), and \(K \leftarrow \mathsf {F}.{\mathtt {KeyGen}}(1^{\lambda })\). Then, it returns the public parameter \(PP=(PP_0,VK,PK)\) and the mark key \(MK=(MK_0,SK,K)\) to \(\mathcal {A}\).

    II.:

    Then the challenger samples the challenge key \(s^* \overset{\$}{\leftarrow }\{0,1\}^{\lambda }\) and generates some variables (determined by \(s^*\)), which are used in answering the challenge oracle:

    1.:

    \((\check{r}^*, \hat{r}^*,R_{\mathsf {FC}}^*,R_{\mathsf {SIG}}^*,R_{\mathsf {PKE}}^*) = {\mathtt {G}}(s^*)\).

    2.:

    \((r^*_0,r^*_1, \ldots , r^*_l) = {\mathtt {G}}'(\check{r}^*)\).

    3.:

    For \(i\in [0,l]\), \(k^*_{i} = \mathsf {WPRF}_0.{\mathtt {KeyGen}}(PP_0;r^*_i)\).

    4.:

    \((td^*,(\bar{w}^*_{i})_{i\in [N]})=\mathsf {FC}.{\mathtt {Gen}}(1^{\lambda };R_{\mathsf {FC}}^*)\).

    5.:

    \(\sigma ^* = \mathsf {SIG}.{\mathtt {Sign}}(SK,\check{r}^*;R_{\mathsf {SIG}}^*)\).

    6.:

    \(ct^*=\mathsf {PKE}.{\mathtt {Enc}}(PK,td^* \Vert \check{r}^* \Vert \sigma ^*;R_{\mathsf {PKE}}^*)\).

    7.:

    \((r^{\prime *}_0,(r^{\prime *}_{i,\iota })_{i\in [l],\iota \in \{0,1\}})=\mathsf {F}.{\mathtt {Eval}}(K,{\mathtt {G}}''(\hat{r}^{*}))\).

    III.:

    Next the challenger answers \(\mathcal {A}\)’s oracle queries, including the extraction oracle queries and the challenge oracle queries. Once \(\mathcal {A}\) submits a circuit \({\mathtt {C}}\) to the extraction oracle, the challenger proceeds as follows:

    1.:

    Set the circuit \({\mathtt {W}}_0: \{0,1\}^{n} \rightarrow \{0,1\}^{m}\) as \({\mathtt {W}}_0(x)={\mathtt {C}}(x)[1:m]\).

    2.:

    \(ct=\mathsf {WPRF}_0.{\mathtt {Extract}}(PP_0,EK_0,{\mathtt {W}}_0)\).

    3.:

    If \(ct=\perp \), return \(\perp \) to \(\mathcal {A}\).

    4.:

    \((td\Vert \check{r}\Vert \sigma ) = \mathsf {PKE}.{\mathtt {Dec}}(DK,ct)\).

    5.:

    If \((td\Vert \check{r}\Vert \sigma )=\perp \), return \(\perp \) to \(\mathcal {A}\).

    6.:

    If \(\mathsf {SIG}.{\mathtt {Verify}}(VK,\check{r},\sigma )=0\), return \(\perp \) to \(\mathcal {A}\).

    7.:

    \((r_0,r_1, \ldots , r_l) = {\mathtt {G}}'(\check{r})\).

    8.:

    For \(i\in [0,l]\), \(k_{i} = \mathsf {WPRF}_0.{\mathtt {KeyGen}}(PP_0;r_i)\).

    9.:

    \(A=0\).

    10.:

    For \(j\in [t]\):

    (a):

    Sample \(x \overset{\$}{\leftarrow }\{0,1\}^n\).

    (b):

    If \({\mathtt {C}}(x)\not =(\mathsf {WPRF}_0.{\mathtt {Eval}}(PP_0,k_i,x))_{i\in [0,l]}\), \(A=A+1\).

    11.:

    If \(A>t \cdot \bar{\epsilon }\), return \(\perp \) to \(\mathcal {A}\).

    12.:

    For \(i\in [l]\):

    (a):

    \(a=im+1\), \(b=(i+1)m\).

    (b):

    Set the circuit \({\mathtt {W}}_i: \{0,1\}^{n} \rightarrow \{0,1\}^{m}\) as \({\mathtt {W}}_i(x)={\mathtt {C}}(x)[a:b]\).

    (c):

    \(w[i]=\mathsf {WPRF}_0.{\mathtt {Extract}}(PP_0,EK_0,{\mathtt {W}}_i)\).

    (d):

    If \(w[i]\not \in \{0,1\}\), \(w[i]=0\).

    13.:

    \(msg \leftarrow \mathsf {FC}.{\mathtt {Dec}}(td,w)\).

    14.:

    Return msg to \(\mathcal {A}\).

    Also, for the h-th challenge oracle query with message \(msg^*_h\), the challenger generates the circuit \({\mathtt {C}}^*_h\) as follows and returns it back to the adversary.

    1.:

    \({\mathtt {W}}^{*}_{h,0} = \mathsf {WPRF}_0.{\mathtt {Mark}}(PP_0,MK_0,k^{*}_0,ct^{*};r^{\prime *}_0)\).

    2.:

    For \(i\in [l]\):

    (a):

    \(b^{*}_{h,i}=\bar{w}^{*}_{msg^*_h}[i]\).

    (b):

    \({\mathtt {W}}^{*}_{h,i} = \mathsf {WPRF}_0.{\mathtt {Mark}}(PP_0,MK_0,k^{*}_{i},b^{*}_{h,i};r^{\prime *}_{i,b^{*}_{h,i}})\).

    3.:

    Set the circuit \({\mathtt {C}}^{*}_h\) as \({\mathtt {C}}^*_h(x)=({\mathtt {W}}^{*}_{h,i}(x))_{i\in [0,l]}\).

    IV.:

    Finally, \(\mathcal {A}\) outputs a circuit \(\tilde{{\mathtt {C}}}\) and the challenger checks if \(\mathcal {A}\) succeeds in attacking the unremovability of \(\mathsf {WPRF}\) as follows. Here, we use \(\mathsf {M}\) to denote the set of all messages submitted to the challenge oracle.

    1.:

    Set the circuit \({\mathtt {W}}_0: \{0,1\}^{n} \rightarrow \{0,1\}^{m}\) as \({\mathtt {W}}_0(x)=\tilde{{\mathtt {C}}}(x)[1:m]\).

    2.:

    \(ct=\mathsf {WPRF}_0.{\mathtt {Extract}}(PP_0,EK_0,{\mathtt {W}}_0)\).

    3.:

    If \(ct=\perp \), output 1.

    4.:

    \((td\Vert \check{r}\Vert \sigma ) = \mathsf {PKE}.{\mathtt {Dec}}(DK,ct)\).

    5.:

    If \((td\Vert \check{r}\Vert \sigma )=\perp \), output 1.

    6.:

    If \(\mathsf {SIG}.{\mathtt {Verify}}(VK,\check{r},\sigma )=0\), output 1.

    7.:

    \((r_0,r_1, \ldots , r_l) = {\mathtt {G}}'(\check{r})\).

    8.:

    For \(i\in [0,l]\), \(k_{i} = \mathsf {WPRF}_0.{\mathtt {KeyGen}}(PP_0;r_i)\).

    9.:

    \(A=0\).

    10.:

    For \(j\in [t]\):

    (a):

    Sample \(x \overset{\$}{\leftarrow }\{0,1\}^n\).

    (b):

    If \(\tilde{{\mathtt {C}}}(x)\not =(\mathsf {WPRF}_0.{\mathtt {Eval}}(PP_0,k_i,x))_{i\in [0,l]}\), \(A=A+1\).

    11.:

    If \(A>t \cdot \bar{\epsilon }\), output 1.

    12.:

    For \(i\in [l]\):

    (a):

    \(a=im+1\), \(b=(i+1)m\).

    (b):

    Set the circuit \({\mathtt {W}}_i: \{0,1\}^{n} \rightarrow \{0,1\}^{m}\) as \({\mathtt {W}}_i(x)=\tilde{{\mathtt {C}}}(x)[a:b]\).

    (c):

    \(w[i]=\mathsf {WPRF}_0.{\mathtt {Extract}}(PP_0,EK_0,{\mathtt {W}}_i)\).

    (d):

    If \(w[i]\not \in \{0,1\}\), \(w[i]=0\).

    13.:

    \(msg \leftarrow \mathsf {FC}.{\mathtt {Dec}}(td,w)\).

    14.:

    If \(msg\not \in \mathsf {M}\), output 1.

    15.:

    Output 0.

  • Game 1. This is identical to Game 0 except that in Step II, the challenger samples \((\check{r}^*, R_{\mathsf {FC}}^*,R_{\mathsf {SIG}}^*,R_{\mathsf {PKE}}^*, r^{\prime *}_0,(r^{\prime *}_{i,\iota })_{i\in [l],\iota \in \{0,1\}})\) uniformly at random instead of computing them using pseudorandom generators and pseudorandom functions.

  • Game 2. This is identical to Game 1 except that the challenger changes the way to answer extraction oracle queries. In particular, after receiving a circuit \({\mathtt {C}}\) and extracting the ciphertext ct from the first part of \({\mathtt {C}}\), the challenger works as follows (instead of continuing the extraction procedure defined above) if \(ct=ct^*\):

    1.:

    \(A=0\).

    2.:

    For \(j\in [t]\):

    (a):

    Sample \(x \overset{\$}{\leftarrow }\{0,1\}^n\).

    (b):

    If \({\mathtt {C}}(x)\not ={\mathtt {C}}^*_1(x)\), \(A=A+1\).

    3.:

    If \(A>t \cdot \bar{\epsilon }\), return \(\perp \) to \(\mathcal {A}\).

    4.:

    For \(i\in [l]\):

    (a):

    \(a=im+1\), \(b=(i+1)m\).

    (b):

    Set the circuit \({\mathtt {W}}_i: \{0,1\}^{n} \rightarrow \{0,1\}^{m}\) as \({\mathtt {W}}_i(x)={\mathtt {C}}(x)[a:b]\).

    (c):

    \(w[i]=\mathsf {WPRF}_0.{\mathtt {Extract}}(PP_0,EK_0,{\mathtt {W}}_i)\).

    (d):

    If \(w[i]\not \in \{0,1\}\), \(w[i]=0\).

    5.:

    \(msg \leftarrow \mathsf {FC}.{\mathtt {Dec}}(td^*,w)\).

    6.:

    Return msg to \(\mathcal {A}\).

  • Game 3. This is identical to Game 2 except that in Step IV, after extracting the ciphertext ct, it works as follows (instead of continuing the extraction procedure defined above) if \(ct=ct^*\):

    1.:

    For \(i\in [l]\):

    (a):

    \(a=im+1\), \(b=(i+1)m\).

    (b):

    Set the circuit \({\mathtt {W}}_i: \{0,1\}^{n} \rightarrow \{0,1\}^{m}\) as \({\mathtt {W}}_i(x)=\tilde{{\mathtt {C}}}(x)[a:b]\).

    (c):

    \(w[i]=\mathsf {WPRF}_0.{\mathtt {Extract}}(PP_0,EK_0,{\mathtt {W}}_i)\).

    (d):

    If \(w[i]\not \in \{0,1\}\), \(w[i]=0\).

    2.:

    \(msg \leftarrow \mathsf {FC}.{\mathtt {Dec}}(td^*,w)\).

    3.:

    If \(msg\not \in \mathsf {M}\), output 1.

    4.:

    Output 0.

  • Game 4. This is identical to Game 3 except that the challenger changes the way to generate \(ct^*\). In particular, it computes \(ct^* \leftarrow \mathsf {PKE}.{\mathtt {Enc}}(PK,0)\).

  • Game 5. This is identical to Game 4 except that the challenger samples \((r^*_0,r^*_1, \ldots , r^*_l) \) uniformly at random instead of setting them as \((r^*_0,r^*_1, \ldots , r^*_l) = {\mathtt {G}}'(\check{r}^*)\). Note that in Game 5, each \({\mathtt {C}}^{*}_h\) is set as

    $$ {\mathtt {C}}^*_h(x)={\mathtt {W}}^{*}_0(x) \Vert ({\mathtt {W}}^{*}_{i,\bar{w}^{*}_{msg^*_h}[i]}(x))_{i\in [1,l]} $$

    where \({\mathtt {W}}^{*}_0, \{{\mathtt {W}}^{*}_{i,j}\}_{i\in [l],j\in \{0,1\}}\) are generated as follows in Step II:

    1.:

    For \(i\in [0,l]\), \(k^*_{i} \leftarrow \mathsf {WPRF}_0.{\mathtt {KeyGen}}(PP_0)\).

    2.:

    \((td^*,(\bar{w}^*_{i})_{i\in [N]}) \leftarrow \mathsf {FC}.{\mathtt {Gen}}(1^{\lambda })\).

    3.:

    \(ct^* \leftarrow \mathsf {PKE}.{\mathtt {Enc}}(PK,0)\).

    4.:

    \({\mathtt {W}}^{*}_{0} \leftarrow \mathsf {WPRF}_0.{\mathtt {Mark}}(PP_0,MK_0,k^{*}_0,ct^{*})\).

    5.:

    For \(i\in [l]\):

    (a):

    \({\mathtt {W}}^{*}_{i,0} \leftarrow \mathsf {WPRF}_0.{\mathtt {Mark}}(PP_0,MK_0,k^{*}_{i},0)\).

    (b):

    \({\mathtt {W}}^{*}_{i,1} \leftarrow \mathsf {WPRF}_0.{\mathtt {Mark}}(PP_0,MK_0,k^{*}_{i},1)\).

  • Game 6. This is identical to Game 5 except that in Step IV, after extracting the ciphertext ct, the challenger aborts the experiment and outputs 2 if \(ct \not = ct^*\).

  • Game 7. This is identical to Game 6 except that when

    • answering an extraction oracle query with extracted ciphertext \(ct=ct^*\),

    • performing the final check in Step IV,

    the challenger aborts and outputs 2 if the extracted word w satisfies

    $$\exists i\in [l], b\in \{0,1\}: w[i]\not =b \wedge \forall msg\in \mathsf {M},\bar{w}^*_{msg}[i]=b$$

    We call this event \( Bad \). Here, we abuse notation and let \(\mathsf {M}\) denote the set of all messages submitted to the challenge oracle before the event occurs.
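
For concreteness, the condition defining \( Bad \) can be restated as a small predicate. The following Python sketch uses 0-based indices and illustrative names (w is the extracted word, queried_codewords the codewords \(\bar{w}^*_{msg}\) for \(msg\in \mathsf {M}\)):

```python
# Sketch of the Bad event of Game 7 (0-based indices, illustrative names):
# position i is "bad" if every codeword queried so far agrees on some bit b
# there, yet the extracted word w disagrees with b at that position.

def bad_event(w, queried_codewords):
    for i in range(len(w)):
        bits = {cw[i] for cw in queried_codewords}
        if len(bits) == 1 and w[i] not in bits:
            return True
    return False
```

Intuitively, a position on which all challenge circuits embed the same bit should survive into any sufficiently close circuit, so a disagreement there signals an anomaly and the experiment aborts.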

Next, we prove the indistinguishability of each consecutive pair of games defined above and show that the adversary \(\mathcal {A}\) wins in the final game (Game 7) with only negligible probability. For simplicity of notation, we use \(\mathcal {E}_i\) to denote the output of Game i.

Lemma 4.1

\(\mid \Pr [\mathcal {E}_0=1] - \Pr [\mathcal {E}_1=1] \mid \le negl(\lambda )\).

Proof

In Game 1, some random variables are sampled uniformly instead of being set as outputs of pseudorandom generators. As the PRG seeds \(s^*\) and \(\hat{r}^*\) do not appear in the view of \(\mathcal {A}\) directly, the indistinguishability between Game 0 and Game 1 comes from the pseudorandomness of \({\mathtt {G}},{\mathtt {G}}''\).    \(\square \)

Lemma 4.2

\(\mid \Pr [\mathcal {E}_1=1] - \Pr [\mathcal {E}_2=1] \mid \le negl(\lambda )\).

Proof

Game 1 and Game 2 are identical as long as

  1.:

    \((td^*\Vert \check{r}^*\Vert \sigma ^*) = \mathsf {PKE}.{\mathtt {Dec}}(DK,ct^*)\).

  2.:

    \(\mathsf {SIG}.{\mathtt {Verify}}(VK,\check{r}^*,\sigma ^*)=1\).

  3.:

    For every tested input x, \({\mathtt {C}}_1^*(x)=(\mathsf {WPRF}_0.{\mathtt {Eval}}(PP_0,k^*_i,x))_{i\in [0,l]}\).

The first two conditions are satisfied (with all but negligible probability) due to the correctness of \(\mathsf {PKE}\) and the correctness of \(\mathsf {SIG}\), respectively. The last condition follows from the functionality preserving property (against an honestly generated key) of \(\mathsf {WPRF}\), which guarantees that for a uniform x, \({\mathtt {C}}_1^*(x)\not =(\mathsf {WPRF}_0.{\mathtt {Eval}}(PP_0,k^*_i,x))_{i\in [0,l]}\) holds with only negligible probability.    \(\square \)

Lemma 4.3

\(\mid \Pr [\mathcal {E}_2=1] - \Pr [\mathcal {E}_3=1] \mid \le negl(\lambda )\).

Proof

The proof of Lemma 4.3 is similar to that of Lemma 4.2. Note that as \(\mathcal {A}\) is \(\epsilon \)-unremoving-admissible, there exists \(\tilde{i}\in [Q]\) s.t.

$$|\{x \in \{0,1\}^n:{\mathtt {C}}^*_{\tilde{i}}(x) \not = \tilde{{\mathtt {C}}}(x) \} | \le \epsilon \cdot 2^n$$

Also, by the functionality preserving property (against an honest key) of \(\mathsf {WPRF}\),

$$|\{x \in \{0,1\}^n:{\mathtt {C}}^*_{\tilde{i}}(x) \not = (\mathsf {WPRF}_0.{\mathtt {Eval}}(PP_0,k^*_i,x))_{i\in [0,l]} \} | \le negl(\lambda ) \cdot 2^n$$

So, we have

$$|\{x \in \{0,1\}^n:\tilde{{\mathtt {C}}}(x) \not = (\mathsf {WPRF}_0.{\mathtt {Eval}}(PP_0,k^*_i,x))_{i\in [0,l]} \} | \le (\epsilon +negl(\lambda )) \cdot 2^n$$

By the Chernoff bound,

$$\Pr [A \ge t \cdot \bar{\epsilon }] \le e^{-\frac{\lambda }{60}}$$

Therefore, the output is unaffected even if the challenger does not check whether \(\tilde{{\mathtt {C}}}\) is close to \((\mathsf {WPRF}_0.{\mathtt {Eval}}(PP_0,k^*_i,x))_{i\in [0,l]}\).    \(\square \)
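
The Chernoff step can be checked numerically. The sketch below computes the exact upper tail of a binomial counter; the concrete parameter values are assumptions for illustration, since the excerpt does not fix \(t\) or \(\bar{\epsilon }\):

```python
# Exact upper tail Pr[A >= k] for A ~ Binomial(t, p): here p plays the role
# of the true disagreement rate (epsilon + negl) and k = t * eps_bar. The
# parameter values below are illustrative, not taken from the paper.
from math import comb

def binom_upper_tail(t, p, k):
    """Exact Pr[A >= k] for A ~ Binomial(t, p)."""
    return sum(comb(t, j) * (p ** j) * ((1 - p) ** (t - j))
               for j in range(k, t + 1))

t, p, eps_bar = 1000, 0.10, 0.25   # illustrative: eps_bar well above p
tail = binom_upper_tail(t, p, int(t * eps_bar))
# tail is astronomically small, matching the e^{-lambda/60}-type estimate.
```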

Lemma 4.4

\(\mid \Pr [\mathcal {E}_3=1] - \Pr [\mathcal {E}_4=1] \mid \le negl(\lambda )\).

Proof

Indistinguishability between Game 3 and Game 4 comes from the CCA-security of \(\mathsf {PKE}\) by a direct reduction. Note that the reduction can answer the extraction oracle queries and perform the check in Step IV by querying its decryption oracle, and in both cases it is not required to decrypt the challenge ciphertext \(ct^*\).    \(\square \)

Lemma 4.5

\(\mid \Pr [\mathcal {E}_4=1] - \Pr [\mathcal {E}_5=1] \mid \le negl(\lambda )\).

Proof

As the PRG seed \(\check{r}^*\) is not used in any other part of the experiment, indistinguishability between Game 4 and Game 5 comes from the pseudorandomness of \({\mathtt {G}}'\) directly.    \(\square \)

Lemma 4.6

\(\mid \Pr [\mathcal {E}_5=1] - \Pr [\mathcal {E}_6=1] \mid \le negl(\lambda )\).

Proof

Indistinguishability between Game 5 and Game 6 comes from the single key-\((\text {PM},(l+1)q\text {-EO})\)-\(\epsilon '\)-unremovability of \(\mathsf {WPRF}_0\).

More precisely, if the adversary is able to generate a circuit \(\tilde{{\mathtt {C}}}\) such that \({\mathtt {W}}_0(\cdot ) = \tilde{{\mathtt {C}}}(\cdot )[1:m]\) is marked with a message not equal to \(ct^*\) (with a non-negligible probability), then we can construct an adversary \(\mathcal {B}\) that breaks the single key-\((\text {PM},(l+1)q\text {-EO})\)-\(\epsilon '\)-unremovability of \(\mathsf {WPRF}_0\).

In particular, the adversary \(\mathcal {B}\) sets the circuit \({\mathtt {W}}^*_0\) as its challenge, which is obtained by submitting \(ct^*\) to its challenge oracle. Moreover, \(\mathcal {B}\) can answer extraction oracle queries via querying its own extraction oracle. Also, using the mark key returned from its mark key oracle, it can answer the mark key oracle query from \(\mathcal {A}\) and generate \(\{{\mathtt {W}}^{*}_{i,b}\}_{i\in [l],b\in \{0,1\}}\) when answering the challenge oracle. Finally, \(\mathcal {B}\) submits \({\mathtt {W}}_0(\cdot ) = \tilde{{\mathtt {C}}}(\cdot )[1:m]\) to its challenger. Note that \(\mathcal {B}\) is \(\epsilon '\)-unremoving-admissible since \(\mathcal {A}\) is \(\epsilon \)-unremoving-admissible, which ensures that \(|\{x \in \{0,1\}^n:{\mathtt {W}}_0(x) \not = {\mathtt {W}}^*_0(x) \} | \le \epsilon \cdot 2^n\).    \(\square \)

Lemma 4.7

\(\mid \Pr [\mathcal {E}_6=1] - \Pr [\mathcal {E}_7=1] \mid \le negl(\lambda )\).

Proof

Indistinguishability between Game 6 and Game 7 comes from the single key-\((\text {PM},(l+1)q\text {-EO})\)-\(\epsilon '\)-unremovability of \(\mathsf {WPRF}_0\) by a hybrid argument and the reductions are similar to the reduction provided in the proof of Lemma 4.6.

Note that for \(i\in [l]\), if \(\bar{w}^*_{msg^*_1}[i]=\bar{w}^*_{msg^*_2}[i]\) for all \(msg^*_1,msg^*_2 \in \mathsf {M}\), then the adversary \(\mathcal {A}\) can only obtain one marked circuit for \(k^*_i\), and thus single key security of \(\mathsf {WPRF}_0\) is enough. Also, for \(i\in [l]\):

  • If \( Bad \) occurs at Step IV: Let \({\mathtt {W}}_i(\cdot )=\tilde{{\mathtt {C}}}(\cdot )[im+1:(i+1)m]\). Then \(|\{x \in \{0,1\}^n:{\mathtt {W}}_i(x) \not = {\mathtt {W}}^*_i(x) \} | \le \epsilon \cdot 2^n\) due to the fact that \(\mathcal {A}\) is \(\epsilon \)-unremoving-admissible.

  • If \( Bad \) occurs at an extraction oracle query: Let \({\mathtt {W}}_i(\cdot )={\mathtt {C}}(\cdot )[im+1:(i+1)m]\), where \({\mathtt {C}}\) is the circuit submitted to the extraction oracle. If \(|\{x \in \{0,1\}^n:{\mathtt {W}}_i(x) \not = {\mathtt {W}}^*_i(x) \} | \ge \epsilon ' \cdot 2^n\), then by the Chernoff bound, the probability that \({\mathtt {C}}\) can pass the check in Step 3 (in the new extraction procedure defined in Game 2) is negligible, i.e., the challenger is not able to recover a word w in this case.

Thus, the adversary \(\mathcal {B}\) is \(\epsilon '\)-unremoving-admissible.

   \(\square \)
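
The sub-circuit slicing used throughout these proofs, \({\mathtt {W}}_i(\cdot )={\mathtt {C}}(\cdot )[im+1:(i+1)m]\) in the paper's 1-based notation, can be sketched with 0-based Python slices (illustrative names):

```python
# Sketch of sub-circuit slicing: a circuit with (l+1)*m output bits is
# viewed as l+1 stacked m-bit circuits W_0, ..., W_l; the paper's 1-based
# range [i*m+1, (i+1)*m] becomes the 0-based slice [i*m : (i+1)*m].

def sub_circuit(C, i, m):
    """Return W_i, the circuit computing the i-th m-bit block of C's output."""
    return lambda x: C(x)[i * m:(i + 1) * m]
```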

Lemma 4.8

\(\Pr [\mathcal {E}_7=1] \le negl(\lambda )\).

Proof

Lemma 4.8 follows from the adaptive security of \(\mathsf {FC}\) with \((q+1)\) extraction queries by a direct reduction.    \(\square \)

Combining Lemma 4.1 to Lemma 4.8, we have \(\Pr [\mathcal {E}_0=1] \le negl(\lambda )\), i.e., the probability that \(\mathcal {A}\) wins in the real experiment \(\mathtt {ExptUR}\) is negligible. This completes the proof of unremovability.

The Proofs in the Remaining Cases. The above proof strategies (almost) work perfectly in the remaining cases. One subtle issue is that in the proofs of Lemma 4.6 and Lemma 4.7, the adversary \(\mathcal {B}\) for the single key security of \(\mathsf {WPRF}_0\) needs to simulate the marking oracle for \(\mathcal {A}\) via its own marking oracle. However, as the seed (\(\mathsf {WPRF}\)’s PRF key) is chosen by \(\mathcal {A}\), the security of the pseudorandom generator is not enough to ensure that the simulated marking oracle (which runs \(\mathsf {WPRF}_0.{\mathtt {Mark}}\) on fresh randomness) is indistinguishable from an honest marking oracle (which runs \(\mathsf {WPRF}_0.{\mathtt {Mark}}\) on randomness output by some pseudorandomness generation component). To solve this subtle issue, we employ a pseudorandom function \(\mathsf {F}\) to generate the randomness for \(\mathsf {WPRF}_0.{\mathtt {Mark}}\). Since the secret key K of \(\mathsf {F}\) is put in the mark key, which is not given to \(\mathcal {A}\) in this case, we can argue the indistinguishability of these two modes for answering marking oracle queries.
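
The fix described above can be sketched as follows, with HMAC-SHA256 standing in for the PRF \(\mathsf {F}\); the per-slot input encoding is an assumption made for illustration:

```python
# Sketch of deriving the marking randomness from a PRF: with the key K kept
# inside the mark key, the derived values are indistinguishable from fresh
# randomness to the adversary, while repeated marking queries on the same
# seed deterministically reuse the same randomness. HMAC-SHA256 stands in
# for F; the (i, b) slot encoding is illustrative.
import hmac, hashlib

def mark_randomness(K, expanded_seed, i, b):
    """Derive the randomness r'_{i,b} for WPRF_0.Mark from K and G''(r_hat)."""
    slot = i.to_bytes(4, "big") + bytes([b])
    return hmac.new(K, expanded_seed + slot, hashlib.sha256).digest()
```

Because the derivation is deterministic in the seed, the simulated marking oracle (fresh randomness) and the honest one (derived randomness) can be swapped under the pseudorandomness of \(\mathsf {F}\).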

4.6 Unforgeability of \(\mathsf {WPRF}\)

Next, we prove the unforgeability of \(\mathsf {WPRF}\). First, we define the following games between a challenger and a PPT \(\epsilon '\)-unforging-admissible adversary \(\mathcal {A}\):

  • Game 0. This is the real experiment \(\mathtt {ExptUF}\). More precisely, the challenger proceeds as follows.

    I.:

    First, the challenger generates \((PP_0,MK_0,EK_0) \leftarrow \mathsf {WPRF}_0.{\mathtt {Setup}}(1^{\lambda })\), \((VK,SK) \leftarrow \mathsf {SIG}.{\mathtt {KeyGen}}(1^{\lambda })\), \((PK,DK) \leftarrow \mathsf {PKE}.{\mathtt {KeyGen}}(1^{\lambda })\), and \(K \leftarrow \mathsf {F}.{\mathtt {KeyGen}}(1^{\lambda })\). Then, it returns the public parameter \(PP=(PP_0,VK,PK)\) to \(\mathcal {A}\).

    II.:

    Next, it answers \(\mathcal {A}\)’s oracle queries:

    \(\bullet \):

    If \(\mathcal {A}\) submits a query to the extraction key oracle, the challenger returns \(EK=(EK_0,DK)\) to \(\mathcal {A}\).

    \(\bullet \):

    If \(\mathcal {A}\) submits the h-th marking oracle query \((s^{h},msg^{h}) \in \{0,1\}^{\lambda } \times [N]\), the challenger returns \({\mathtt {C}}^{h} \leftarrow {\mathtt {Mark}}(PP,MK,s^{h},msg^{h})\) to \(\mathcal {A}\).

    III.:

    Finally, \(\mathcal {A}\) submits a circuit \(\tilde{{\mathtt {C}}}\) and the challenger proceeds as follows:

    1.:

    Set the circuit \({\mathtt {W}}_0: \{0,1\}^{n} \rightarrow \{0,1\}^{m}\) as \({\mathtt {W}}_0(x)=\tilde{{\mathtt {C}}}(x)[1:m]\).

    2.:

    \(ct=\mathsf {WPRF}_0.{\mathtt {Extract}}(PP_0,EK_0,{\mathtt {W}}_0)\).

    3.:

    If \(ct=\perp \), output 0.

    4.:

    \((td\Vert \check{r}\Vert \sigma ) = \mathsf {PKE}.{\mathtt {Dec}}(DK,ct)\).

    5.:

    If \((td\Vert \check{r}\Vert \sigma )=\perp \), output 0.

    6.:

    If \(\mathsf {SIG}.{\mathtt {Verify}}(VK,\check{r},\sigma )=0\), output 0.

    7.:

    \((r_0,r_1, \ldots , r_l) = {\mathtt {G}}'(\check{r})\).

    8.:

    For \(i\in [0,l]\), \(k_{i} = \mathsf {WPRF}_0.{\mathtt {KeyGen}}(PP_0;r_i)\).

    9.:

    \(A=0\).

    10.:

    For \(j\in [t]\):

    (a):

    Sample \(x \overset{\$}{\leftarrow }\{0,1\}^n\).

    (b):

    If \(\tilde{{\mathtt {C}}}(x)\not =(\mathsf {WPRF}_0.{\mathtt {Eval}}(PP_0,k_i,x))_{i\in [0,l]}\), \(A=A+1\).

    11.:

    If \(A>t \cdot \bar{\epsilon }\), output 0.

    12.:

    For \(i\in [l]\):

    (a):

    \(a=im+1\), \(b=(i+1)m\).

    (b):

    Set the circuit \({\mathtt {W}}_i: \{0,1\}^{n} \rightarrow \{0,1\}^{m}\) as \({\mathtt {W}}_i(x)=\tilde{{\mathtt {C}}}(x)[a:b]\).

    (c):

    \(w[i]=\mathsf {WPRF}_0.{\mathtt {Extract}}(PP_0,EK_0,{\mathtt {W}}_i)\).

    (d):

    If \(w[i]\not \in \{0,1\}\), \(w[i]=0\).

    13.:

    \(msg \leftarrow \mathsf {FC}.{\mathtt {Dec}}(td,w)\).

    14.:

    If \(msg=\perp \), output 0.

    15.:

    Output 1.

  • Game 1. This is identical to Game 0 except that in Step III.6, after checking if \(\sigma \) is a valid signature for \(\check{r}\), the challenger further checks if \(\check{r}\) has appeared. In particular, let Q be the number of marking oracle queries the adversary made and for \(h\in [Q]\), let \((\check{r}^{h}, \hat{r}^{h},R_{\mathsf {FC}}^{h},R_{\mathsf {SIG}}^{h},R_{\mathsf {PKE}}^{h}) = {\mathtt {G}}(s^{h})\), then the challenger outputs 0 if \(\forall h\in [Q], \check{r}\not =\check{r}^{h}\).

Game 0 and Game 1 are identical unless \(\mathsf {SIG}{.}{\mathtt {Verify}}(VK,\check{r},\sigma )=1\) but \(\forall h\in [Q], \check{r}\not =\check{r}^{h}\), i.e., the adversary generates a valid signature \(\sigma \) for a new message \(\check{r}\) after viewing signatures for messages \(\check{r}^{1}, \ldots , \check{r}^{Q}\). This occurs with only negligible probability due to the existential unforgeability of \(\mathsf {SIG}\). Thus, the probabilities that \(\mathcal {A}\) succeeds in Game 0 and in Game 1 are negligibly close.

Next, we argue that Game 1 outputs 1 with only a negligible probability. First, due to the new checking rule in Game 1, \(\check{r}=\check{r}^{h}\) for some \(h\in [Q]\) (otherwise, the experiment outputs 0 directly). Then, by the functionality preserving property (against adversarially-chosen PRF keys) of \(\mathsf {WPRF}\), with all but negligible probability,

$$|\{x\in \{0,1\}^n:(\mathsf {WPRF}_0.{\mathtt {Eval}}(PP_0,k_i,x))_{i\in [0,l]}\not ={\mathtt {C}}^{h}(x)\} | \le negl(\lambda ) \cdot 2^n$$

Since \(\mathcal {A}\) is \(\epsilon '\)-unforging-admissible,

$$|\{x\in \{0,1\}^n:\tilde{{\mathtt {C}}}(x)\not ={\mathtt {C}}^{h}(x)\} | \ge \epsilon ' \cdot 2^n$$

So, we have

$$\begin{aligned} |\{x\in \{0,1\}^n:\tilde{{\mathtt {C}}}(x)\not =(\mathsf {WPRF}_0.{\mathtt {Eval}}(PP_0,k_i,x))_{i\in [0,l]}\} | \ge (\epsilon '-negl(\lambda )) \cdot 2^n \end{aligned}$$
(1)

Finally, by the Chernoff bound,

$$\Pr [A \le t \cdot \bar{\epsilon }] \le e^{-\frac{\lambda -2}{8}}$$

i.e., \(\tilde{{\mathtt {C}}}\) can pass the check in Step III.11 with only negligible probability.
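
As in Lemma 4.3, this lower-tail estimate can be checked numerically for illustrative parameters (the concrete values of \(t\), \(\bar{\epsilon }\), \(\epsilon '\) below are assumptions):

```python
# Exact lower tail Pr[A <= k] for A ~ Binomial(t, p): here p plays the role
# of the forged circuit's true disagreement rate (epsilon' - negl), which
# lies well above the threshold eps_bar. Parameter values are illustrative.
from math import comb

def binom_lower_tail(t, p, k):
    """Exact Pr[A <= k] for A ~ Binomial(t, p)."""
    return sum(comb(t, j) * (p ** j) * ((1 - p) ** (t - j))
               for j in range(0, k + 1))

t, p, eps_bar = 1000, 0.40, 0.25   # illustrative: eps_bar well below p
tail = binom_lower_tail(t, p, int(t * eps_bar))
# tail is negligible, matching the e^{-(lambda-2)/8}-type bound above.
```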

This completes the proof of unforgeability.