1 Introduction

Discrete logarithms arise in many aspects of cryptography. The hardness of the discrete logarithm problem is central to many cryptographic schemes, for instance signatures, key exchange protocols and encryption schemes. Over the years, many variants of the discrete logarithm problem have evolved. Some of these include the Bilinear Diffie-Hellman Exponent Problem [6], the Bilinear Diffie-Hellman Inversion Problem [4], the Weak Diffie-Hellman Problem [14] and the Strong Diffie-Hellman Problem [5]. We first describe the classical discrete logarithm problem and its generalizations. Formal statements are as follows.

Let G be a cyclic group such that \(|G| = p\), where p is a prime, and denote by g a generator of G, so that G = \(\langle g \rangle \).

The discrete logarithm problem (DLP) is defined as follows: Given G, p and any h selected uniformly at random from G, find x \(\in \) \(\mathbb {Z}_p\) satisfying \(g^x = h\).

The k-Multiple Discrete Logarithm (k-MDL) is defined as follows:

Definition 1

(MDL). Given G, p and k elements \(h_1\), \(h_2\), \(\dots \), \(h_k\) selected uniformly at random from G, find non-negative integers \(x_1\), \(x_2\), \(\dots \), \(x_k\) satisfying \(g^{x_i} = h_{i}\) \(\forall \) \(1 \le i \le k\).

In particular when \(k = 1\), the 1-MDL is equivalent to DLP.

For \(k \le n\), we define the (k, n)-Generalized Multiple Discrete Logarithm ((k, n)-GMDL) as follows:

Definition 2

(GMDL). Given G, p and n elements \(h_1\), \(h_2\), \(\dots \), \(h_n\) selected uniformly at random from G, find k pairs \((i, x_i)\) satisfying \(g^{x_i} = h_{i}\) where \(i \in S\) and where S is a k-subset of \(\{1, \dots , n\}\).

As the definition suggests, (k, n)-GMDL can be viewed as a generalization of k-MDL. In particular when \(n = k\), the (k, k)-GMDL is equivalent to the k-MDL.

Cryptographic constructions based on DLP are applied extensively. For instance, an early application of the DLP in cryptography came in the form of the Diffie-Hellman key exchange protocol [8], whose security depends on the hardness of the DLP. Other examples include the ElGamal encryption and signature schemes [9] as well as Schnorr's signature scheme and identification protocol [16]. The multiple discrete logarithm problem arises mainly from elliptic curve cryptography. NIST recommended a small set of fixed (or standard) curves for use in cryptographic schemes [1] to eliminate the computational cost of generating random secure elliptic curves. The security implications of standard elliptic curves relative to random elliptic curves were analysed based on the efficiency of solving multiple discrete logarithm problems [11].

In a generic group, no special properties exhibited by any specific group or its elements are assumed. Algorithms for a generic group are termed generic algorithms. There are a number of results pertaining to generic algorithms for DLP and k-MDL. Shoup showed that any generic algorithm for solving the DLP must perform \(\varOmega (\sqrt{p})\) group operations [17]. There are a few methods for computing discrete logarithms in approximately \(\sqrt{p}\) operations. For example, Shanks' Baby-Step Giant-Step method computes the DLP in \(\tilde{O}(\sqrt{p})\) operations. Another is Pollard's Rho algorithm, which runs in \(O(\sqrt{p})\) operations [15]. Since then, further practical improvements to Pollard's Rho algorithm have been proposed in [3, 7, 18], but the computational complexity remains the same. There exist index calculus methods which solve the DLP in subexponential time. However, such methods are not relevant in our context since they do not apply to a generic group.
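
To make the \(\sqrt{p}\) complexity concrete, the following Python sketch implements Shanks' Baby-Step Giant-Step method. It is a minimal illustration only: the group \(\mathbb {Z}_{101}^*\) and all parameters are toy values far too small for cryptographic use, and the modular inverse via pow with a negative exponent assumes Python 3.8+.

```python
# Minimal Baby-Step Giant-Step sketch: solves g^x = h using about sqrt(p)
# group operations and about sqrt(p) stored baby steps.
from math import isqrt

def bsgs(g, h, p):
    m = isqrt(p) + 1
    baby = {pow(g, j, p): j for j in range(m)}   # baby steps: g^j, 0 <= j < m
    g_inv_m = pow(g, -m, p)                      # g^{-m} mod p (Python 3.8+)
    gamma = h
    for i in range(m):                           # giant steps: h * g^{-im}
        if gamma in baby:
            return i * m + baby[gamma]
        gamma = (gamma * g_inv_m) % p
    return None                                  # h is not in <g>

p, g, x = 101, 2, 57                             # 2 is a primitive root mod 101
assert bsgs(g, pow(g, x, p), p) == x
```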

An extension of Pollard's Rho algorithm was proposed in [13] which solves k-MDL in \(O(\sqrt{kp})\) group operations provided \(k \le O(p^{1/4})\). It was subsequently shown in [10] that \(O(\sqrt{kp})\) can in fact be achieved without this condition on k. The former method finds the discrete logarithms sequentially, one after another; in the latter method, all the discrete logarithms are only obtained towards the end. Finally, it was shown in [19] that any generic algorithm solving k-MDL must require at least \(\varOmega (\sqrt{kp})\) group operations if \(k = o(p)\).

Our Contributions. In the context of our work, suppose an adversary has knowledge of or access to many instances of the discrete logarithm problem, either from a generic underlying algebraic group or from a standard curve recommended by NIST. Our work investigates how difficult it is for such an adversary to solve subcollections of those instances. One outcome of this work shows that an adversary gaining access to additional instances of the DLP obtains no advantage in solving some subcollection of them when k is small and for correspondingly small classes of n. Our techniques also apply to other standard non-NIST curves. For instance, the results in this work are relevant to Curve25519 [2], which has garnered considerable interest in recent years. Furthermore, we establish formal lower bounds for the generic hardness of solving the GMDL problem for larger values of k. As a corollary, these results provide lower bounds for solving the GMDL problem over the full possible range of inputs k. Part of this work can be viewed as a generalization of the results in [19].

More specifically, we introduce two techniques to solve such generalized multiple discrete logarithm problems. The first, which we refer to as the matrix method, is shown to achieve asymptotically tight bounds when the inputs are small. From this, we obtain the result that the GMDL problem is as hard as the MDL problem for \(k = o\left( \frac{p^{1/3}}{\log ^{2/3} p}\right) \) and \(kn^2 = o\left( \frac{p}{\log ^2 p}\right) \). This strictly improves the result of [13], where the equivalence is achieved for a smaller range of inputs satisfying \(k = o(p^{\frac{1}{4}})\) and \(k^2n^2 = o(p)\). The second technique is referred to as the block method and can be applied to larger inputs. We also show that the block partitioning in this method is optimal. Moreover, when n is relatively small with respect to k, the bounds obtained in this way are also asymptotically tight. Furthermore, we demonstrate that the block method can be adapted and applied to generalized versions of other discrete logarithm settings introduced in [12] to obtain generic hardness bounds for those problems as well. For instance, part of this work shows that solving one out of n instances of the Discrete Logarithm Problem with Auxiliary Inputs is as hard as solving a single given instance when n is not too large. In addition, we explain why the matrix method cannot be extended to solve these problems.

2 Preliminaries

For generic groups of large prime order p, denote by \(T_k\), \(T_{k,n}\) the expected workload (in group operations) of an optimal algorithm solving the k-MDL problem and the (k, n)-GMDL problem respectively.

Lemma 1 and Corollary 1, which address the special case \(k=1\), are attributed to [13].

Lemma 1

\(T_1 \le T_{1,n} + 2n\log _2p\)

Proof

Given an arbitrary \(h \in G = \langle g \rangle \), obtaining x such that \(g^x = h\) can be achieved in time \(T_1\). For each i, \(1 \le i \le n\), select an integer \(r_i\) uniformly at random from the set \(\{0, \dots , p-1\}\) and define \(h_i := g^{r_i}h = g^{x+r_i}\). All the \(h_i\) are random since all the \(r_i\) and h are random. Apply a generic algorithm with inputs \((h_1, h_2, \dots , h_n)\) that solves the (1, n)-GMDL problem in time \(T_{1,n}\). The algorithm outputs \((j, y)\) such that \(h_j = g^{y}\), \(1 \le j \le n\). Therefore, \(x \equiv y - r_j\) mod p, thus solving the 1-MDL problem within \(T_{1,n} + 2n\log _2p\) group multiplications.   \(\square \)
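
The following Python sketch mirrors this reduction. The (1, n)-GMDL oracle gmdl_solver is hypothetical; the brute-force stand-in below exists only to make the demo self-contained, and the group is modelled as the toy order-11 subgroup of \(\mathbb {Z}_{23}^*\) generated by g = 2.

```python
# Sketch of the reduction in Lemma 1: one DLP instance h = g^x is blinded into
# n instances h_i = g^{x + r_i}; any single solved instance reveals x.
import random

q, p, g = 23, 11, 2          # 2 has multiplicative order 11 modulo 23

def solve_dlp_via_gmdl(h, n, gmdl_solver):
    r = [random.randrange(p) for _ in range(n)]      # blinding exponents
    hs = [(pow(g, ri, q) * h) % q for ri in r]       # h_i = g^{r_i} h
    j, y = gmdl_solver(hs)                           # some h_j = g^y
    return (y - r[j]) % p                            # x = y - r_j mod p

def toy_gmdl_solver(hs):     # stand-in oracle: brute-forces one instance
    for j, hj in enumerate(hs):
        for y in range(p):
            if pow(g, y, q) == hj:
                return j, y

x = 7
assert solve_dlp_via_gmdl(pow(g, x, q), 5, toy_gmdl_solver) == x
```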

Corollary 1

For all \(n = o\left( \frac{\sqrt{p}}{\log p}\right) \), \(T_{1,n} = \varOmega (\sqrt{p})\).

Proof

Since \(T_1 = \varOmega (\sqrt{p})\) [17], when \(n = o\left( \frac{\sqrt{p}}{\log p}\right) \), it follows directly from Lemma 1 that \(T_{1,n} = \varOmega (\sqrt{p})\).    \(\square \)

It was also obtained in [13] that the GMDL problem is as hard as the MDL problem if \(kn \ll \sqrt{p}\). Since \(k \le n\), this equivalence is valid for \(k = o(p^{\frac{1}{4}})\) and \(k^2n^2 = o(p)\).

3 Generalized Bounds of \(T_{k,n}\) for Small k

The first method we introduce is to obtain an improved lower bound of \(T_{k,n}\) for small k. We refer to this as the Matrix technique.

We seek to obtain an upper bound of \(T_k\) based on \(T_{k,n}\). Given \(g^{x_i} = h_i\) \(\forall \) \(1\le i \le k\), \(T_k\) represents the time to solve all such \(x_i\). For all \(1\le i \le n\), define \(y_i\) by the following:

$$\begin{aligned} \begin{pmatrix} y_{1} \\ y_{2} \\ \vdots \\ y_n \end{pmatrix} = \begin{pmatrix} 1 & \alpha _{1} & \alpha _{1}^2 & \dots & \alpha _{1}^{k-1} \\ 1 & \alpha _{2} & \alpha _{2}^2 & \dots & \alpha _{2}^{k-1} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & \alpha _{n} & \alpha _{n}^2 & \dots & \alpha _{n}^{k-1} \end{pmatrix} \begin{pmatrix} x_{1} \\ x_{2} \\ \vdots \\ x_k \end{pmatrix} \end{aligned}$$

Next, multiply each \(g^{y_i}\) by a corresponding random element \(g^{r_i}\) where \(0 \le r_i \le p-1\). Feeding these randomized \(g^{y_i + r_i}\) as inputs to a (k, n)-GMDL solver, the solver outputs solutions to k out of n of these discrete logarithms. These solutions are of the form \(y_i + r_i\), so a total of k values of \(y_i\) can be obtained by subtracting the corresponding \(r_i\). We claim that any collection of k of the \(y_i\) suffices to recover all of \(x_1, x_2, \dots , x_k\). Indeed, it suffices to show that every k-by-k submatrix of

$$\begin{aligned} V=\begin{pmatrix} 1 & \alpha _{1} & \alpha _{1}^2 & \dots & \alpha _{1}^{k-1} \\ 1 & \alpha _{2} & \alpha _{2}^2 & \dots & \alpha _{2}^{k-1} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & \alpha _{n} & \alpha _{n}^2 & \dots & \alpha _{n}^{k-1} \end{pmatrix} \end{aligned}$$

formed from any k rows has nonzero determinant. This can be satisfied by simply letting \(\alpha _i = i\), since any k rows of V then form a Vandermonde matrix with distinct nodes.

In this case, recovering \(x_1, x_2, \dots , x_k\) from k values of \(y_i\) requires solving a k-by-k system of linear equations. This can be achieved in \(O(k^3)\) arithmetic operations using Gaussian elimination. Crucially, this does not involve any group operations. On the other hand, group operations are incurred by the computations of all \(g^{y_i} = g^{x_1 + \alpha _ix_2 + \dots + \alpha _i^{k-1}x_k}\). Since \(\alpha _i = i\), this process requires the computation of each \((g^{x_j})^{i^{j-1}}\) \(\forall \) \(1 \le i \le n\), \(1 \le j \le k\). Denote \(a_{i,j} = (g^{x_j})^{i^{j-1}}\). By noting that \(a_{i+1,j} = a_{i,j}^{(\frac{i+1}{i})^{j-1}}\), it can be concluded that computing \(a_{i+1,j}\) given \(a_{i,j}\) requires at most \(2(j-1) \log _2 \frac{i+1}{i}\) group multiplications. Moreover, \(a_{1,j} = g^{x_j}\) is already known. Hence the total number of group multiplications required to compute all \((g^{x_j})^{i^{j-1}}\) is at most

$$\begin{aligned} \sum _{j=1}^{k} \sum _{i=1}^{n-1} 2(j-1) \log _2 \left( \frac{i+1}{i}\right) = k(k-1) \log _2 n. \end{aligned}$$

Furthermore, each addition in the exponent of \(g^{x_1 + \alpha _ix_2 + \dots + \alpha _i^{k-1}x_k+r_i}\) constitutes a group multiplication. Therefore, kn group multiplications are needed in this step. Thus, the total number of group multiplications required to compute all of the \(g^{y_i+r_i}\) is at most \(kn + k(k-1) \log _2 n\). Since \(k\le n < p\), this expression is bounded above by \(2kn \log _2 p\), and so it follows that

$$\begin{aligned} T_k \le T_{k,n} + 2kn \log _2 p \end{aligned}$$

Since \(T_k=\varOmega (\sqrt{kp})\) [19], \(T_{k,n}\) is asymptotically as large as \(T_k\) if \(nk \log p \ll \sqrt{kp}\). Hence, \(T_{k,n} = \varOmega (\sqrt{kp})\) if \(k = o\left( \frac{p^{1/3}}{\log ^{2/3} p}\right) \) and \(n\sqrt{k} = o\left( \frac{\sqrt{p}}{\log p}\right) \). Moreover, this bound is asymptotically tight since there exists an algorithm which solves k-MDL in \(O(\sqrt{kp})\) group operations.
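
As an illustration of the matrix method with \(\alpha _i = i\), the Python sketch below works directly with the exponents: it forms the \(y_i\), takes the answers at k arbitrary indices (standing in for the output of a (k, n)-GMDL solver), and recovers \(x_1, \dots , x_k\) by solving the k-by-k Vandermonde system modulo p via Gaussian elimination. The prime and the index set are toy assumptions.

```python
# Recovering x_1..x_k from any k of the y_i (y_i = sum_j x_j * i^{j-1} mod p):
# a k-by-k Vandermonde system over Z_p, solved without any group operations.
p = 10007                                    # toy prime

def solve_mod_p(A, b, p):
    """Gauss-Jordan elimination over Z_p for a square system A x = b."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = next(r for r in range(c, n) if M[r][c] % p)
        M[c], M[piv] = M[piv], M[c]
        inv = pow(M[c][c], -1, p)            # modular inverse (Python 3.8+)
        M[c] = [v * inv % p for v in M[c]]
        for r in range(n):
            if r != c and M[r][c]:
                M[r] = [(vr - M[r][c] * vc) % p for vr, vc in zip(M[r], M[c])]
    return [row[n] for row in M]

k, n = 3, 6
x = [1234, 5678, 9012]                       # the hidden discrete logarithms
y = {i: sum(x[j] * pow(i, j, p) for j in range(k)) % p for i in range(1, n + 1)}

S = [2, 3, 5]                                # any k indices the solver returns
A = [[pow(i, j, p) for j in range(k)] for i in S]
assert solve_mod_p(A, [y[i] for i in S], p) == x
```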

4 Generalized Bounds of \(T_{k,n}\) for Larger k

The matrix technique has been shown to provide asymptotically tight bounds for solving k out of n multiple discrete logarithms in the classical setting when the inputs are small. One main limitation of this technique is that it applies only to the classical DLP and cannot be extended to other variants or settings of the DLP; this is shown in further detail in the subsequent sections. Moreover, in light of the fact that the bound on \(T_k\) holds for large inputs extending to o(p), the matrix method is not sufficient to obtain analogous bounds for such larger values of k. In this section, we address these issues by introducing the block method to evaluate lower bounds of \(T_{k,n}\) for general k, including large k values. The block technique can also be applied to other variants and settings of the MDLP, as described in later sections.

Proposition 1

Suppose that \(n \ge k^2\). Then,

$$\begin{aligned} T_k \le kT_{k,n} + 2nH_k \log _2 p \end{aligned}$$

where \(H_k\) denotes the \(k^{th}\) harmonic number, \(H_k = \sum _{i=1}^{k} \frac{1}{i}\).

Proof

Given arbitrary \(h_1, h_2, \dots , h_k \in G = \langle g \rangle \), obtaining \(x_1, x_2, \dots , x_k\) such that \(g^{x_i} = h_i\) \(\forall \) \(1 \le i \le k\) can be achieved in time \(T_k\). Consider n elements partitioned into k blocks, each of size approximately \(s_k\), where \(s_k = \frac{n}{k}\). Each block is labelled i, where i ranges from 1 to k. For each i, \(1 \le i \le k\), select about \(s_k\) integers \(r_{i,j}\) uniformly at random from \(\mathbb {Z}_p\) and define \(h_{i,j} := g^{r_{i,j}}h_i = g^{x_i+r_{i,j}}\). This generates n values \(h_{i,j}\) which, when supplied to a generic algorithm for the (k, n)-GMDL problem, yield k pseudo solutions. Computing each \(h_{i,j}\) requires at most \(2 \log _2 p\) group multiplications. Since each block has size about \(\frac{n}{k} \ge k\), these k pseudo solutions might all derive from the same block, in which case the algorithm outputs k pairs \(((i', j), y_{i',j})\) such that \(h_{i',j} = g^{y_{i',j}}\) for a single \({i'}\). As a result, one can obtain \(x_{i'} \equiv y_{i',j} - r_{i',j}\) mod p but derive no information about the other values of \(x_i\). This invokes at most \(T_{k,n} + 2n \log _2 p\) group operations. Figure 1 illustrates an overview of the first phase.

Fig. 1. Overview of the first phase

The second phase proceeds as follows. Since 1 out of k discrete logarithms has been obtained, discard the block that contained the solved discrete logarithm. For each \(i \in \{1, \dots , k\} \setminus \{i'\}\), select about \(\frac{s_k}{k-1}\) integers \(r_{i,j}\) uniformly at random from \(\mathbb {Z}_p\) and define \(h_{i,j} := g^{r_{i,j}}h_i = g^{x_i+r_{i,j}}\). Incorporate these new values of \(h_{i,j}\) into the remaining \(k-1\) blocks. Hence, each of the remaining \(k-1\) unsolved discrete logarithms is contained separately in one of \(k-1\) blocks, each of size approximately \(\frac{n}{k-1}\). This generates n values \(h_{i,j}\) which, when supplied to a generic algorithm for the (k, n)-GMDL problem, yield k pseudo solutions. Since each block has size about \(\frac{n}{k-1} \ge k\), these k pseudo solutions might once again all derive from the same block, in which case the algorithm outputs k pairs \(((i'', j), y_{i'',j})\) such that \(h_{i'',j} = g^{y_{i'',j}}\) for a single \({i''}\). As a result, one can obtain \(x_{i''} \equiv y_{i'',j} - r_{i'',j}\) mod p but derive no information about the other remaining values of \(x_i\). This second phase incurs at most \(T_{k,n} + 2s_k \log _2 p\) group operations.

The third phase executes in a similar manner. For each \(i \in \{1, \dots , k\} \setminus \{i', i''\}\), select about \(\frac{s_{k-1}}{k-2}\) integers \(r_{i,j}\) uniformly at random from \(\mathbb {Z}_p\) and define \(h_{i,j} := g^{r_{i,j}}h_i = g^{x_i+r_{i,j}}\). Incorporate these new values of \(h_{i,j}\) into the remaining \(k-2\) blocks. Hence, each of the remaining \(k-2\) unsolved discrete logarithms is contained separately in one of \(k-2\) blocks, each of size approximately \(\frac{n}{k-2}\). This generates n values \(h_{i,j}\) which, when supplied to a generic algorithm for the (k, n)-GMDL problem, yield k pseudo solutions. Since each block has size about \(\frac{n}{k-2} \ge k\), these k pseudo solutions might again all derive from the same block, in which case the algorithm outputs k pairs \(((i^{(3)},j), y_{i^{(3)},j})\) such that \(h_{i^{(3)},j} = g^{y_{i^{(3)},j}}\) for a single \({i^{(3)}}\). As a result, one can obtain \(x_{i^{(3)}} \equiv y_{i^{(3)},j} - r_{i^{(3)},j}\) mod p but derive no information about the other remaining values of \(x_i\). This third phase incurs at most \(T_{k,n} + 2s_{k-1} \log _2 p\) group operations.

During each phase of the process, a generic algorithm for the (k, n)-GMDL problem can never be guaranteed to output solutions deriving from different blocks, since \(n \ge k^2\) implies \(\frac{n}{k-i+1} \ge k\), \(\forall 1 \le i \le k\). In general for \(i \ge 2\), the maximum number of group operations required in the \(i^{th}\) phase is \(T_{k,n} + 2s_{k+2-i} \log _2 p\). The process terminates when all k discrete logarithms have been obtained. Since each phase outputs exactly 1 of the k discrete logarithms, all k discrete logarithms are determined after k phases. Therefore, the following inequality is obtained.

$$\begin{aligned}T_k \le T_{k,n} + 2n \log _2 p + \sum _{i=2}^{k} (T_{k,n} + 2s_{k+2-i} \log _2p) \end{aligned}$$

Since \(s_{k} = \frac{n}{k}\),

$$\begin{aligned}\sum _{i=2}^{k} (T_{k,n} + 2s_{k+2-i}\log _2p) = (k-1)T_{k,n} + (2\log _2p)\sum _{i=2}^{k} s_{i}= (k-1)T_{k,n} + (2\log _2p)\sum _{i=2}^{k} \frac{n}{i}. \end{aligned}$$

Hence, it follows that

$$\begin{aligned}T_k \le kT_{k,n} + 2n(1+ \sum _{i=2}^{k} \frac{1}{i})\log _2p = kT_{k,n} + 2nH_k\log _2p.\end{aligned}$$

This completes the proof.    \(\square \)
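
The harmonic-sum bookkeeping in Proposition 1 can be checked numerically: the reduction draws n blinding exponents in the first phase and \(s_k, s_{k-1}, \dots , s_2\) in the subsequent phases, for a total of \(nH_k\). Below is a small sketch with toy values of n and k chosen so that the block sizes are exact integers.

```python
# Check that the k phases of Proposition 1 draw n*H_k blinding exponents in
# total: n in phase 1, then n/k, n/(k-1), ..., n/2 in phases 2..k.
from fractions import Fraction

def total_randomizers(n, k):
    total = Fraction(n)                 # initializing phase: fill all k blocks
    blocks = k
    for _ in range(2, k + 1):
        total += Fraction(n, blocks)    # refill the discarded block's share
        blocks -= 1
    return total

n, k = 720, 6
H_k = sum(Fraction(1, i) for i in range(1, k + 1))
assert total_randomizers(n, k) == n * H_k   # 1764 = 720 * 49/20
```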

Remark 1

When \(k = 1\), Proposition 1 corresponds to Lemma 1.

By regarding k = k(p) as a function of p such that \(\lim _{p \rightarrow +\infty } k(p)= +\infty \), the asymptotic bounds of \(T_{k,n}\) can be obtained.

Theorem 1

Suppose \(k^3\log ^2 k= o\left( \frac{p}{\log ^2p}\right) \), then \(T_{k,n} = \varOmega \left( \sqrt{\frac{p}{k}} \right) \) for all n satisfying \(n = k^2 + \varOmega (1)\) and \(n = o\left( \frac{\sqrt{kp}}{(\log k)(\log p)}\right) \).

Proof

Clearly \(k^3\log ^2 k= o\left( \frac{p}{\log ^2p}\right) \) implies that \(k = o(p)\). Hence from [19], \(T_{k} = \varOmega (\sqrt{kp})\). It follows from Proposition 1 that if \(nH_k\log _2p \ll T_k\), then \(T_{k,n} = \varOmega (\frac{1}{k} \sqrt{kp}) = \varOmega \left( \sqrt{\frac{p}{k}}\right) \). Moreover, since

$$\begin{aligned} \lim _{k \rightarrow +\infty } (H_k - \log k)= \lim _{k \rightarrow +\infty } [(\sum _{i=1}^{k} \frac{1}{i}) - \log k] = \gamma \end{aligned}$$

where \(\gamma \) is the Euler-Mascheroni constant, the condition \(nH_k\log _2p \ll T_k\) implies that \(n = o\left( \frac{\sqrt{kp}}{(\log k)(\log p)}\right) \). The lower bound \(n = k^2 + \varOmega (1)\) is obtained by noting that n has to be of size at least \(k^2\) from the condition of Proposition 1. Finally, since the lower bound for n cannot be asymptotically greater than its upper bound, k(p) has to satisfy \(k^3\log ^2 k= o\left( \frac{p}{\log ^2p}\right) \). This completes the proof of Theorem 1.    \(\square \)

Remark 2

Although Theorem 1 holds for the wide asymptotic range of n given above, the \(T_{k,n}\) bound becomes sharper as n approaches \(\frac{\sqrt{kp}}{(\log k)(\log p)}\). In essence, Theorem 1 does not by itself yield interesting bounds, but it is a prelude to the more essential Theorem 2, which requires Proposition 1 and is hence included.

Proposition 2

Suppose that \(k< n < k^2\). Then,

$$\begin{aligned} T_k \le (r + \frac{n}{k}) T_{k,n} + 2rk\log _2p + 2nH_{\lceil \frac{n}{k} \rceil }\log _2p \end{aligned}$$

where \(r = \left\lceil \frac{\log (\frac{k^2}{n})}{\log (\frac{n}{n-k})} \right\rceil \).

Proof

The proof comprises two main phases. The former consists of the initializing phase followed by subsequent subphases. It utilizes the extended pigeonhole principle to obtain more than one solution during each of the initial subphases. The latter phase begins once the number of remaining unknown discrete logarithms is small enough that each subphase can only recover one discrete logarithm. From this point on, the method of determining the remaining discrete logarithms essentially mirrors that described in the proof of Proposition 1. The formal details are as follows.

The initializing phase proceeds as follows. Given arbitrary \(h_1, h_2, \dots , h_k \in G = \langle g \rangle \), obtaining \(x_1, x_2, \dots , x_k\) such that \(g^{x_i} = h_i\) \(\forall \) \(1 \le i \le k\) can be achieved in time \(T_k\). Consider n elements partitioned into k blocks, each of size approximately \(s_k\), where \(s_k = \frac{n}{k}\). Each block is labelled i, where i ranges from 1 to k. For each i, \(1 \le i \le k\), select about \(s_k\) integers \(r_{i,j}\) uniformly at random from \(\mathbb {Z}_p\) and define \(h_{i,j} := g^{r_{i,j}}h_i = g^{x_i+r_{i,j}}\). This generates n values \(h_{i,j}\) which, when supplied to a generic algorithm for the (k, n)-GMDL problem, yield k pseudo solutions. Computing each \(h_{i,j}\) requires at most \(2 \log _2 p\) group multiplications. Since each block has size about \(\frac{n}{k} < k\), by the extended pigeonhole principle, at least \(\frac{k^2}{n}\) of these k solutions must derive from distinct blocks. In other words, at least \(\frac{k^2}{n}\) of them correspond to distinct i values and, as a result, \(\frac{k^2}{n}\) out of the k discrete logarithms can be obtained during this initializing phase. This invokes at most \(T_{k,n} + 2n\log _2p\) group operations.

The first subphase proceeds as follows. Since \(\frac{k^2}{n}\) out of k discrete logarithms have been obtained, discard all blocks that contained those solved discrete logarithms. Thus, about \(k - \frac{k^2}{n} = \frac{k(n-k)}{n}\) blocks remain, each of size approximately \(\frac{n}{k}\). Select about k new integers \(r_{i,j}\), distributed uniformly across the remaining blocks i and chosen uniformly at random from \(\mathbb {Z}_p\), and define \(h_{i,j} := g^{r_{i,j}}h_i = g^{x_i+r_{i,j}}\). Incorporate these new values of \(h_{i,j}\) into the remaining \(\frac{k(n-k)}{n}\) blocks. Hence, each of the remaining \(\frac{k(n-k)}{n}\) unsolved discrete logarithms is contained separately in one of \(\frac{k(n-k)}{n}\) blocks, each of size approximately \(\frac{n^2}{k(n-k)}\). This generates n values \(h_{i,j}\) which, when supplied to a generic algorithm for the (k, n)-GMDL problem, yield k pseudo solutions. This incurs a maximum of \(T_{k,n} + 2k\log _2p\) group operations.

If \(k \le \frac{n^2}{k(n-k)}\), these k pseudo solutions might all derive from the same block, and hence phase two begins, in which the method described in Proposition 1 is applied to this new set of blocks.

If \(k > \frac{n^2}{k(n-k)}\), the extended pigeonhole principle ensures that at least \(\frac{k^2(n-k)}{n^2}\) of the k pseudo solutions correspond to distinct i values and, as a result, about \(\frac{k^2(n-k)}{n^2}\) discrete logarithms can be obtained in the first subphase.

Subsequent subphases are similar to the first. In general, for the \(r^{th}\) subphase, since about \(\frac{k^2(n-k)^{r-1}}{n^r}\) solutions will have been obtained in the \((r-1)^{th}\) subphase, discard all blocks that contained those solved discrete logarithms. By a simple induction, it can be shown that the number of blocks remaining in the \(r^{th}\) subphase is about \(\frac{k(n-k)^r}{n^r}\). The induction proceeds as follows. The base case has already been verified in the first subphase. Suppose the result holds for \(r = m-1\) for some \(m \ge 2\). By the inductive hypothesis, the number of blocks remaining in the \((m-1)^{th}\) subphase is about \(\frac{k(n-k)^{m-1}}{n^{m-1}}\). During the \({m}^{th}\) subphase, since about \(\frac{k^2(n-k)^{m-1}}{n^{m}}\) solutions have already been obtained previously and those blocks are discarded, the number of remaining blocks is given by

$$\begin{aligned} \frac{k(n-k)^{m-1}}{n^{m-1}} - \frac{k^2(n-k)^{m-1}}{n^{m}} = \frac{k(n-k)^{m}}{n^{m}}.\end{aligned}$$

This completes the induction. Select about k new integers \(r_{i,j}\), distributed uniformly across the remaining blocks i and chosen uniformly at random from \(\mathbb {Z}_p\), and define \(h_{i,j} := g^{r_{i,j}}h_i = g^{x_i+r_{i,j}}\). Incorporate these new values of \(h_{i,j}\) into the remaining \(\frac{k(n-k)^{r}}{n^{r}}\) blocks. Hence, each of the remaining \(\frac{k(n-k)^{r}}{n^{r}}\) unsolved discrete logarithms is contained separately in one of \(\frac{k(n-k)^{r}}{n^{r}}\) blocks, each of size approximately \(\frac{n^{r+1}}{k(n-k)^r}\). This generates n values \(h_{i,j}\) which, when supplied to a generic algorithm for the (k, n)-GMDL problem, yield k pseudo solutions. In general, the \(r^{th}\) subphase requires at most \(T_{k,n} + 2k\log _2p\) group operations.

When \(k \le \frac{n^{r+1}}{k(n-k)^r}\), the k outputs can only guarantee one solution. Hence, as soon as r satisfies this inequality, the first main phase terminates at the end of the \(r^{th}\) subphase and the second main phase commences. Rearranging, \(k \le \frac{n^{r+1}}{k(n-k)^r}\) holds precisely when \(\left( \frac{n}{n-k}\right) ^r \ge \frac{k^2}{n}\), so the smallest such r is

$$\begin{aligned} r = \left\lceil \frac{\log (\frac{k^2}{n})}{\log (\frac{n}{n-k})} \right\rceil . \end{aligned}$$

At the beginning of the second main phase, there remain about \(\frac{k(n-k)^r}{n^r} - 1\) unresolved discrete logarithms. The rest of the procedure follows that of the proof of Proposition 1, starting from its second phase until the end. Hence, it can immediately be derived from the proof of Proposition 1 that the number of group operations required to solve them all is at most \((\frac{k(n-k)^r}{n^r} - 1) T_{k,n} + 2(nH_{\lceil \frac{k(n-k)^r}{n^r} \rceil } - n)\log _2p\). Since

$$\begin{aligned} k \le \frac{n^{r+1}}{k(n-k)^r} \implies \frac{k(n-k)^r}{n^r} \le \frac{n}{k}, \end{aligned}$$

the number of group operations required in the second main phase is at most \((\frac{n}{k} - 1) T_{k,n} + 2(nH_{\lceil \frac{n}{k} \rceil } - n)\log _2p\). The number of group operations required during the first main phase is the sum of the number required for the initializing phase and the number required for all the subphases. Therefore, the maximum number of group operations required in the first main phase is given by

$$\begin{aligned} T_{k,n} + 2n\log _2p + r(T_{k,n} + 2k\log _2p) = (r+1)T_{k,n} + 2n\log _2p + 2rk\log _2p. \end{aligned}$$

Thus the maximum number of group operations required to execute both the first main phase and the second main phase is given by

$$\begin{aligned} (r+1)T_{k,n} + 2(n + rk)\log _2p + (\frac{n}{k} - 1) T_{k,n} + 2(nH_{\lceil \frac{n}{k} \rceil } - n)\log _2p. \end{aligned}$$

Hence, \(T_k \le (r + \frac{n}{k}) T_{k,n} + 2rk\log _2p + 2nH_{\lceil \frac{n}{k} \rceil }\log _2p\).    \(\square \)
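
The choice of r in Proposition 2 can be sanity-checked numerically: simulating the subphase recursion, in which the number of blocks shrinks by the factor \(\frac{n-k}{n}\) per subphase, until the block size first reaches k reproduces the closed-form ceiling expression. The values of k and n below are arbitrary toy choices with \(k< n < k^2\).

```python
# Check r = ceil(log(k^2/n) / log(n/(n-k))) against a direct simulation of the
# subphases in Proposition 2 (floating point is fine for a sketch).
from math import ceil, log

def r_formula(k, n):
    return ceil(log(k * k / n) / log(n / (n - k)))

def r_simulated(k, n):
    r, blocks = 1, k * (n - k) / n      # blocks present in the first subphase
    while k > n / blocks:               # block size n/blocks is still below k
        blocks *= (n - k) / n           # pigeonhole discards a k/n fraction
        r += 1
    return r

k, n = 20, 150                          # k < n < k^2
assert r_formula(k, n) == r_simulated(k, n) == 7
```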

Remark 3

Another approach is to replace the (k, n)-GMDL solver with a (1, n)-GMDL solver during the second main phase. Both yield the identical asymptotic results given in Sect. 7, as the computational bottleneck arises from the first main phase.

By regarding k = k(p) as a function of p such that \(\lim _{p \rightarrow +\infty } k(p)= +\infty \), the asymptotic bounds of \(T_{k,n}\) can be obtained.

Theorem 2

Suppose k, n satisfy the following conditions:

$$\begin{aligned} \begin{array}{ll} 1.\quad k = o(p) &\quad 2.\quad n = k^2 - \varOmega (1) \\ 3.\quad \frac{n}{\sqrt{k}} \log (\frac{n}{k}) = o\left( \frac{\sqrt{p}}{\log p}\right) &\quad 4.\quad \frac{\log (\frac{k^2}{n})}{\log (\frac{n}{n-k})} \sqrt{k} = o\left( \frac{\sqrt{p}}{\log p}\right) \end{array} \end{aligned}$$

Then,

$$\begin{aligned} T_{k,n} = \varOmega \left( \frac{\sqrt{k}}{\frac{n}{k} + r} \sqrt{p} \right) \end{aligned}$$

where \(r = r(k,n)\) \(\ge 1\) is any function of k and n satisfying \(r(k,n) = \varOmega \left( \frac{\log (\frac{k^2}{n})}{\log (\frac{n}{n-k})}\right) \).

Proof

Condition 1 is necessary to utilize results in [19] for a lower bound of \(T_k\). Condition 2 is required in order to apply the result of Proposition 2. Conditions 3 and 4 can be obtained by requiring \(nH_{\lceil \frac{n}{k} \rceil }\log p \ll T_k\) and \(rk \log p \ll T_k\) respectively and noting that \(T_k = \varOmega (\sqrt{kp})\). Hence from Proposition 2 and under these conditions, \(T_{k,n} = \varOmega \left( \frac{\sqrt{k}}{\frac{n}{k} + r} \sqrt{p}\right) \).    \(\square \)

It should be mentioned that r(k, n) can be taken to be any function satisfying \(r(k,n) = \varOmega \left( \frac{\log (\frac{k^2}{n})}{\log (\frac{n}{n-k})}\right) \). However, it is clear that \(T_{k,n}\) achieves sharper bounds for asymptotically smaller choices of r. Another point of note is that when \(n = k\) with \(k = o(p)\), all 4 conditions are satisfied and r(k, n) can be taken to be 1. In this case, \(T_k = T_{k,k} = \varOmega (\sqrt{kp})\), which indeed corresponds to the bound obtained in [19].

5 Optimizing the Partition of n

From the methods described in the proofs of Propositions 1 and 2, n is partitioned into blocks of approximately equal size at each phase. There are many ways to perform such partitions of n. The running time of each phase is partially determined by the number of uniformly randomly chosen \(r_{i,j}\), which invariably depends on the partition of n. In this section, we show that the method of partition described in the proofs of the earlier propositions minimizes the expected number of chosen \(r_{i,j}\) required, and hence results in the fastest running time among all possible partitions. We first consider the case where a (k, n)-GMDL solver outputs solutions derived from the same block, so that only one discrete logarithm can be determined at each phase. We then consider the general scenario where a (k, n)-GMDL solver outputs solutions derived from multiple blocks.

5.1 Pseudo Solutions Deriving from the Same Block

Denote by \(s_i\) the size of block i, \(k \le s_i\), \(1 \le i \le k\), so that \(\sum _{i=1}^{k} s_i = n\). Let \(p_{i,k}\) be the conditional probability that the k output solutions derive from block i, given that the k output solutions derive from the same block. Then \(p_{i,k}\) can be expressed as follows.

$$\begin{aligned} p_{i,k} = \frac{\left( {\begin{array}{c}s_i\\ k\end{array}}\right) }{\sum _{j=1}^{k} \left( {\begin{array}{c}s_j\\ k\end{array}}\right) } \end{aligned}$$

Suppose a solution is derived from block i. Upon discarding block i, \(s_i\) new integers \(r_{i,j}\) have to be randomly chosen to fill the remaining blocks so that the block sizes sum back up to n. Let \(E^{(1)}_{k}\) be the expected number of randomly chosen \(r_{i,j}\) required. Then \(E^{(1)}_{k}\) can be expressed as follows.

$$\begin{aligned} E^{(1)}_{k} = \sum _{i=1}^{k} p_{i,k}s_i = \frac{\sum _{i=1}^{k} \left( {\begin{array}{c}s_i\\ k\end{array}}\right) s_i}{\sum _{i=1}^{k} \left( {\begin{array}{c}s_i\\ k\end{array}}\right) } \end{aligned}$$

Our objective is therefore to minimize \(E^{(1)}_{k}\) given \(\sum _{i=1}^{k} s_i = n\). We expand the admissible values of \(s_i\) to the set of positive real numbers so that \(s_i \in \mathbb {R}^+\). In this way, \(\left( {\begin{array}{c}s_i\\ k\end{array}}\right) \) is defined as \(\left( {\begin{array}{c}s_i\\ k\end{array}}\right) = \frac{s_i(s_i - 1)\dots (s_i - k+1)}{k!}\). We prove the following result.

Theorem 3

Given that \(\sum _{i=1}^{k} s_i = n\),

$$\begin{aligned} \frac{\sum _{i=1}^{k} \left( {\begin{array}{c}s_i\\ k\end{array}}\right) s_i}{\sum _{i=1}^{k} \left( {\begin{array}{c}s_i\\ k\end{array}}\right) } \ge \frac{n}{k}. \end{aligned}$$

Proof

Without loss of generality, assume that \(s_1 \le s_2 \le \dots \le s_k\). Thus, \(\left( {\begin{array}{c}s_1\\ k\end{array}}\right) \le \left( {\begin{array}{c}s_2\\ k\end{array}}\right) \le \dots \le \left( {\begin{array}{c}s_k\\ k\end{array}}\right) \). By Chebyshev's sum inequality,

$$\begin{aligned} k\sum _{i=1}^{k} \left( {\begin{array}{c}s_i\\ k\end{array}}\right) s_i \ge \left( \sum _{i=1}^{k} s_i\right) \left( \sum _{i=1}^{k} \left( {\begin{array}{c}s_i\\ k\end{array}}\right) \right) . \end{aligned}$$

The result follows by replacing \(\sum _{i=1}^{k} s_i\) with n in the above inequality.    \(\square \)

Hence, \(E^{(1)}_{k} \ge \frac{n}{k}\), and it is straightforward to verify that equality holds if \(s_1 = s_2 = \dots = s_k\). Therefore, the method of partitioning n into blocks of equal size at each phase, as described in the proof of Proposition 1, indeed minimizes the running time.
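
A brute-force check of Theorem 3 over all partitions of a small n into k blocks (each of size at least k, as required in this subsection) confirms that the equal partition minimizes \(E^{(1)}_{k}\); the values n = 20 and k = 4 are arbitrary toy choices.

```python
# Exhaustive check: among all partitions of n into k blocks of size >= k,
# E_k^(1) = sum C(s_i, k) s_i / sum C(s_i, k) is minimized by equal blocks.
from math import comb
from itertools import combinations_with_replacement

def expected_refills(sizes, k):
    num = sum(comb(s, k) * s for s in sizes)
    den = sum(comb(s, k) for s in sizes)
    return num / den

n, k = 20, 4
best = min(
    (sizes for sizes in combinations_with_replacement(range(k, n + 1), k)
     if sum(sizes) == n),
    key=lambda sizes: expected_refills(sizes, k),
)
assert best == (5, 5, 5, 5)             # the equal partition, with E = n/k = 5
```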

5.2 Pseudo Solutions Deriving from Multiple Blocks

Denote by \(s_{i}\) the size of block i, \(1 \le i \le k\), so that \(\sum _{i=1}^{k} s_i = n\). Let \(p_{i_1,i_2,\dots ,i_m,k}\) be the conditional probability that the k output solutions derive from blocks \(i_1,i_2,\dots ,i_m\), given that the k output solutions derive from m distinct blocks. Then \(p_{i_1,i_2,\dots ,i_m,k}\) satisfies the following.

$$\begin{aligned} p_{i_1,i_2,\dots ,i_m,k}\sum _{\{i_1 \dots i_m\} \subseteq \{1,\dots ,k\}}\sum _{k_1+ \dots +k_m=k}\left( {\begin{array}{c}s_{i_1}\\ k_1\end{array}}\right) \dots \left( {\begin{array}{c}s_{i_m}\\ k_m\end{array}}\right) = \sum _{k_1+ \dots +k_m=k}\left( {\begin{array}{c}s_{i_1}\\ k_1\end{array}}\right) \dots \left( {\begin{array}{c}s_{i_m}\\ k_m\end{array}}\right) \end{aligned}$$

A more concise representation can be expressed as follows.

$$\begin{aligned} p_{i_1,i_2,\dots ,i_m,k}\sum _{\{i_1 \dots i_m\} \subseteq \{1,\dots ,k\}}\sum _{k_1+ \dots +k_m=k}\prod _{t=1}^{m}\left( {\begin{array}{c}s_{i_t}\\ k_t\end{array}}\right) = \sum _{k_1+ \dots +k_m=k}\prod _{t=1}^{m}\left( {\begin{array}{c}s_{i_t}\\ k_t\end{array}}\right) \end{aligned}$$
(1)

We can further simplify the above expression by the following lemma.

Lemma 2

$$\begin{aligned} \sum _{k_1+ \dots +k_m=k}\prod _{t=1}^{m}\left( {\begin{array}{c}s_{i_t}\\ k_t\end{array}}\right) = \left( {\begin{array}{c}s_{i_1}+ \dots + {s_{i_m}}\\ k\end{array}}\right) \end{aligned}$$

Proof

For brevity, denote \(s = s_{i_1}+ \dots + s_{i_m}\).

Consider the polynomial \((1+x)^s\). By the binomial theorem, \((1+x)^{s} = \sum _{r=0}^{s} \left( {\begin{array}{c}s_{i_1}+ \dots + s_{i_m}\\ r\end{array}}\right) x^r\). On the other hand,

$$\begin{aligned} (1+x)^s = (1+x)^{s_{i_1}} \dots (1+x)^{s_{i_m}} = \prod _{t=1}^{m} \sum _{r_t=0}^{s_{i_t}} \left( {\begin{array}{c}s_{i_t}\\ r_t\end{array}}\right) x^{r_t}. \end{aligned}$$

In this instance, the coefficient of \(x^r\) is the sum of all products of binomial coefficients of the form \(\left( {\begin{array}{c}s_{i_t}\\ r_t\end{array}}\right) \) where the \(r_t\) sum to r. Therefore,

$$\begin{aligned} \prod _{t=1}^{m} \sum _{r_t=0}^{s_{i_t}} \left( {\begin{array}{c}s_{i_t}\\ r_t\end{array}}\right) x^{r_t} = \sum _{r=0}^{s} \sum _{r_1 + \dots + r_m = r} \prod _{t=1}^{m} \left( {\begin{array}{c}s_{i_t}\\ r_t\end{array}}\right) x^r. \end{aligned}$$

Hence,

$$\begin{aligned} \sum _{r=0}^{s} \left( {\begin{array}{c}s_{i_1}+ \dots + s_{i_m}\\ r\end{array}}\right) x^r = \sum _{r=0}^{s} \sum _{r_1 + \dots + r_m = r} \prod _{t=1}^{m} \left( {\begin{array}{c}s_{i_t}\\ r_t\end{array}}\right) x^r. \end{aligned}$$

The result follows by equating the coefficients of \(x^r\) on both sides of the above equation.    \(\square \)
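
Lemma 2 is the classical Vandermonde convolution identity; a quick brute-force check in Python with toy block sizes is given below.

```python
# Brute-force check of Lemma 2: summing prod_t C(s_{i_t}, k_t) over all
# compositions k_1 + ... + k_m = k equals C(s_{i_1} + ... + s_{i_m}, k).
from math import comb, prod
from itertools import product

def lhs(sizes, k):
    return sum(
        prod(comb(s, kt) for s, kt in zip(sizes, ks))
        for ks in product(range(k + 1), repeat=len(sizes))
        if sum(ks) == k
    )

sizes, k = (4, 7, 5), 6
assert lhs(sizes, k) == comb(sum(sizes), k) == 8008   # C(16, 6)
```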

Corollary 2

$$\begin{aligned} p_{i_1,i_2,\dots ,i_m,k}\sum _{\{i_1 \dots i_m\} \subseteq \{1,\dots ,k\}}\left( {\begin{array}{c}s_{i_1}+ \dots + {s_{i_m}}\\ k\end{array}}\right) = \left( {\begin{array}{c}s_{i_1}+ \dots + {s_{i_m}}\\ k\end{array}}\right) \end{aligned}$$

Proof

Follows directly from Eq. (1) and Lemma 2.    \(\square \)

Suppose m solutions are derived from blocks \(i_1, \dots , i_m\). Upon discarding blocks \(i_1, \dots , i_m\), \(s_{i_1}+ \dots + s_{i_m}\) new integers \(r_{i,j}\) have to be randomly chosen to fill the remaining blocks so that the block sizes sum back up to n. Denote by \(E^{(m)}_{k}\) the expected number of randomly chosen \(r_{i,j}\) required. Then a generalized form of \(E^{(m)}_{k}\) can be expressed as follows.

$$\begin{aligned} E^{(m)}_{k} = \sum _{\{i_1 \dots i_m\} \subseteq \{1,\dots ,k\}}p_{i_1,i_2,\dots ,i_m,k} (s_{i_1}+ \dots + s_{i_m}) \end{aligned}$$

From the result of Corollary 2, this implies that \(E^{(m)}_{k}\) satisfies

$$\begin{aligned} E^{(m)}_{k} \sum _{\{i_1 \dots i_m\} \subseteq \{1,\dots ,k\}}\left( {\begin{array}{c}s_{i_1}+ \dots + {s_{i_m}}\\ k\end{array}}\right) = \sum _{\{i_1 \dots i_m\} \subseteq \{1,\dots ,k\}}\left( {\begin{array}{c}s_{i_1}+ \dots + {s_{i_m}}\\ k\end{array}}\right) (s_{i_1}+ \dots + s_{i_m}). \end{aligned}$$

Once again, we seek to minimize \(E^{(m)}_{k}\) given \(\sum _{i=1}^{k} s_i = n\). As before, we expand the admissible values of \(s_{i}\) to the set of positive real numbers so that \(s_{i} \in \mathbb {R}^+\). In this way, \(\left( {\begin{array}{c}s_{i}\\ k\end{array}}\right) \) is defined as \(\left( {\begin{array}{c}s_{i}\\ k\end{array}}\right) = \frac{s_i(s_i - 1)\dots (s_i - k+1)}{k!}\). We prove the following result.

Theorem 4

Given that \(\sum _{i=1}^{k} s_i = n\),

$$\begin{aligned} \sum _{\{i_1 \dots i_m\} \subseteq \{1,\dots ,k\}}\left( {\begin{array}{c}s_{i_1}+ \dots + {s_{i_m}}\\ k\end{array}}\right) (s_{i_1}+ \dots + s_{i_m}) \ge \frac{mn}{k} \sum _{\{i_1 \dots i_m\} \subseteq \{1,\dots ,k\}}\left( {\begin{array}{c}s_{i_1}+ \dots + {s_{i_m}}\\ k\end{array}}\right) . \end{aligned}$$

Proof

By Chebyshev’s sum inequality,

$$\begin{aligned} \left( {\begin{array}{c}k\\ m\end{array}}\right) \sum _{\{i_1 \dots i_m\} \subseteq \{1,\dots ,k\}}\left( {\begin{array}{c}s_{i_1}+ \dots + {s_{i_m}}\\ k\end{array}}\right) (s_{i_1}+ \dots + s_{i_m}) \end{aligned}$$
$$\begin{aligned} \ge \left( \sum _{\{i_1 \dots i_m\} \subseteq \{1,\dots ,k\}}\left( {\begin{array}{c}s_{i_1}+ \dots + {s_{i_m}}\\ k\end{array}}\right) \right) \left( \sum _{\{i_1 \dots i_m\} \subseteq \{1,\dots ,k\}} (s_{i_1}+ \dots + s_{i_m})\right) . \end{aligned}$$

Since \(\sum _{i=1}^{k} s_i = n\),

$$\begin{aligned} \sum _{\{i_1 \dots i_m\} \subseteq \{1,\dots ,k\}} (s_{i_1}+ \dots + s_{i_m}) = \left( {\begin{array}{c}k-1\\ m-1\end{array}}\right) n. \end{aligned}$$

Hence, we obtain

$$\begin{aligned} \left( {\begin{array}{c}k\\ m\end{array}}\right) \sum _{\{i_1 \dots i_m\} \subseteq \{1,\dots ,k\}}\left( {\begin{array}{c}s_{i_1}+ \dots + {s_{i_m}}\\ k\end{array}}\right) (s_{i_1}+ \dots + s_{i_m}) \end{aligned}$$
$$\begin{aligned} \ge \left( {\begin{array}{c}k-1\\ m-1\end{array}}\right) n \sum _{\{i_1 \dots i_m\} \subseteq \{1,\dots ,k\}}\left( {\begin{array}{c}s_{i_1}+ \dots + {s_{i_m}}\\ k\end{array}}\right) . \end{aligned}$$

By elementary algebraic operations, it is straightforward to verify that \(\frac{\left( {\begin{array}{c}k-1\\ m-1\end{array}}\right) }{\left( {\begin{array}{c}k\\ m\end{array}}\right) } = \frac{m}{k}\) from which the result follows.   \(\square \)

Hence, \(E^{(m)}_{k} \ge \frac{mn}{k}\), and it is straightforward to verify that equality holds if \(s_1 = s_2 = \dots = s_k\). Therefore, the method of partitioning n into blocks of equal size at each phase, as described in the proof of Proposition 2, indeed minimizes the running time.

6 Applications in Other MDLP Settings

We demonstrate how the block method can be adapted to obtain bounds in other generalized multiple discrete logarithm settings. We consider applications to the (\(e_1, \dots , e_d\))-Multiple Discrete Logarithm Problem with Auxiliary Inputs (MDLPwAI) as well as the \(\mathbb {F}_p\)-Multiple Discrete Logarithm Problem in the Exponent (\(\mathbb {F}_p\)-MDLPX). Let G = \(\langle g \rangle \) be a cyclic group of large prime order p. Their formal definitions are as follows.

Definition 3

(MDLPwAI). Given G, g, p, \(e_i\) and \(g^{{x_i}^{e_1}}, g^{{x_i}^{e_2}}, \dots , g^{{x_i}^{e_d}}\) \(\forall \) \(1 \le i \le k\), find non-negative integers \(x_1\), \(x_2\), \(\dots \), \(x_k\) \(\in \) \(\mathbb {Z}_p\).

Definition 4

(\(\mathbb {F}_p\)-MDLPX). Let \(\chi \in \mathbb {F}_p\) be an element of multiplicative order N. Given G, g, p, \(\chi \) and \(g^{{\chi }^{x_i}}\) \(\forall \) \(1 \le i \le k\), find non-negative integers \(x_1\), \(x_2\), \(\dots \), \(x_k\) \(\in \) \(\mathbb {Z}_N\).

The computational complexity of MDLPwAI was analysed in [12]. In the same paper, the authors introduced the \(\mathbb {F}_p\)-MDLPX and also analysed its complexity.

Here, we define the Generalized Multiple Discrete Logarithm Problem with Auxiliary Inputs (GMDLPwAI) and the Generalized \(\mathbb {F}_p\)-Multiple Discrete Logarithm Problem in the Exponent (\(\mathbb {F}_p\)-GMDLPX) to be solving k out of n instances of the MDLPwAI and \(\mathbb {F}_p\)-MDLPX respectively. We provide the formal definitions below.

Definition 5

(GMDLPwAI). Given G, g, p, \(e_i\) and \(g^{{x_i}^{e_1}}, g^{{x_i}^{e_2}}, \dots , g^{{x_i}^{e_d}}\) \(\forall \) \(1 \le i \le n\), find k pairs \((i, x_i)\), \(x_i \in \mathbb {Z}_p\), where \(i \in S\) such that S is a k-subset of \(\{1, \dots , n\}\).

Definition 6

(\(\mathbb {F}_p\)-GMDLPX). Let \(\chi \in \mathbb {F}_p\) be an element of multiplicative order N. Given G, g, p, \(\chi \) and \(g^{{\chi }^{x_i}}\) \(\forall \) \(1 \le i \le n\), find k pairs \((i, x_i)\), \(x_i \in \mathbb {Z}_N\), where \(i \in S\) such that S is a k-subset of \(\{1, \dots , n\}\).

6.1 Block Based GMDLPwAI

The block method can be adapted to obtain bounds for the GMDLPwAI by randomizing the input elements in the following way. Given \(g^{{x_i}^{e_1}}\), select random integers \(r_{i,j} \in \mathbb {Z}_p^*\) and compute the values \((g^{{x_i}^{e_1}})^{{r_{i,j}}^{e_1}} = g^{(r_{i,j}x_i)^{e_1}}\) as inputs to the GMDLPwAI solver. For each \(r_{i,j}\), reduce \(r_{i,j}^{e_1}\) modulo p; then \(g^{(r_{i,j}x_i)^{e_1}}\) can be computed within \(2\log _2p\) group operations. Repeat this procedure for each of \(e_2, \dots , e_d\). We show how the \(x_i\) can be recovered. Suppose the solver outputs a solution \((l, y)\) corresponding to some particular input \(g^{(r_{i,j}x_i)^{e_1}}\). In that case, \(x_i\) can be obtained by solving \(r_{i,j}x_i \equiv y\) mod p. Such congruence equations are efficiently solvable since gcd(\(r_{i,j},p\)) = 1.
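
A minimal Python sketch of this randomization for a single exponent e is given below. The toy group is the order-11 subgroup of \(\mathbb {Z}_{23}^*\), and the solver output y is simulated, standing in for a hypothetical GMDLPwAI oracle.

```python
# GMDLPwAI blinding for one auxiliary exponent e: (g^{x^e})^{r^e} = g^{(rx)^e},
# and a solver output y = r*x mod p recovers x = y * r^{-1} mod p.
import random

q, p, g = 23, 11, 2                          # <2> has prime order 11 in Z_23^*

def blind(gxe, e):
    r = random.randrange(1, p)               # r in Z_p^*, so gcd(r, p) = 1
    return pow(gxe, pow(r, e, p), q), r      # g^{(r x)^e} and the blinder r

x, e = 7, 3
gxe = pow(g, pow(x, e, p), q)                # the given instance g^{x^e}
blinded, r = blind(gxe, e)
assert blinded == pow(g, pow(r * x % p, e, p), q)   # indeed g^{(r x)^e}
y = r * x % p                                # simulated solver output
assert y * pow(r, -1, p) % p == x            # x recovered from y and r
```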

Let \(T'_k\) and \(T'_{k,n}\) denote the time taken in group operations for an optimal algorithm to solve the MDLPwAI and GMDLPwAI problems respectively.

Suppose \(k< n < k^2\). Then, by adapting the block technique previously applied to the GMDL problem, it can be shown that

$$\begin{aligned} T'_k \le (r + \frac{n}{k}) T'_{k,n} + 2d(rk + nH_{\lceil \frac{n}{k} \rceil })\log _2p \end{aligned}$$

where \(r = \left\lceil \frac{\log (\frac{k^2}{n})}{\log (\frac{n}{n-k})} \right\rceil \).

It has been conjectured in [12] that \(T'_k\) = \(\varOmega (\sqrt{kp/d})\) for values of \(e_i = i\). Assuming this conjecture, we can conclude from our results that for all polynomially bounded inputs with \(d = O(p^{1/3 - \epsilon })\), \(\epsilon > 0 \), \(T'_{k,n}\) is bounded by

$$\begin{aligned} T'_{k,n} = \varOmega \left( \frac{\sqrt{k}}{\frac{n}{k} + r} \sqrt{\frac{p}{d}} \right) \end{aligned}$$

where \(r = r(k,n)\) \(\ge 1\) is any function of k and n satisfying \(r(k,n) = \varOmega \left( \frac{\log (\frac{k^2}{n})}{\log (\frac{n}{n-k})}\right) \).

When \(k =1\), the above bound is not applicable since \(n \ge k^2\) in this situation. Nevertheless, we show how an unconditional bound for \(T'_{1,n}\) can still be obtained in this specific case without assuming any conjecture. As in the generalized case, randomize the input elements into the form \(g^{(r_{i,j}x_i)^{e_1}}\). One of the \(x_i\) can then be computed by solving \(r_{i,j}x_i \equiv y\) mod p, where y is a given known output. The process terminates here since one of the \(x_i\) has been obtained. Hence, we have the inequality \(T'_1 \le T'_{1,n} + 2nd\log _2p\). From the results of [5], \(T'_1 = \varOmega (\sqrt{p/d})\). It follows that for \(n = o\left( \frac{\sqrt{p}}{d^{3/2}\log p}\right) \),

$$\begin{aligned} T'_{1,n} = \varOmega (\sqrt{p/d}). \end{aligned}$$

6.2 Matrix Based GMDLPwAI

In this section, our objective is to find values of \(e_i\) for which the matrix method can be applied to solve the GMDLPwAI. We recall from Sect. 3 that the validity of the method for the k-MDL problem requires \(g^{x_1+ x_2 + \dots + x_k}\) to be efficiently computable from given values of \(g^{x_1}, g^{x_2}, \dots , g^{x_k}\). Moreover, the GMDLPwAI can be viewed as a generalization of the k-MDL problem (i.e. the GMDLPwAI reduces to the k-MDL problem when \(d=1\) and \(e_1 =1\)). In a similar vein, it is required that \(g^{f_0(x_1 + x_2 + \dots + x_k)}\) be efficiently computable from the given known values of the GMDLPwAI, where in this instance \(f_i(x) = x^{e_i}\). In that regard, suppose \(g^{f_0(x_1 + x_2 + \dots + x_k)} = g^{a_1f_1(x_1)}g^{a_2f_2(x_2)} \dots g^{a_kf_k(x_k)}\) \(\forall \) \(x_i\) and for some integer constants \(a_i\), so that it can be efficiently computed. We prove the following result.

Theorem 5

Let \(f_i(x) = x^{e_i}\), where the \(e_i \in \mathbb {Z}\) are not necessarily distinct. If there exist integer constants \(a_i\) such that

$$\begin{aligned} f_0(x_1 + x_2 + \dots + x_k) \equiv a_1f_1(x_1) + a_2f_2(x_2) + \dots + a_kf_k(x_k) \,\,\, \textit{mod} \,\, \textit{p} \end{aligned}$$

for all odd primes p and for all \(x_1, x_2, \dots , x_k \in \mathbb {Z}_p\), then the only solutions are of the form \(f_i(x) = x^{c_i(p-1)+1}\) for some integer constants \(c_i\).

Proof

For each \(1 \le i \le k\), substitute \(x_i = 1\) and \(x_{i'} = 0\) for \(i \not = i'\). We obtain \(f_0(1) = a_if_i(1)\) \(\forall \) i. Hence, \(a_i = 1\) \(\forall \) i. Upon establishing that all \(a_i\) values have to be 1, we proceed as follows.

Let \(x_2 = x_3 = \dots = x_k = 0\). This implies that \(f_0(x_1) \equiv f_1(x_1)\) mod p for all \(x_1\). Similarly, letting \(x_1 = x_3 = \dots = x_k = 0\) implies that \(f_0(x_2) \equiv f_2(x_2)\) mod p for all \(x_2\). Continuing in this fashion, it can be deduced that \(f_0(x) \equiv f_1(x) \equiv \dots \equiv f_k(x)\) mod p \(\forall \) x. Next, let \(x_3 = x_4 = \dots = x_k = 0\). This implies \(f_0(x_1+x_2) \equiv f_0(x_1) + f_0(x_2)\) mod p. We claim that \(f_0(x) \equiv xf_0(1)\) mod p \(\forall \) x. It is clear that the result holds for \(x = 0, 1\). By applying an inductive argument,

$$\begin{aligned} f_0(x+1) \equiv f_0(x) + f_0(1) \equiv xf_0(1) + f_0(1) \equiv (x+1)f_0(1) \,\,\, \text {mod} \,\, \textit{p} \end{aligned}$$

and the claim follows. Hence, p divides \(x^{e_0} - x\) for all \(x \in \mathbb {Z}_p\). Since \((\mathbb {Z}/p\mathbb {Z})^{\times } \cong C_{p-1}\), there exists a generator \(x_0 \in \mathbb {Z}_p\) of this cyclic group such that if p divides \(x_0^{e_0} - x_0\), then \(e_0 \equiv 1\) mod \(p-1\). Moreover, since we have established that \(f_0(x) \equiv f_1(x) \equiv \dots \equiv f_k(x)\) mod p, it can be concluded that \(e_i \equiv 1\) mod \(p-1\) for all i. Hence, \(f_i(x) = x^{c_i(p-1)+1}\), and it is straightforward to verify that these are indeed solutions to the original congruence equation.    \(\square \)

From the result of Theorem 5, if \(e_i\) is of the form \(e_i = c_i(p-1)+1\), then \(g^{f_i(x_1 + x_2 + \dots + x_k)}\) can be efficiently computed. However, \(g^{x^{c_i(p-1)+1}} = g^x\), so such \(e_i\) values reduce to the classical multiple discrete logarithm problem. Therefore, the matrix method is not applicable to solving the GMDLPwAI.
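
Theorem 5 can be spot-checked for a small prime: among exponents e, the map \(x \mapsto x^{e}\) is additive modulo p precisely when \(e \equiv 1\) mod \(p-1\). A brute-force Python check for p = 11:

```python
# Check at p = 11 that (a+b)^e = a^e + b^e mod p holds for all a, b exactly
# when e = c(p-1) + 1, in line with Theorem 5.
p = 11

def additive(e):
    return all(pow((a + b) % p, e, p) == (pow(a, e, p) + pow(b, e, p)) % p
               for a in range(p) for b in range(p))

assert [e for e in range(1, 2 * (p - 1) + 1) if additive(e)] == [1, p]
```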

6.3 Block Based \(\mathbb {F}_p\)-GMDLPX

The block method can also be adapted to obtain bounds for the \(\mathbb {F}_p\)-GMDLPX by randomizing the input elements via the computations \((g^{{\chi }^{x_i}})^{{\chi }^{r_{i,j}}} = g^{{\chi }^{x_i+r_{i,j}}}\), where the \(r_{i,j} \in \mathbb {Z}_p\) are selected randomly. For each \(r_{i,j}\), reduce \(\chi ^{r_{i,j}}\) modulo p; then \(g^{{\chi }^{x_i+r_{i,j}}}\) can be computed within \(2\log _2p\) group operations.

Suppose the solver outputs a solution \((l, y)\) corresponding to some particular input \(g^{{\chi }^{x_i+r_{i,j}}}\). In that case, \(x_i\) can be obtained by solving \(r_{i,j}+x_i \equiv y\) mod N. The analysis and the obtained bounds in this case are similar to those for the classical GMDL problem already discussed in Sect. 5, so we omit the details here.
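
A small Python sketch of this randomization is given below, with toy parameters: the subgroup \(\langle 2 \rangle \) of prime order p = 11 in \(\mathbb {Z}_{23}^*\), and \(\chi = 2\) of multiplicative order N = 10 in \(\mathbb {F}_{11}\). The solver output y is simulated.

```python
# F_p-GMDLPX blinding: (g^{chi^x})^{chi^r} = g^{chi^{x+r}}, so a solver output
# y = x + r mod N recovers x = y - r mod N.
import random

q, p, g = 23, 11, 2                          # <2> has prime order 11 in Z_23^*
chi, N = 2, 10                               # chi = 2 has order N = 10 in F_11

x = 7
given = pow(g, pow(chi, x, p), q)            # the instance g^{chi^x}
r = random.randrange(N)
blinded = pow(given, pow(chi, r, p), q)      # g^{chi^{x+r}}
assert blinded == pow(g, pow(chi, (x + r) % N, p), q)
y = (x + r) % N                              # simulated solver output
assert (y - r) % N == x
```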

Let \(T''_k\) and \(T''_{k,n}\) denote the time taken in group operations for an optimal algorithm to solve the \(\mathbb {F}_p\)-MDLPX and \(\mathbb {F}_p\)-GMDLPX problems respectively. It has been shown in [12] that \(T''_k = O(\sqrt{kN})\). If this is optimal, then our results show that

$$\begin{aligned} T''_{k,n} = \varOmega \left( \frac{\sqrt{k}}{\frac{n}{k} + r} \sqrt{N} \right) \end{aligned}$$

where \(r = r(k,n)\) \(\ge 1\) is any function of k and n satisfying \(r(k,n) = \varOmega \left( \frac{\log (\frac{k^2}{n})}{\log (\frac{n}{n-k})}\right) \), subject to the conditions given in Theorem 2.

6.4 Matrix Based \(\mathbb {F}_p\)-GMDLPX

The matrix method does not apply here since there is no known efficient method to compute \(g^{{\chi }^{x_i + x_j}}\) given \(g^{{\chi }^{x_i}}\) and \(g^{{\chi }^{x_j}}\) if the Diffie-Hellman assumption holds.

7 Some Explicit Bounds of \(T_{k,n}\)

The conditions imposed in Theorem 2 might initially seem restrictive, but in fact they are satisfied by large classes of k and n. In this section, we present some interesting explicit bounds of \(T_{k,n}\) obtained by varying n relative to k. For the remainder of this section, k can be taken to be any function satisfying \(k = O(p^{1-\epsilon })\) for some \(0< \epsilon < 1\), and c denotes a constant. The proofs of Propositions 3 and 4 are straightforward applications of Theorem 2.

Proposition 3

$$\begin{aligned} T_{k,k+c} = \varOmega (\sqrt{kp}) \end{aligned}$$

where c is a positive constant.

Proposition 4

$$\begin{aligned}T_{k,ck} = \varOmega (\frac{\sqrt{k}}{\log k} \sqrt{p})\end{aligned}$$

where c is a constant, \(c > 1\).

Proposition 5

$$\begin{aligned}T_{k,k \log ^c k}= \varOmega (\frac{\sqrt{k}}{\log ^{1+c} k} \sqrt{p})\end{aligned}$$

where c is a positive constant.

Proof

$$\begin{aligned} \log \frac{n}{n-k} = \log \frac{\log ^c k}{\log ^c k -1}. \end{aligned}$$

Next, consider the function \(\log \frac{\log ^c k}{\log ^c k -1}\), for \(k>e\) where e is the base of the natural logarithm. For brevity, denote \(x = \log ^{c} k\) so \(x>1\). By the Maclaurin series expansion, for \(x > 1\),

$$\begin{aligned} \log \frac{x}{x-1} = -\log (1 - \frac{1}{x}) = \sum _{i=1}^{\infty } \frac{1}{ix^i} > \frac{1}{x}. \end{aligned}$$

In particular, this proves that \(\log \frac{\log ^c k}{\log ^c k -1} > \frac{1}{\log ^{c} k}\) when \(k > e\). Hence for \(k > e\),

$$\begin{aligned} \frac{\log (\frac{k^2}{n})}{\log (\frac{n}{n-k})}< (\log ^c k) \log (\frac{k^2}{n}) = (\log ^c k) \log (\frac{k}{\log ^c k}) < \log ^{1+c} k. \end{aligned}$$

Thus r(k, n) can be taken to be \(\log ^{1+c} k\) when \(n = k\log ^{c}k\). From Theorem 2, we obtain

$$\begin{aligned} T_{k,k \log ^c k}= \varOmega (\frac{\sqrt{k}}{\log ^c k + \log ^{1+c} k} \sqrt{p}) = \varOmega (\frac{\sqrt{k}}{\log ^{1+c} k} \sqrt{p}). \end{aligned}$$

   \(\square \)

Table 1 provides a summary of the results for the lower bounds of \(T_{k,n}\) with different n relative to k.

Table 1. Some bounds of \(T_{k,n}\)

\(n = k + c\), c a positive constant: \(T_{k,n} = \varOmega (\sqrt{kp})\)

\(n = ck\), c a constant with \(c > 1\): \(T_{k,n} = \varOmega \left( \frac{\sqrt{k}}{\log k} \sqrt{p}\right) \)

\(n = k \log ^c k\), c a positive constant: \(T_{k,n} = \varOmega \left( \frac{\sqrt{k}}{\log ^{1+c} k} \sqrt{p}\right) \)

8 Conclusion

In this paper, we established rigorous bounds for the generic hardness of the generalized multiple discrete logarithm problem, which can be regarded as a generalization of the multiple discrete logarithm problem. Some explicit bounds are also computed using both the matrix method and the block method. Many instances of \(T_{k,n}\) are shown to be in fact asymptotically optimal. The overall best bounds obtained here require the union of results from both techniques. Furthermore, we showed that the block method can also be adapted to handle generalizations arising in other discrete logarithm problems, and we obtained bounds for these generalizations as well. For instance, a consequence of our results is that solving an instance of the MDLPwAI problem is as hard as solving the DLPwAI problem under certain conditions. We also demonstrated why the matrix method is not applicable to these and other variants of the DLP.