
1 Introduction

Let \(p=[2^{766}\pi ]+62762\), which is the smallest 768-bit prime number larger than \(2^{766}\pi \) for which \(\frac{p-1}{2}\) is prime too. Let \(g=11\), which is a generator of the multiplicative group \(\mathbf{F}_p^\times \) of the prime field \(\mathbf{F}_p\). On June 16, 2016, we finished the computation of the discrete logarithm of \(t=[2^{766}e]\) with respect to g. We found that the smallest non-negative integer x for which \(g^x\equiv t\bmod p\) equals

$$\begin{aligned}&325923617918270562238615985978623709128341338833721058543950813\\&521768156295091638348030637920237175638117352442299234041658748\\&471079911977497864301995972638266781162575370644813703762423329\\&783129621567127479417280687495231463348812. \end{aligned}$$
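
The result can be checked directly. Below is a minimal verification sketch in Python (not part of the software used for the computation), relying on the third-party mpmath and sympy packages; since the rounding convention behind the \([\cdot ]\) notation is not spelled out here, both the floor and the nearest-integer reading are tried.

```python
import mpmath
import sympy

# The discrete logarithm x quoted above.
x = int(
    "325923617918270562238615985978623709128341338833721058543950813"
    "521768156295091638348030637920237175638117352442299234041658748"
    "471079911977497864301995972638266781162575370644813703762423329"
    "783129621567127479417280687495231463348812")

mpmath.mp.prec = 1000                         # enough bits to pin down the 768-bit integer parts
two766 = mpmath.mpf(2) ** 766
for name, bracket in (("floor", mpmath.floor), ("round", mpmath.nint)):
    p = int(bracket(two766 * mpmath.pi)) + 62762   # p = [2^766 pi] + 62762
    t = int(bracket(two766 * mpmath.e))            # t = [2^766 e]
    ok = (p.bit_length() == 768
          and sympy.isprime(p) and sympy.isprime((p - 1) // 2)
          and pow(11, x, p) == t)                  # g = 11: check g^x = t (mod p)
    print(name, ok)
```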

By itself, this is a useless result. What is interesting is how we found it, that we did so with much less effort than we expected, and what the result implies for cryptographic security that relies on the difficulty of larger similar problems. These issues are discussed in this paper.

The result was obtained using the number field sieve (NFS, [13, 28]). It required the equivalent of about 5300 core years on a single core of a 2.2 GHz Xeon E5-2660 processor, mostly harvested during the period May to December, 2015, on clusters at the authors’ universities. On average each additional discrete logarithm requires two core days. This result is a record for computing prime field discrete logarithms. It closes the gap between record calculations for general purpose integer factoring and computing arbitrary prime field discrete logarithms, with the 768-bit integer factorization record [21] dating back to 2009. Although our effort was substantial, we spent a fraction of what we originally expected. The purpose of this paper is to describe how this was achieved.

Records of this sort are helpful to get an impression of the security offered by cryptographic systems that are used in practice. The 768-bit number field sieve factorization from [21], for instance, required about 1700 core years. Because factoring a single 1024-bit RSA modulus [34] using the number field sieve is about three orders of magnitude more work (cf. end of Sect. 2), an educated guess follows for the worst-case effort to break a 1024-bit RSA key. Interpretation of the resulting estimate is another matter. Depending on one’s perception, applications, incentives, taste, \(\ldots \), it may boost or undermine one’s confidence in the security of 1024-bit RSA moduli.

The ratio is similar between the difficulties of computing 768-bit and 1024-bit prime field discrete logarithms (cf. Sect. 2). It follows that even the nonchalant users of 1024-bit RSA, ElGamal [11], or DSA [36] have no reason to be nervous anytime soon if their concern is an “academic attack” such as the one presented here (cf. [6]). They have to be a bit more concerned, however, than suggested by [2, Sect. 4.1]. Also, we explicitly illustrate in Sect. 3 that continued usage of 1024-bit prime field ElGamal or DSA keys is much riskier than it is for 1024-bit RSA (all are still commonly used), because once a successful attack has been conducted against a single well-chosen prime field all users of that prime field [27, Sect. 4] may be affected at little additional effort [2].

As shown in Sect. 5 our result gives a good indication for the difficulty of computing discrete logarithms in multiplicative groups of other 768-bit prime fields as well. One such group, the so-called First Oakley Default Group, is of some historical interest as it was one of the groups supported by the Internet Key Exchange standard from 1998 [15], a standard that has been obsolete since 2005 [16]. In some cryptographic applications, however, one may prefer to use a generator of a relatively small prime order subgroup of \(\mathbf{F}_p^\times \) that is chosen in such a way that comparable efforts would be required by Pollard’s rho in the subgroup and by the number field sieve in \(\mathbf{F}_p^\times \). Our choice of p assures that no (published) shortcut can be taken for our discrete logarithm computation. It also represents the most difficult case for the number field sieve, in particular for its linear algebra step. It follows from the numbers presented below that, for a discrete logarithm computation, our choice is overall more difficult than a subgroup order that may sometimes be preferred for cryptographic applications. With independently optimized parameters the two efforts are however of the same order of magnitude (cf. Sect. 4).

Two simple methods can be used to give an a priori estimate of the effort to solve our 768-bit prime field discrete logarithm problem. The first is direct extrapolation (cf. Sect. 2): given that solving a 596-bit prime field discrete logarithm problem took 130 core years (cf. [7]), extrapolation suggests that our 768-bit problem should be doable in about thirty thousand core years. For the second method we observe that the number field sieve for factoring or for prime field discrete logarithms is essentially the same algorithm. When applied to 768-bit composites or 768-bit prime fields and when using comparable number fields, they deal with similar probabilities and numbers of comparable sizes, with the sole exception occurring in the linear algebra step: although in both cases the matrix is very sparse and all non-zero entries are (absolutely) very small, when factoring linear algebra is done in a matrix modulo two, but for discrete logarithm problems the matrix elements are taken modulo the group order (a 767-bit integer in our case). An opposite effect, however, is caused by the fact that, with proper care, the number fields will not be comparable because modulo large primes polynomial selection methods can be used that do not work modulo large composites.

It follows that the numbers reported in [21] can be used to derive an upper bound for the 768-bit prime field discrete logarithm effort, simply by using a 767-fold increase (cf. Sect. 3) of the linear algebra effort from [21] while leaving the other steps unchanged. With [21, Sect. 2.4] we find that fifty thousand core years should suffice for our problem. If we would switch to a 768-bit prime that allows a much smaller but cryptographically still interesting subgroup this rough overall estimate would be reduced by a factor of about five.

Thirty or fifty thousand core years would be a waste of resources for a calculation of this sort, and the more doable small subgroup alternative would be of insufficient interest; independent of our estimates, a very similar figure was derived in [2, Sect. 4.1]. All these estimates, however, overlook several points. Direct extrapolation of the 596-bit effort turned out to be meaningless due to software improvements and because the limited size did not allow an optimization that applies to our case. But more importantly, the very different nature and size of the moduli used in, respectively, the polynomial selection and linear algebra steps imply a radical shift in the trade-off between the steps of the number field sieve, which in turn leads to very different parameter and algorithmic choices compared to what is done for factoring. We are not aware of a satisfactory theoretical analysis of this different trade-off and the resulting parameter selection, or of a reliable way to predict the practical implication for the relative hardness of integer factoring and prime field discrete logarithm problems. It is clear, however, that the issue is more subtle than recognized in the literature, such as [26, 31] and, more recently, [2, Sect. 4.1].

As described in Sect. 3, adapting the parameter choices and algorithms to the case at hand – and guided by multiple experiments – it was found that it should be possible to reduce the fifty thousand core years estimate by almost an order of magnitude. This led to the conclusion that actually doing the full calculation would be a worthwhile undertaking: in the first place because it shows that for our current range of interest k-bit factoring and computing k-bit prime field discrete logarithms require a comparable effort; and in the second place, and possibly more interesting, because it required more than just a casual application of known methods.

The previous 596-bit and current 768-bit prime field discrete logarithm records should not be confused with extension field discrete logarithm records. Due to recent developments, we now have much better methods than the number field sieve to compute discrete logarithms in small characteristic extension fields. As a consequence, those fields are no longer relevant for basic cryptographic applications such as DSA. Indeed, recent extension field records imply that impractically large extension fields would have to be used to get an appreciable level of security: for instance, computing discrete logarithms in the multiplicative group of the 9234-bit field \(\mathbf{F}_{2^{2\cdot 3^5\cdot 19}}\) took less than fifty core years [14], and in the 3796-bit group \(\mathbf{F}_{3^{5\cdot 479}}^\times \) the problem was dealt with in less than a single core year [18]. On the other hand, the current characteristic two prime extension degree record involved the much smaller finite field \(\mathbf{F}_{2^{1279}}\) and took between three and four core years [20]: the advantage of the new methods over the number field sieve strongly depends on properties of the extension degree, but for favorable degrees the advantage is much bigger than the advantage for the number field sieve when factoring special numbers (such as Mersenne or Fermat numbers) compared to general ones (such as RSA moduli).

While the correctness of the outcome of our calculation can simply be verified, independent validation of the other claims made in this paper requires access to suitable source code and data. We have established a long-standing tradition of open collaborations [30] with other leading researchers in this field (see [21, 22] and the references therein) which applies to anything relevant for the present project as well.

The paper is organized as follows. Section 2 presents the background for the rest of the paper. Section 3 describes the impact of the parameter selection on the way one of the main steps of the number field sieve is best implemented for the problem solved here and lists all relevant details of our new record calculation. Section 4 gives more details about the trade-off between the main steps of the number field sieve, and presents estimates for the effort required to solve a discrete logarithm problem in a small subgroup. In Sect. 5 it is shown that our choice of \(p=[2^{766}\pi ]+62762\) is not more or less favorable than other primes of the same size.

2 Algorithm Overview

Descriptions of the number field sieve are available in the literature, ranging from the high level narrative [33] to the somewhat simplified and fully detailed versions in [29] and [28], respectively.

Index Calculus Method [1, 24, 25]. Let \(\mathbf{F}_p\) be a finite field of cardinality p, identified with \(\{0,1,\ldots ,p-1\}\) in the usual manner, and let g generate its multiplicative group \(\mathbf{F}_p^\times \). To compute discrete logarithms with respect to g, an index calculus method fixes a so-called factor base \(B\subset \mathbf{F}_p^\times \), collects more than \(\#B\) multiplicative relations between the elements of \(B\cup \{g\}\), and uses linear algebra modulo the order of g to determine for all elements of B their discrete logarithm with respect to g. Given this information, the discrete logarithm of any \(h\in \mathbf{F}_p^\times \) is then obtained by finding a multiplicative relation between h and the elements of \(B\cup \{g\}\).

Doing more or less the same modulo a composite \(N\) (as opposed to modulo p) and using linear algebra modulo two (as opposed to modulo the order of g), one may find an integer solution to \(x^2\equiv y^2\bmod N\), and thus get a chance to factor \(N\) by computing \(\gcd (N,x-y)\). This explains the similarity in the algorithms for factoring and computing discrete logarithms as well as the difference between the matrices for factoring and discrete logarithms that was pointed out in the introduction. The effect of the prime p versus the composite N, as also mentioned in the introduction, is touched upon below and in Sect. 3.

Different index calculus methods vary mostly in the way the multiplicative relations are found. This affects the way B is chosen. For prime p for instance, relations may be collected by considering \(g^e\) for random integers e and keeping those that factor over B. With B the set of primes up to some bound b one would thus be collecting b-smooth \(g^e\)-values. Faster methods increase the smoothness probabilities by generating smaller values in \(\{1,2,\ldots ,p-1\}\); select the values in an arithmetic progression so that sieving can be used to recognize smooth values faster; allow in relations a few large primes between b and a large prime bound \(b_\ell \); or combine these speedups. Dan Gordon [13] was the first to show how for prime p the ideas from the number field sieve for integer factorization [28] can be included as well. Many other variants have been proposed since then; the most accurate reference for the one used here is [35].
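
A toy version of the basic relation collection just described, with all parameters made up for illustration (the linear algebra that turns the relations into discrete logarithms is omitted):

```python
import random
from sympy import primerange

p, g, b = 1019, 2, 30                       # tiny prime field, generator, smoothness bound
B = list(primerange(2, b + 1))              # factor base: the primes up to b

def factor_over_B(n):
    """Return the exponent vector of n over B, or None if n is not b-smooth."""
    exps = [0] * len(B)
    for i, q in enumerate(B):
        while n % q == 0:
            n //= q
            exps[i] += 1
    return exps if n == 1 else None

relations = []
while len(relations) < len(B) + 5:          # collect a few more relations than #B
    e = random.randrange(1, p - 1)
    vec = factor_over_B(pow(g, e, p))
    if vec is not None:                     # g^e mod p is b-smooth, i.e. a relation:
        relations.append((e, vec))          # e = sum_i vec[i]*log_g(B[i])  (mod p-1)

print(len(relations), "relations; solving them modulo p-1 =", p - 1,
      "(the omitted linear algebra step) yields log_g(q) for every q in B")
```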

Relations in the Number Field Sieve. A property of the number field sieve that sets it apart from the earlier index calculus methods is that for a relation two distinct numbers must be smooth (with both numbers asymptotically significantly smaller than the values considered before). Let f and g be two coprime irreducible polynomials in \(\mathbf{Z}[X]\) of degrees \(d_f\) and \(d_g\), respectively, chosen in such a way that they have a root m in common modulo p (see Sect. 3 for how this may be done). A relation corresponds to a coprime pair of integers \((a,b)\) with \(b\ge 0\) such that the two integers \({\mathcal N}_f(a,b)=b^{d_f}f(\frac{a}{b})\) and \({\mathcal N}_g(a,b)=b^{d_g}g(\frac{a}{b})\) are smooth with respect to appropriately chosen bounds.

This is, very briefly, explained as follows. The integer \({\mathcal N}_f(a,b)\) is essentially (except for the leading coefficient of f) the norm of \(a-\alpha _fb\in \mathbf{Z}[\alpha _f]\subset \mathbf{Q}(\alpha _f)\), where \(\alpha _f\) denotes a zero of f and \(\mathbf{Q}(\alpha _f)\) is the algebraic number field \(\mathbf{Q}[X]/(f(X))\). The smoothness of \({\mathcal N}_f(a,b)\) then implies a factorization into small prime ideals in \(\mathbf{Q}(\alpha _f)\) of the ideal \((a-\alpha _fb)\) (cf. [8]). Noting that mapping \(\alpha _f\) to the common root m results in a ring homomorphism \(\varphi _f\) from \(\mathbf{Z}[\alpha _f]\) to \(\mathbf{F}_p\), and defining \(\alpha _g\) and \(\varphi _g\) in a similar manner for g, a relation \((a,b)\) thus corresponds to factorizations of the ideals \((a-\alpha _fb)\) and \((a-\alpha _gb)\) that map, via \(\varphi _f\) and \(\varphi _g\), respectively, to the same element \(a-bm\in \mathbf{F}_p\).
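
To make the norms concrete, the toy sketch below uses the classical base-m construction (degrees 1 and 3 purely for readability; the polynomials actually used here have degrees 3 and 4 and were found differently, cf. Sect. 3):

```python
from sympy import factorint, isprime

p, m = 1000003, 100                    # p = m^3 + 3, so X - m and X^3 + 3 share the root m mod p
f = [-m, 1]                            # f(X) = X - m       (coefficients, lowest degree first)
g = [3, 0, 0, 1]                       # g(X) = X^3 + 3

def norm(coeffs, a, b):
    """N(a, b) = b^deg * poly(a/b): the homogenized polynomial evaluated at (a, b)."""
    d = len(coeffs) - 1
    return sum(c * a**i * b**(d - i) for i, c in enumerate(coeffs))

assert isprime(p)
assert all(sum(c * m**i for i, c in enumerate(h)) % p == 0 for h in (f, g))

a, b = 2, 1                            # a coprime pair with b >= 0
print(norm(f, a, b), factorint(norm(f, a, b)))   # -98 = -(2 * 7^2): smooth
print(norm(g, a, b), factorint(norm(g, a, b)))   # 11: smooth, so (2, 1) is a relation
# Both factorizations correspond, via phi_f and phi_g, to the element a - b*m of F_p.
```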

The 768-bit factorization from [21] used degrees \(d_f=6\) and \(d_g=1\); consequently, the labels algebraic or rational were used to distinguish values and computations related to f or g, respectively. These intuitive labels can no longer be used here, because the primality of our modulus (p) offers flexibility in the polynomial selection that is not available for composite moduli and that resulted, as could be expected, in the “better” choices \(d_f=3\) and \(d_g=4\) for the present paper. Though the f- and g-related parts here are thus both algebraic, it will be seen that the g-part is easier to deal with, and thus, to some extent, corresponds to the rational side in [21]. Because f and g have rather different properties, different considerations come into play when selecting the factor bases for \({\mathcal N}_f(a,b)\) and \({\mathcal N}_g(a,b)\). The single factor base B is therefore replaced by two distinct factor bases, denoted by \(B_f\) and \(B_g\). For the present purposes it may be assumed that \(B_f\) and \(B_g\) consist of the primes bounded by \(b_f\) and \(b_g\). We make no distinction between f and g for the large prime bound (which is thus still denoted by \(b_\ell \), with \(\#B_\ell \) denoting the number of primes bounded by \(b_\ell \)), but may allow different numbers of large primes in \({\mathcal N}_f(a,b)\) and \({\mathcal N}_g(a,b)\), denoted by \(n_f\) and \(n_g\).

Finding Relations in the Number Field Sieve. As each relation requires two numbers being smooth, collecting relations is a two-stage process: in the first stage pairs \((a,b)\) for which \({\mathcal N}_f(a,b)\) is smooth are located; in the second stage, the pairs for which \({\mathcal N}_g(a,b)\) is smooth as well are selected from those found in the first stage. Thus, the first stage treats the numbers that are least likely to be smooth, thereby minimizing the number of pairs to be considered for the second stage: switching the roles of f and g would have led to more pairs to be treated in the second stage. Depending on the factor base sizes, various methods may be used to find relations.

The search for relations is typically limited to a (large) rectangular region S of the lattice \(\mathbf{Z}^2\). For the first stage (and numbers in the current range of interest) index-q sublattices \(L_q\) of \(\mathbf{Z}^2\) are identified such that q divides \({\mathcal N}_f(a,b)\) for all pairs \((a,b)\in L_q\) and for primes q close to and often somewhat larger than \(b_f\) (these primes are referred to as special q primes). The process described below is repeated for different special q primes until, after removal of unavoidable duplicates, enough relations have been collected.

Given a special q prime, lattice sieving is conducted over a rectangular region \(S_q\) (which roughly approximates \(L_q\cap S\)), to locate \((a,b)\) pairs for which \({\mathcal N}_f(a,b)\) is \(b_f\)-smooth (except for q and at most \(n_f\) large primes \(\le b_\ell \)). The number of “surviving” pairs thus found is denoted by \(y_f\). If \(y_f\) is large (and the pairs are not spread too widely as for instance in [9, 22]), it is best to again use lattice sieving in \(S_q\) to collect from those \(y_f\) pairs the \(y_g\) pairs that are actually relations, i.e., for which \({\mathcal N}_g(a,b)\) is \(b_g\)-smooth as well (again with at most \(n_g\) large primes \(\le b_\ell \)). This is the regular approach to the second stage, and was used in [21]. But there are circumstances where the second stage is best done in another manner: in [22], for instance, factorization trees (cf. [12, Sect. 4] and [4]) were used. This is also the approach taken here, as further described in Sect. 3.
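
As an illustration of sieving for smooth norms, the sketch below runs a toy “line sieve” (fixed \(b=1\), no special q sublattice, artificial bounds) over the norms of the cubic polynomial from the toy example above:

```python
import math
from sympy import primerange, factorint

def N(a):                                   # norm at (a, 1) of the cubic X^3 + 3 from above
    return a**3 + 3

A, bound = 100000, 200                      # sieve a in [2, A); factor-base primes below `bound`
logs = [0.0] * A
for q in primerange(2, bound):
    for r in [r for r in range(q) if N(r) % q == 0]:   # roots of X^3 + 3 modulo q
        for a in range(r, A, q):            # q divides N(a) whenever a = r (mod q)
            logs[a] += math.log(q)

# Survivors: the accumulated contributions come close to log N(a); the slack of two
# factors of `bound` crudely plays the role of the large primes mentioned above.
survivors = [a for a in range(2, A) if logs[a] > math.log(N(a)) - 2 * math.log(bound)]
print(len(survivors), "survivors")
print({a: factorint(N(a)) for a in survivors[-3:]})    # exact cofactorization of a few survivors
```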

Effort Required by the Number Field Sieve. With natural logarithms, let

$$\begin{aligned} E(x,c)=\exp \big ((({\textstyle \frac{64}{9}})^{\frac{1}{3}}+c)(\log x)^{\frac{1}{3}}(\log \log x)^{\frac{2}{3}}\big ); \end{aligned}$$

this slight variation on a well-known and more common notation allows us to focus on what is of greatest interest in the present context. The current best heuristic expected effort to compute a discrete logarithm in \(\mathbf{F}_p^\times \) using the number field sieve is \(E(p,o(1))\), asymptotically for \(p\rightarrow \infty \) [13]. This is the same as the effort \(E(N,o(1))\) (for \(N\rightarrow \infty \)) to factor a composite \(N\) using the number field sieve [8]. This “same” should, however, be taken with a grain of salt because the o(1) hides different functions for the two cases.

Optimal Factor Base Sizes. The smoothness probabilities, the number of relations to be collected, and the dimension of the matrix handled by the linear algebra all increase with the smoothness parameters \(b_f\), \(b_g\), \(b_\ell \), \(n_f\) and \(n_g\). The resulting trade-off leads to optimal factor base sizes \(\#B_f\) and \(\#B_g\), namely \(E(p,o(1))^{\frac{1}{2}}\) for discrete logarithms and \(E(N,o(1))^{\frac{1}{2}}\) for factoring. As noted above, even if \([\log p]=[\log N]\), in a given model of computation the optimal values for both factoring and discrete logarithm computation may be very different because the two o(1)-functions behave quite differently. Moreover, in practice the situation is further complicated because of the software and hardware actually used. Thus, naively using factor base sizes that worked well for a factoring problem for a similarly sized prime field discrete logarithm problem, as done in the introduction and despite the “correction” attempted there, will at best result in a rough upper bound. Section 3 discusses this issue in more detail.

Remark on Using E(x, c) in Practice. The uncertain function hiding in the o(1) makes it challenging to use \(E(p,o(1))\) to give an absolute estimate for the effort to solve a discrete logarithm problem in \(\mathbf{F}_p^\times \). It turns out, however, that a somewhat pessimistic indication can be obtained for the relative effort for \(\mathbf{F}_{\bar{p}}^\times \) compared to \(\mathbf{F}_{p}^\times \), for \(\bar{p}\) not much bigger than p (say, \(\bar{p}\le p^{\frac{4}{3}}\)), by dropping the o(1). Obviously, this assumes similar software that suffers no ill side-effects nor profits from new optimizations when moving to the larger \(\bar{p}\). The same works for factoring.

As an example, the three orders of magnitude difference between the efforts of factoring 768-bit and 1024-bit moduli, as mentioned in the introduction, follows from \(\frac{E(2^{1024},0)}{E(2^{768},0)}\approx 1200\); the jump from 130 core years for a 596-bit prime field discrete logarithm problem to about thirty thousand core years for 768 bits follows from \(\frac{E(2^{768},0)}{E(2^{596},0)}\approx 275\) – an extrapolation that failed to be useful because of the reasons mentioned in the introduction.
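
Both ratios are easily reproduced (a short sketch; as above, the only simplification is that the o(1) is dropped):

```python
from math import exp, log

def E(bits, c=0.0):
    """E(2^bits, c) from the text, with natural logarithms and the o(1) dropped."""
    lx = bits * log(2)
    return exp(((64 / 9) ** (1 / 3) + c) * lx ** (1 / 3) * log(lx) ** (2 / 3))

print(E(1024) / E(768))   # ~1.2e3: the 768-bit to 1024-bit gap
print(E(768) / E(596))    # ~2.8e2: the failed 596-bit to 768-bit extrapolation factor
```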

3 Computational Details

This section provides some background on our parameter choices. For comparison, we also provide the parameters that were used for the 768-bit factoring effort from [21].

Polynomial Selection. To get an initial impression of the feasibility of the calculation an extensive search was conducted using the method from [17]. First all integer polynomials g of degree four with coefficients absolutely bounded by 165 (noting that g(X), \(g(-X)\), \(X^4g(\frac{1}{X})\), and thus \(X^4g(\frac{-1}{X})\) are equivalent) were inspected, by using, for all the roots of g modulo p, lattice reduction to find a corresponding degree three integer polynomial f, and measuring the overall quality of all resulting pairs \((f,g)\) (as usual with respect to their small modular roots and size properties). For the second search, with bound 330, roots and size properties of g were considered first, and only for the most promising candidates were the roots of g modulo p calculated and, if any, the polynomial f derived. The best pair was found during the first search.

Because it requires root finding modulo p, the above search does not work to find polynomials for the number field sieve for integer factorization. There one is limited to more restrictive methods that cannot be expected to result in polynomials of comparable “quality”, with respect to the metric used in this context: indeed, the above pair is noticeably better than the degree (6, 1) pair used for the slightly smaller 768-bit modulus factored in [21]. A more quantitative statement requires a more careful analysis than we are ready to provide here. No significant amount of time was spent on searching for pairs \((f,g)\) of other degrees than \(d_f=3\) and \(d_g=4\).
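
To give an idea of the lattice step in the method of [17], the sketch below computes small-coefficient degree-three candidates f for a given root m modulo p. It relies on the third-party fpylll package, omits the size and root-quality scoring that drives the actual search, and the prime and root in the usage line are tiny toy values.

```python
from fpylll import IntegerMatrix, LLL

def small_f_for_root(p, m, deg_f=3):
    """Return short vectors (c_0, ..., c_{deg_f}) with sum_i c_i * m^i = 0 (mod p),
    i.e. small-coefficient candidate polynomials f with f(m) = 0 modulo p."""
    rows = [[p] + [0] * deg_f]               # the constant polynomial p
    for i in range(1, deg_f + 1):
        row = [0] * (deg_f + 1)
        row[0] = (-pow(m, i, p)) % p         # X^i - m^i vanishes at m modulo p
        row[i] = 1
        rows.append(row)
    A = IntegerMatrix.from_matrix(rows)
    LLL.reduction(A)                          # short basis vectors = small-coefficient candidates
    return [[A[r, c] for c in range(deg_f + 1)] for r in range(deg_f + 1)]

# Toy usage; in the real search p is the 768-bit prime and m a root modulo p of
# the degree-four candidate g under consideration.
print(small_f_for_root(1000003, 123456))
```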

Parameter Selection Background. The two main steps of the number field sieve after the polynomial selection, relation collection and linear algebra, are of a very different nature. Relation collection is long-term but low-maintenance: core years are easily harvested on any number of otherwise idle independent cores on any number of clusters that one can get access to, progress will be steady, and the process requires almost no human interaction. The results can easily be checked for correctness (cf. [22, Sect. 6]) and results that are lost or forgotten are easily replaced by others. Compared to this almost “happy-go-lucky” relation collection process, the linear algebra is tedious and cumbersome, despite the elegance of the block Wiedemann method used for it [10, 37]. It involves careful orchestration of a (modest number of) substeps each of which requires as many tightly coupled cores as needed to store the data (easily on the order of hundreds of GB), frequent checkpointing, and a central step that is even more memory-demanding but otherwise fortunately relatively swift. Overall, based on past experience core years are collected at about half the rate compared to relation collection.

For both main steps the required effort is well understood:

  • Given relation collection software and any choice of smoothness parameters a small number of experiments suffices to get an accurate indication for the effort required to collect any specified number of relations (it follows from the description in Sect. 5 how this may be done).

  • Similarly, given block Wiedemann software and any matrix dimension, weight, and modulus-size, the overall linear algebra effort can be reliably estimated based on the effort required for a few matrix\(\,\times \,\)vector multiplications on the processor network of one’s choice.

However, the relations as collected are never directly used for the linear algebra, because doing so would be hugely inefficient. Instead, a linear algebra preprocessing step is applied to the relations in order to reduce the dimension of the matrix while keeping its weight under control, thereby (substantially) reducing the linear algebra effort. This preprocessing becomes more effective as more relations are available (cf. Sect. 4) but the precise behavior of both dimension and weight depends on how (large) primes in a relation can be matched with the same primes in other relations and is thus uncertain. In practice one collects relations while occasionally doing a preprocessing attempt, and stops when the resulting linear algebra effort is within the targeted range. When to stop is a judgment call as more often than not the additional effort invested in relation collection is more than the expected linear algebra savings: it thus serves more to reduce the linear algebra headaches than to reduce the overall effort. As an example, for the current 768-bit factoring record about twice the strictly necessary relation collection effort was spent to make the linear algebra more manageable, an extra effort that was commented on as being “well spent” (cf. [21, Introduction]). These “negative returns” are further illustrated in Sect. 4.
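
As a toy illustration of the simplest preprocessing pass, the sketch below removes “singleton” relations, i.e., relations containing a prime (ideal) that occurs in no other relation; the actual preprocessing additionally merges relations to shrink the dimension further while keeping the weight under control. The data in the example are made up.

```python
from collections import Counter

def remove_singletons(relations):
    """relations: iterable of sets of primes (ideals) occurring in each relation.
    Repeatedly drop relations containing a prime that occurs only once; each
    removal can create new singletons, hence the loop until nothing changes."""
    rels = list(relations)
    while True:
        counts = Counter(q for rel in rels for q in rel)
        kept = [rel for rel in rels if all(counts[q] > 1 for q in rel)]
        if len(kept) == len(rels):
            return kept
        rels = kept

# The relation containing the prime 101 is dropped; the others survive.
print(remove_singletons([{2, 3}, {2, 5}, {3, 5}, {2, 3, 101}, {2, 3, 5}]))
```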

Based on consistent past behavior of the preprocessing and given specific smoothness parameters, it can be roughly estimated how many relations have to be collected for a targeted matrix dimension and weight. Given the uncertainty alluded to above, this estimate can only be a guess, though it is a mildly educated one. With the known behavior of the software, an overall effort estimate assuming those specific smoothness parameters can be derived. Repeating this for different smoothness parameters, the “best” – despite a lack of clear optimization criteria – overall effort then follows.

Parameter Selection. Our starting point was that on current equipment the linear algebra effort for the 768-bit modulus factored in [21] would be about 75 core years. Given the similarity of the algorithms and sizes, and using the same smoothness parameters as in [21], the overall effort to solve our discrete logarithm problem can be estimated as \(1500+767\cdot 75=59025\) core years; due to the small entries of the matrix the linear algebra effort only depends linearly on the size of the group order. The fifty thousand core years estimate mentioned in the introduction then follows from the expected favorable comparison of polynomials found using the method from [17] compared to the method used in [21]; we refer to the papers involved for an explanation of this effect.

All that is clear at this point is that attempts to lower this estimate must focus on lowering the linear algebra effort; thus the smoothness parameters must be reduced, but by how much and what the overall effect is going to be is unclear. Because the block Wiedemann effort is roughly proportional to the product of the matrix dimension and weight, reducing the matrix dimension by a factor of c while keeping the same average row-weight cuts the linear algebra effort by a factor of \(c^2\). Thus, given any targeted linear algebra effort a reduction factor c for the dimension follows. Assuming that, for our problem, a thousand core years would be acceptable for the linear algebra effort, a dimension reduction factor of about 7.6 follows, because \(\frac{767\cdot 75}{7.6^2}\approx 1000\). Compared to the parameters used in [21], such a drastic reduction requires severely cutting the smoothness parameters and the number of special q primes one may use. This in turn entails a more than proportional increase in the search space and thus a substantial increase in the overall relation collection effort compared to the 1500 core years spent in [21]. A priori, however, and as argued above, the effect of any of the changes that one may wish to consider cannot be accurately predicted.
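
In short (all inputs below are the figures quoted in this and the preceding paragraph):

```python
baseline_relation_collection = 1500   # core years spent on relation collection in [21]
baseline_linear_algebra = 75          # core years for the matrix of [21] on current hardware
group_order_bits = 767                # the matrix is handled modulo the 767-bit prime (p-1)/2

naive_total = baseline_relation_collection + group_order_bits * baseline_linear_algebra
print(naive_total)                    # 59025 core years when reusing the parameters of [21]

target_linear_algebra = 1000          # core years we are willing to spend on linear algebra
c = (group_order_bits * baseline_linear_algebra / target_linear_algebra) ** 0.5
print(round(c, 1))                    # ~7.6: required reduction factor of the matrix dimension
```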

While conducting experiments with a 640-bit example to better understand the increase in the relation collection effort depending on various possible combinations of smaller smoothness parameters and larger search spaces, we observed a mildly beneficial side-effect which – once observed – is obvious, but which was unanticipated: in the notation of Sect. 2, if \(B_f\) decreases, the number \({y}_f\) of \(b_f\)-smooth norms \({\mathcal N}_f(a,b)\) becomes smaller too, at a given point to an extent that it becomes more efficient to replace sieving for the second search stage (as used in [21]) by factorization trees. For the 640-bit example the effect was still small, i.e., \({y}_f\) was still relatively large. But for our 768-bit prime the impact soon turned out to be considerable, almost halving the (inflated, compared to [21]) relation collection effort.
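
The idea behind this second-stage approach can be illustrated as follows (a toy sketch of batch smoothness detection; the real implementation organizes the computation in product and remainder trees and works with the actual factor bases and cofactor bounds):

```python
import math
from sympy import primerange

def batch_smooth(candidates, bound):
    """Return the indices of the candidates all of whose prime factors are <= bound."""
    # Product of all factor-base primes; in practice this product and the
    # reductions below are organized in product/remainder trees.
    z = math.prod(primerange(2, bound + 1))
    smooth = []
    for i, x in enumerate(candidates):
        x = abs(x)
        if x <= 1:
            continue
        y = z % x
        # After k squarings y is divisible by p^min(2^k, e) for every prime power
        # p^e dividing x with p <= bound, so y becomes 0 modulo x exactly when
        # every prime factor of x is <= bound.
        for _ in range(max(1, math.ceil(math.log2(x.bit_length())))):
            y = y * y % x
        if y == 0:
            smooth.append(i)
    return smooth

# Toy usage: the second candidate has the large prime factors 97 and 101.
print(batch_smooth([2**4 * 3 * 5 * 49, 97 * 101, 2 * 3 * 5 * 7 * 11], bound=50))   # [0, 2]
```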

The resulting “best” parameters that we settled for are listed in Table 1 (though for some special q primes larger factor bases were used), along with the parameters used in [21] for the 768-bit number field sieve factorization. The clear difference is that the choices for 768-bit factoring were optimized for speed during relation collection (collecting relations until the after-preprocessing matrix dimension and weight were found to be acceptable), whereas our choices try to squeeze as many relations as possible out of every special q prime under a relatively restrictive smoothness regime. Compared to [21], \(\#B_f\) is reduced by a factor of a bit more than two, the number of special q primes is reduced by a factor of more than twenty, the number of large primes per relation is cut from \(4+3\) to \(2+2\) with a large prime bound that is reduced by a factor of \(2^4\) from \(2^{40}\) to \(2^{36}\), while \(\#B_g\) remains unchanged and the search space is on average (forced to be) more than \(2^8\) times larger. As a result the number of relations per core unit of time drops by a factor of about sixteen compared to [21].

The first preprocessing attempt that resulted in the hoped-for matrix dimension and weight occurred when 1.09e10 relations had been collected, i.e., about six times fewer relations than in [21]. The resulting overall relation collection effort thus became \(\frac{16}{6}\cdot 1500= 4000\) core years. With 920 core years the linear algebra effort was close to but less than the thousand core years that we had hoped to achieve. A much smaller set of relations would in principle have sufficed too, but it would have resulted in a larger linear algebra effort; Sect. 4 below describes the trade-off in more detail.
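
Summarizing the budget so far (core years; the remainder of the 5300 core year total mentioned in the introduction is accounted for by polynomial selection, the database building described below, and other overhead):

```python
relation_collection = 16 / 6 * 1500   # 16x slower per relation, 6x fewer relations than [21]
linear_algebra = 920                  # block Wiedemann, as actually spent
print(relation_collection, linear_algebra, relation_collection + linear_algebra)   # 4000, 920, 4920
```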

Some of the details in Table 1 are listed for completeness; for explanations we refer to [21]. Like most of our computational number theory colleagues, we missed the fact (which apparently had not escaped numerical analysts) that the evaluation stage of the block Wiedemann method can be sped up considerably using Horner’s rule [19]; it would have reduced our overall effort to approximately 5000 core years.

Table 1. Comparison of 768-bit factoring and computing 768-bit prime field discrete logarithms.

Database. The linear algebra step resulted in the (virtual) logarithms of 24 million prime ideals. Spending less than 200 core years for additional sieving and further processing, a database was built containing the about 3e9 logarithms of all prime ideals of norms up to \(2^{35}\).
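
The size of this database is consistent with a prime number theorem estimate: each of the two number fields contributes on average about one degree-one prime ideal per rational prime, giving roughly twice the number of primes up to \(2^{35}\), i.e., about 3e9 ideals (a heuristic back-of-the-envelope check, not how the database was built):

```python
import mpmath

primes_up_to_2_35 = mpmath.li(2**35)   # logarithmic-integral approximation, ~1.45e9
print(2 * primes_up_to_2_35)           # ~2.9e9 prime ideals across the two number fields
```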

Individual Logarithms. Using the database and q-descent, the logarithm of the target t from the introduction was computed in approximately 115 core hours. Similar computations for \(t+1, t+2, \ldots , t+10\) took on average about 96 core hours per logarithm.

With improved software any individual logarithm can now be computed at an average effort of 43 core hours (and a rather large variation, cf. Table 1). Further software enhancements are easily conceivable, but this already insignificant effort underscores the point made in the introduction that once a single number field sieve based attack has been carried out successfully, other attacks against the same prime field are straightforward.

4 Trade-Off

During relation collection occasional preprocessing attempts were made, until the resulting matrix was found to be acceptable. Data about the non-final attempts were not kept, so for the purpose of the present paper part of this work was redone to be able to give an impression of how the linear algebra effort decreases as more relations become available.

Table 2 summarizes the results of these “after the fact” preprocessing attempts of the sets of relations found for special q primes up to increasing bounds \(b_q\), along with the resulting extrapolations of the block Wiedemann efforts (not using Horner’s rule for the evaluation stage). Estimates are also listed for the effort required for a discrete logarithm problem in a 160-bit subgroup of the multiplicative group of a 768-bit prime field. For each set of relations up to five preprocessing attempts were made, and the best was selected depending on the subgroup size; this explains why for five of the 160-bit subgroup entries the matrix dimension and weight are different from those in the 767-bit subgroup entry. The last two rows show the effect of the 1e8 forgotten relations (cf. Table 1): including those and spending more time on constructing the matrix could have reduced the matrix effort by 20 (or 5) core years.

The difference between the linear algebra estimate in the last row for the matrix as actually used and the effort as reported in Table 1 is due to a lower number of nodes on which the experiment was run: for a full execution it would lower the linear algebra effort, but increase the calendar time. The effort required for polynomial selection and individual logarithms is independent of the \(b_q\)-value, and is not included in the “combined effort”. The database building effort may be up to three times larger for the smallest feasible \(b_q\)-value, but is not included either.

The numbers in the “combined effort” columns of Table 2 illustrate the negative returns mentioned in Sect. 3: with more patience to deal with a larger linear algebra problem (that would have required disproportionally more calendar time), our overall effort could have been reduced from 5300 to less than 4000 core years. As in [21], the additional relation collection effort was well spent, because a large block Wiedemann job requires constant attention and any way to reduce the calendar time is welcome.

Note that for the smaller subgroup problem the overall least effort is reduced by a factor smaller than two.

Table 2. Relation collection effort, matrix dimension and weight as a result of preprocessing, estimated linear algebra effort, and the combined effort (all efforts are in core years), when using special q primes up to \(b_q\) and both for 767-bit and 160-bit subgroup orders. The overshoot factor is the ratio of the number of relations and the number of relations for the least \(b_q\) (2.8e8) for which enough relations had been found. Relations are unique and factored and include the free relations.

5 Other Prime Fields

To convince ourselves that our results were not due to unexpected, lucky properties of our choice of prime field, we tested ten other similarly chosen 768-bit primes and roughly compared them to our p with respect to their sieving yield. Define the following eleven transcendental or supposed-to-be transcendental numbers:

[Figure: the eleven constants \(\rho _0,\rho _1,\ldots ,\rho _{10}\), with \(\rho _0=\pi \).]

For \(0\le i\le 10\) let \(\epsilon _i=767-[\frac{\log \rho _i}{\log 2}]\) and let \(p_i\) be the least prime larger than \(2^{\epsilon _i}\rho _i\) for which \(\frac{p_i-1}{2}\) is prime as well. Then \(p_0=p\). Let \(\pi _j\) be the number of primes in \([j\cdot \text {1e7},(j+1)\cdot \text {1e7}]\); for \(19\le j\le 62\) these intervals cover our range of special q primes (cf. Table 1).
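
The generation of the test primes can be sketched as follows, shown only for \(\rho _0=\pi \) (which reproduces p). The sketch assumes that \([\cdot ]\) denotes the floor function and relies on the third-party mpmath and sympy packages; the pre-filter with a product of small primes keeps the search for the sparse safe primes manageable.

```python
import math
import mpmath
import sympy

mpmath.mp.prec = 900                     # enough precision to pin down a 768-bit integer part
rho = mpmath.pi                          # rho_0; the other ten constants are in the figure above
eps = 767 - int(mpmath.floor(mpmath.log(rho, 2)))      # eps_i = 767 - [log(rho_i)/log 2]
start = int(mpmath.floor(mpmath.mpf(2)**eps * rho))    # p_i is the least safe prime above 2^eps*rho

small = math.prod(sympy.primerange(3, 10000))          # cheap pre-filter against small odd factors
cand = start + 1 if start % 2 == 0 else start + 2      # least odd integer above 2^eps*rho
while not (math.gcd(cand, small) == 1
           and math.gcd((cand - 1) // 2, small) == 1
           and sympy.isprime(cand)
           and sympy.isprime((cand - 1) // 2)):
    cand += 2                            # safe primes are rare; expect this to take some minutes
print(cand.bit_length(), cand - start)   # 768 bits and, for rho_0, the offset 62762 quoted above
```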

For each of the eleven primes \(p_i\) with \(0\le i\le 10\) the following calculation was carried out:

Polynomial Selection. Find the best pair \((f_i,g_i)\) for \(p_i\) among the first 5e9 candidate polynomials for \(g_i\). (This requires about one core year.)

Sieving Experiments. For \(19\le j\le 62\) find the number \(r_j\) of relations when sieving with the parameters as in Table 1 but with the polynomials \(f_i\) and \(g_i\) and the prime \(p_i\) and for the least special q prime larger than \(j\cdot \text {1e7}+\text {5e6}\). (This requires less than four core days, cf. Table 1.)

Overall Yield Estimate. Let \(R_i=\sum _{j=19}^{62}\pi _jr_j\).

Table 3. Relative performance of \(p=p_0\) compared to ten other choices.

We also carried out the same sieving experiments for the polynomial pair \((f,g)\) from Sect. 3 and \(p=p_0\), finding an overall yield estimate \(R=\text {1.02e10}\). This is less than the 1.09e10 relations reported in Table 1, because some of the sieving jobs behind that count used a larger factor base bound than reported in Table 1, thus producing more duplicates. But it is more than \(R_0\) (which was found to be 8.6e9), matching the expectation that \((f,g)\) is considerably better than \((f_0,g_0)\). Table 3 lists the relative performance of our p compared to the ten new choices: as can be seen in the final row, four of the ten perform better and six are worse, but they are all within a 6% margin from p. It also follows that the core years spent on polynomial selection for our p were well spent.

Although our tests counter suspicions about p being special, it may be argued that in practice primes used in cryptography would be chosen with high entropy [5]. Testing a few “random” primes as well might strengthen our argument. It is unclear to us, however, how such primes may be obtained in a manner that is sufficiently convincing to any suspicious reader, without input from that reader [32].

6 Conclusion

We presented the computation of a discrete logarithm in the multiplicative group of a 768-bit prime field. This is a new record in its category, beating the previous 596-bit record. We showed the beneficial effect of judicious choice of parameters and algorithms, and highlighted the differences with integer factorization. Based on our findings we may conclude that for sizes that are currently within reach of an academic effort, the hardness of factoring and computing prime field discrete logarithms is comparable, though discrete logarithms are harder. Although this was always suspected to be the case, the gap between the two problems is quite a bit smaller than we expected. Compared to the 768-bit factoring record (which required 1700 core years as opposed to our 5300 core years) we used less calendar time and a smaller collaborative and less heterogeneous effort [23]. We also conclude that the explicit 1024-bit estimates from [2, Sect. 4.1] should be redone, as they require not entirely straightforward re-optimization efforts.

Unless algorithmic improvements are proposed or new insights may be expected, pushing for actual new factoring or prime field discrete logarithm records – as opposed to studies that result in reliable estimates – is mostly a waste of energy. We are not aware of any developments based on which we could realistically expect publication of a 1024-bit record within the next, say, five years. As usual, this may change at any moment, but so far the predictions made back in 2009 (cf. [6]) have already turned out to be accurate, or remain valid. In this context it is relevant to note that the project embarked on in [3] is still ongoing.