Elsevier

Computer Networks

Volume 49, Issue 4, 15 November 2005, Pages 476-491

A note on efficient implementation of prime generation algorithms in small portable devices

https://doi.org/10.1016/j.comnet.2004.12.007

Abstract

This paper investigates existing prime generation algorithms on small portable devices, optimizes them, and compares their efficiency. A performance comparison shows that the bit array algorithm is the most efficient of the existing prime generation algorithms. The paper further optimizes the implementation of the bit array algorithm by choosing an optimal value for a key parameter, namely the small prime set used in its sieving procedure, and provides a method for estimating that optimal set. The paper also gives generalized bit array algorithms that can find primes with special constraints, namely DSA primes and strong primes. Finally, the algorithms are implemented on a smart card and a PDA for validation. The results show that generating special primes sacrifices very little efficiency relative to generating random primes, and that using optimal sets of small primes yields a 30–200% efficiency improvement.

Introduction

Cryptographic functions based on public key cryptography [6], [19] have gained increasing attention from the research and commercial communities, as well as from end users. Public key cryptography can add security to a wide variety of applications. In particular, it is a valuable tool for simplifying key management and enabling secure communications. Recently, there has been a strong trend toward using public key cryptography in small portable devices, such as smart cards and handheld PDAs, so that they can perform secure transactions. Those devices should be able to implement one or more public key cryptographic systems. Examples of public key algorithms that can be used by portable devices are RSA [17], Diffie–Hellman [7] and DSA [15]. Among them, RSA is the most popular public key cryptosystem and has been widely deployed in portable devices to support protocols (e.g. communication protocols).

Many of the currently available portable devices have limited hardware resources and computing power. Consequently, public key cryptographic functions can run much more slowly on those devices than on desktop computers. For this reason, many studies have used specially designed hardware and software approaches to overcome the limited hardware resources and improve the performance of certain cryptographic functions [10], [13], [16].

This paper focuses on one of the important problems in public key cryptosystems: the generation of large random primes on resource-constrained devices. Due to the nature of the prime generation procedure, which is discussed later in the paper, generating large random primes is very time-consuming. For instance, generating 1024-bit primes, which can correspond to 2048-bit RSA key pairs, may take several minutes on devices such as smart cards. Some applications need a higher security level that requires even larger primes, e.g. 2048-bit primes, and generating those usually takes much longer. One reason for the low performance of large prime generation is the difficulty of finding efficient primality testing algorithms. Another is that the existing prime generation algorithms and their implementations have not been sufficiently investigated, particularly for small portable devices.

A number of prime generation algorithms have been implemented on small portable devices, with widely varying performance. Unfortunately, most implementations simply use whichever prime generation algorithm is available, without regard to the large performance differences among them. As a result, some implementations run for an unreasonable amount of time. The total time for key generation can be very long and sometimes unacceptable, especially when a group of keys needs to be generated or when low-end tamper resistant devices are used. In addition, many small portable devices have limited storage space, particularly the non-persistent storage used for temporary values. The problem is aggravated when several applications compete for the limited storage. Prime generation algorithms should therefore minimize their storage requirements, which the usual publicly available prime generation algorithms do not consider. It is thus necessary to optimize both the performance and the storage requirements of prime generation algorithms for small portable devices.

This paper investigates existing prime generation algorithms, optimizes them, and compares their efficiency in terms of the time and memory space required. It thus provides a reference for software engineers who implement large prime generation on small portable devices where resources are limited. The paper first discusses one of the most widely used approaches to prime generation, incremental search, and then makes several optimizations to the incremental search prime generation algorithm. The performance study shows that the table lookup and bit array algorithms are the most efficient among all the algorithms examined. The storage requirements are also compared. The bit array algorithm requires significantly less memory than the table lookup algorithm and is therefore the best choice among the algorithms when both time and memory efficiency are considered.

One of the important issues in optimizing incremental search prime generation algorithms is the choice of a small prime set (SPS) for the sieve procedure. The paper analyzes the factors that affect the choice of the optimal SPS sets when generating primes of different sizes on portable devices, and proposes a method that can predict those sets instead of searching for them exhaustively. Experiments show that, compared with some commonly used SPS sets, optimal SPS sets improve the efficiency of prime generation by 30% in the worst case and 200% in the best case. The paper follows with a discussion of a prime generation algorithm called the constructive method [10], [11], which differs from incremental search. The paper compares the performance of the bit array algorithm using incremental search with that of the constructive method and shows that the bit array algorithm outperforms it. It can therefore be concluded that the bit array algorithm using incremental search is the most efficient among all the available prime generation algorithms that can be implemented on small portable devices.

Some public key cryptosystems place additional requirements on the primes to be generated. For example, RSA requires random primes, or strong primes to meet the X9.31 standard; the DSS requires DSA primes; and Diffie–Hellman systems require random or safe primes. This paper describes how the bit array algorithm can be generalized to generate primes guaranteed to satisfy such constraints while preserving efficiency.

This paper uses smart cards and PDAs for discussion and validation. However, the results presented can serve as a guideline for any other form of resource-constrained computing system that needs to implement prime generation algorithms.

Section snippets

Background

Apart from being mathematically interesting, the efficient generation of prime numbers is well known to be of extreme importance in modern cryptography. Today, a prime generation implementation should be able to generate random primes that are at least 512 bits long. In many cases, it will need to generate 1024-bit, 2048-bit or longer primes in order to support enhanced security levels in public key cryptosystems.

A generic approach for prime generation is to

Prime generation algorithms

In its generic form, an incremental search algorithm chooses a new candidate by adding 2 to the previous one whenever that candidate fails the primality test oracle, and continues until a probable prime is found. The procedure for generating an l-bit prime number using incremental search is outlined below.

  • Step 1. Generate an l-bit odd candidate q.

  • Step 2. Perform primality testing on q.

  • Step 3. If q is a probable prime, output q and terminate the search.

  • Step 4. Increment q by 2 and go to Step 2.
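The four steps above can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, a Miller-Rabin test is assumed as a stand-in for the primality test oracle, and the restart when the search runs past l bits is an added safeguard that the steps do not mention.

```python
import random

def is_probable_prime(n, rounds=20):
    """Miller-Rabin test, assumed here as the primality test oracle."""
    if n < 2:
        return False
    # Quick exit for multiples of a few tiny primes (and for those primes themselves).
    for p in (2, 3, 5, 7, 11, 13):
        if n % p == 0:
            return n == p
    # Write n - 1 as d * 2**s with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # a is a witness that n is composite
    return True

def incremental_search(l):
    """Return an l-bit probable prime found by incremental search."""
    while True:
        # Step 1: random l-bit odd candidate (top and bottom bits forced to 1).
        q = random.getrandbits(l) | (1 << (l - 1)) | 1
        # Steps 2-4: test q, and on failure move on to the next odd number.
        while q.bit_length() == l:
            if is_probable_prime(q):
                return q
            q += 2
        # The search ran past l bits without success: start again.
```

For example, `incremental_search(512)` returns a 512-bit odd integer that passed 20 Miller-Rabin rounds. Every failed candidate costs a full oracle call here, which is the cost a sieve procedure is meant to reduce.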

In Step 2, a primality test oracle is used for

Optimizations on the sieve procedure

Three different implementations of the sieve procedure have been used for incremental search prime generation: the test-division algorithm (TDA), the table-lookup algorithm (TLU) and the bit array algorithm (BTA). The test-division algorithm is used by many prime generation applications, including RSAREF [20] and OpenSSL. It directly divides each candidate by the elements of the SPS set to check whether the candidate is a composite with a small prime factor, as shown in Fig. 1
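A minimal Python sketch of two of these sieves follows. Test division is written from the description above. The bit array version is an assumption: it uses a common way of sieving an interval of odd candidates, computing one remainder of the starting candidate per small prime and then marking every multiple in a packed bit array, and may differ in detail from the paper's own algorithm. All function names are hypothetical.

```python
def small_primes(bound):
    """Odd primes up to bound (2 is left out because all candidates are odd)."""
    flags = [True] * (bound + 1)
    primes = []
    for n in range(2, bound + 1):
        if flags[n]:
            if n > 2:
                primes.append(n)
            for m in range(n * n, bound + 1, n):
                flags[m] = False
    return primes

def passes_test_division(q, sps):
    """Test division: one big-number remainder per small prime, per candidate."""
    return all(q % p != 0 for p in sps)

def bit_array_sieve(q0, sps, size):
    """Sieve the odd candidates q0, q0 + 2, ..., q0 + 2*(size - 1) in one pass.

    q0 must be odd. Bit i of the result is 1 when q0 + 2*i has a factor in sps.
    """
    marks = bytearray((size + 7) // 8)
    for p in sps:
        r = q0 % p  # the only big-number operation needed for this prime
        # Solve q0 + 2*i = 0 (mod p); for odd p the inverse of 2 is (p + 1) // 2.
        first = (-r * ((p + 1) // 2)) % p
        for i in range(first, size, p):
            marks[i >> 3] |= 1 << (i & 7)
    return marks

def is_marked(marks, i):
    """Read bit i of the packed array."""
    return (marks[i >> 3] >> (i & 7)) & 1 == 1
```

In this sketch, test division repeats its big-number remainders for every candidate, while the bit array sieve performs them once per small prime for the whole interval and spends one bit of memory per candidate.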

Sieve performance and prime generation time

Algorithms using the three different sieve methods were implemented on the HP iPAQ 5500 pocket PC. We measured the performance overheads of the three sieve procedures using different upper boundaries for the SPS sets. The results are shown in Fig. 5, where the x-axis represents the upper boundaries for the SPS sets and the y-axis represents the overheads in seconds.

From Fig. 5, we can see clearly that the test division algorithm is much slower than the other two algorithms. The bit array and

Optimal SPS sets

As already discussed, it is important to choose a proper set of small primes in the sieve procedure for prime generations. Initially, increasing the size of the SPS set will substantially improve the performance of prime generations because it reduces the number of calls to the primality test oracle. However, the improvement will become less and less significant as the size of the SPS set grows. After some point, the increase due to the sieve procedure overhead will exceed the improvement
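The trade-off described above can be illustrated with a toy cost model in Python. This is not the paper's estimation method: the model, the function names, and the timing parameters `t_sieve` (sieve cost per small prime) and `t_test` (cost of one oracle call) are all assumptions made for illustration. It relies on two standard facts: the fraction of odd numbers with no factor among the odd primes p in the SPS is the product of (1 - 1/p), and primes near 2^l have density about 1/(l ln 2), so roughly l ln 2 / 2 odd candidates are scanned per prime found.

```python
import math

def odd_primes_up_to(bound):
    """Odd primes up to bound, by a simple sieve of Eratosthenes."""
    flags = [True] * (bound + 1)
    primes = []
    for n in range(2, bound + 1):
        if flags[n]:
            if n > 2:
                primes.append(n)
            for m in range(n * n, bound + 1, n):
                flags[m] = False
    return primes

def survival_fraction(bound):
    """Fraction of odd candidates with no prime factor up to bound."""
    fraction = 1.0
    for p in odd_primes_up_to(bound):
        fraction *= 1.0 - 1.0 / p
    return fraction

def estimated_cost(bound, bits, t_sieve, t_test):
    """Toy model: sieve work plus expected oracle calls times oracle cost."""
    # Expected number of odd candidates scanned before a prime of this size appears.
    candidates = bits * math.log(2) / 2
    sieve_cost = len(odd_primes_up_to(bound)) * t_sieve
    oracle_cost = candidates * survival_fraction(bound) * t_test
    return sieve_cost + oracle_cost

def best_bound(bits, t_sieve, t_test, bounds):
    """Pick the SPS upper boundary with the lowest modeled cost."""
    return min(bounds, key=lambda b: estimated_cost(b, bits, t_sieve, t_test))
```

Calling `best_bound(1024, t_sieve, t_test, bounds)` with timings measured on a target device would give a first guess at the SPS upper boundary. In this model, the faster the sieve is relative to the oracle, the larger the boundary that comes out best.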

Efficiency of constructive prime generation

As mentioned earlier, a large prime can also be generated by using the constructive method [10], [11] as given in Fig. 9.

The algorithm generates candidates that are coprime to η, which is the product of all the elements in SPS(r), thus increasing the probability that the candidates are prime. Despite several obvious differences from incremental search, the constructive prime generation algorithm needs to use the primality test oracle, as in incremental search, for examining the

Generation of special primes

Some public key cryptosystems require primes of special forms instead of, or in addition to, random primes. The special primes needed by public key cryptosystems include DSA primes, strong primes and safe primes. In this section, we therefore modify the bit array algorithm to generate those special primes efficiently.

We give a generalized form of incremental search which uses an incremental value k other than 2. Consequently, the sequence of candidates that will be searched is q, q + 2w, q + 4w, 

Conclusion and future work

This paper compares the efficiency of implementations of various prime generation algorithms on small portable devices. It is shown that the bit array algorithm is the most efficient algorithm for generating probable primes. The paper further optimizes the bit array algorithm by discussing how to choose an important parameter, the optimal SPS set. The implementations show that the performance of prime generation can be substantially improved by using the optimal SPS sets. Besides random

Acknowledgement

The authors are grateful to the anonymous reviewers for their valuable comments. The research was partly supported by Microsoft.

Chenghuai Lu is a Ph.D. student in the College of Computing at Georgia Tech. His research interests include optimizing cryptographic functions on tamper resistant devices and side-channel cryptographic analysis.

References (20)

  • E. Bach et al., Algorithmic Number Theory, Foundations of Computing, Volume I: Efficient Algorithms (1996)
  • W. Bosma, M.P. Van Der Hulst, Primality proving with cyclotomy. Doctoral Dissertation, University of Amsterdam,...
  • D.M. Bressoud, Factorization and Primality Testing (1989)
  • J. Brandt, I. Damgard, P. Landrock, Speeding-up prime number generation, in: Proceedings of the ASIACRYPT’91, pp....
  • J. Brandt, I.B. Damgard, On generation of probable primes by incremental search, in: Proceedings of the CRYPTO’92, pp....
  • W. Diffie, M. Hellman, Multiuser cryptographic techniques, in: Proceedings of AFIPS National Computer Conference, 1976,...
  • W. Diffie et al., New directions in cryptography, IEEE Transactions on Information Theory (1976)
  • P.X. Gallagher, On the distribution of primes in short intervals, Mathematika (1976)
  • G.H. Hardy et al., Some problems of ‘Partitio Numerorum’, on the expression of a number as a sum of primes, Acta Mathematica (1922)
  • M. Joye, P. Paillier, S. Vaudenay, Efficient generation of prime numbers, CHES 2000, pp....
There are more references available in the full text version of this article.

Andre L.M. Dos Santos is an assistant professor in the College of Computing at Georgia Tech. His research interests include distributed systems security, security of systems using tamper resistant devices, and vulnerability analysis. He received a Ph.D. in computer science from University of California, Santa Barbara.
