Abstract
Public-key cryptosystems normally spend most of their execution time in a small fraction of the program code, typically in an inner loop. The performance of these critical code sections can be significantly improved by customizing the processor’s instruction set and microarchitecture. This paper shows the advantages of instruction set extensions for accelerating cryptographic workloads such as long integer modular arithmetic. We define two custom instructions for performing multiply-and-add operations on unsigned integers (single-precision words). Both instructions can be executed efficiently by a (32 × 32 + 32 + 32)-bit multiply/accumulate (MAC) unit. Thus, the proposed extensions are simple to integrate into standard 32-bit RISC cores like the MIPS32 4Km. We present an optimized assembly routine for fast multiple-precision multiplication with “finely” integrated Montgomery reduction (FIOS method). Simulation results demonstrate that the custom instructions double the processor’s arithmetic performance compared to a standard MIPS32 core.
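To make the arithmetic in the abstract concrete, the following Python sketch models a word-level Montgomery multiplication in which every inner-loop step is a single operation of the form a × b + c + d on 32-bit words. It is an illustration only and not the paper's assembly routine: the helper names (`mac`, `to_words`, `from_words`, `mont_mul_fios`), the use of two separate carry words, and the exact loop layout are assumptions, since the abstract does not specify the semantics of the two custom instructions. The sketch relies on the fact that (2^32 − 1)^2 + 2(2^32 − 1) = 2^64 − 1, so such a result always fits in two words.

```python
W_BITS = 32
MASK = (1 << W_BITS) - 1

def mac(a, b, c, d):
    """Hypothetical model of a (32 x 32 + 32 + 32)-bit MAC: returns (hi, lo)."""
    r = a * b + c + d            # never exceeds 2**64 - 1 for 32-bit inputs
    return r >> W_BITS, r & MASK

def to_words(x, s):
    """Split x into s little-endian 32-bit words."""
    return [(x >> (W_BITS * i)) & MASK for i in range(s)]

def from_words(ws):
    """Reassemble an integer from little-endian 32-bit words."""
    return sum(w << (W_BITS * i) for i, w in enumerate(ws))

def mont_mul_fios(a, b, n, s):
    """Return a*b*R^-1 mod n with R = 2**(32*s); requires n odd, a, b < n < R."""
    A, B, N = to_words(a, s), to_words(b, s), to_words(n, s)
    n0_inv = (-pow(N[0], -1, 1 << W_BITS)) & MASK   # -n^-1 mod 2^32 (Python 3.8+)
    t = [0] * (s + 1)
    for i in range(s):
        # Multiplication and reduction are interleaved inside one inner loop.
        c1, lo = mac(A[0], B[i], t[0], 0)
        m = (lo * n0_inv) & MASK                    # quotient word for this round
        c2, _ = mac(m, N[0], lo, 0)                 # low word is zero by choice of m
        for j in range(1, s):
            c1, lo = mac(A[j], B[i], t[j], c1)      # accumulate a[j] * b[i]
            c2, lo = mac(m, N[j], lo, c2)           # accumulate m * n[j]
            t[j - 1] = lo                           # implicit shift by one word
        top = t[s] + c1 + c2
        t[s - 1] = top & MASK
        t[s] = top >> W_BITS
    r = from_words(t)
    return r - n if r >= n else r                   # final conditional subtraction
```

For example, `mont_mul_fios(a, b, n, 4)` with a 128-bit odd modulus `n` yields the same value as `a * b * pow(2**128, -1, n) % n`. The final subtraction is data-dependent, so this sketch makes no claim about resistance to timing analysis.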
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Großschädl, J., Kamendje, GA. (2004). Optimized RISC Architecture for Multiple-Precision Modular Arithmetic. In: Hutter, D., Müller, G., Stephan, W., Ullmann, M. (eds) Security in Pervasive Computing. Lecture Notes in Computer Science, vol 2802. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39881-3_22
DOI: https://doi.org/10.1007/978-3-540-39881-3_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-20887-7
Online ISBN: 978-3-540-39881-3
eBook Packages: Springer Book Archive