Low-power compact composite field AES S-Box/Inv S-Box design in 65 nm CMOS using Novel XOR Gate
Highlights
► New full-custom S-Box/Inv S-Box AES using composite field arithmetic by resource sharing.
► Implementation using a novel low power 2-input XOR gate with only six devices.
► New XOR gate offers the lowest propagation delay and power consumption.
► Implementation using 65 nm CMOS technology and the area of the S-Box is 288 μm² with 158 logic gates.
► Critical path delay of 7.322 ns, throughput of 130 Mbps and power dissipation of 0.09 μW (0.8 V).
Introduction
As cryptography plays a crucial role in the security of data transmission, AES, based on the Rijndael algorithm [1], was selected as a data encryption standard by the National Institute of Standards and Technology (NIST), through a selection process initiated in 1997, on the primary criteria of security, performance, efficiency in software and hardware implementation, and flexibility. AES is one of the most common symmetric encryption algorithms and is widely adopted for a variety of encryption needs, such as wireless networks and secure transactions via the Internet. AES can be implemented on a wide range of platforms under different constraints [2]. In portable applications, computing resources are usually limited and a dedicated hardware implementation of the security process is essential [39]. Implementation using a Field Programmable Gate Array (FPGA) is not suitable for such applications, mainly due to size and power constraints. Because an FPGA is a general-purpose logic array, some logic and I/O blocks usually remain unused, and consequently a highly compact implementation is difficult to achieve. In addition, FPGA implementations are prone to power analysis attacks induced by switching noise [42]. A compact, small-footprint full-custom chip is more suitable in such a case. Moreover, such a dedicated hardwired AES implementation can provide a higher data rate than software packages for fast handling of ciphered network data packets in applications such as routers. The hardwired implementation is also physically more secure, since tampering by an attacker is more difficult. The overall efficiency of an AES hardware implementation in terms of size, speed, security and power dissipation depends largely on the AES architecture [40]. For high throughput, a loop-unrolled pipelined structure [4] is used, whereas to save power and area an iterative single-round structure with resource sharing is implemented.
The S-Box is at the core of any AES implementation and is considered a full complexity design consuming the major portion of the power and energy budget of the AES hardware. This paper focuses on an area-efficient, low-voltage and low-power CMOS implementation of the S-Box/Inv S-Box. Various techniques have been reported for implementing the S-Box to satisfy varying criteria such as power, speed and delay for different applications. Among them there are two main streams: (a) Implementation using look-up tables (LUTs), which store all 256 predefined 8-bit values of the S-Box in a Read-Only Memory (ROM). The advantage of using a LUT is that it offers a shorter critical path. However, it has the drawback of an unbreakable delay path [3] in pipelined designs, and hence it is not suitable for high speed applications. This delay prohibits each round unit from being divided into more than two sub-stages to achieve any further increase in processing speed [41]. It also requires a large area to implement both AES encryption and decryption, as a different table is used in each case. (b) The alternative is to design the S-Box circuit using combinational logic directly from its arithmetic operations. This approach has a breakable delay path for S-Box processing. Other S-Box architectures, such as the positive polarity Reed–Muller structure [6], the binary decision diagram (BDD) [7], or its variant, the twisted binary decision diagram (TBDD) [8], can achieve a high speed design but suffer from an extremely large area cost. The S-Box designs based on sum of product (SOP) expressions in Refs. [9], [10], [11] also suffer from a large silicon area penalty.
A well-known approach to designing the S-Box from its arithmetic operations involves multiplicative inversion in GF(2⁸) using composite field arithmetic [12], [13], decomposing the field operations from GF(2⁸) to GF((2⁴)²). Subfield arithmetic is thus used in the computation of an inverse in the Galois Field. In this technique, hardware area cost can be reduced substantially by sharing the multiplicative inverse step between the SubBytes and the InvSubBytes operations. Also, among existing techniques, the composite field S-Box architecture is the most area-efficient approach for the AES encryption/decryption algorithm, as the computation cost of certain Galois Field operations is lower when the operation is performed in an isomorphic composite field. The authors in Ref. [14] reported a fast composite field S-Box architecture that showed a throughput increase of 56.25% along with a 40%–60% reduction in pipeline latency compared with other conventional designs. The approaches in Refs. [2], [15] result in a very small S-Box, but suffer from a longer critical path than the LUT technique. The LUT technique, on the other hand, has a shorter critical path than the composite field approach, but its area is 2–3 times larger.
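As a minimal sketch of the composite field inversion described above, the following Python models an element of GF((2⁴)²) as a pair of 4-bit values (high nibble, low nibble) and computes its inverse using only GF(2⁴) operations. The field polynomials x⁴ + x + 1 and x² + x + λ with λ = {1100}₂, and all function names, are assumptions chosen for illustration; the polynomial coefficients actually used in the paper are not given in this excerpt. The isomorphic mapping between GF(2⁸) and the composite field is omitted.

```python
# Assumed field polynomials (illustrative, not taken from the paper):
GF16_POLY = 0b10011   # x^4 + x + 1 over GF(2)
LAMBDA = 0b1100       # constant of x^2 + x + LAMBDA over GF(2^4)


def gf16_mul(a, b):
    """Multiply two GF(2^4) elements by shift-and-add with reduction."""
    r = 0
    for _ in range(4):
        if b & 1:
            r ^= a            # addition in GF(2^n) is a bitwise XOR
        b >>= 1
        a <<= 1
        if a & 0x10:          # degree 4 reached: reduce by the field polynomial
            a ^= GF16_POLY
    return r


def gf16_inv(a):
    """Inverse in GF(2^4) as a^14 (maps 0 to 0)."""
    r = 1
    for _ in range(14):
        r = gf16_mul(r, a)
    return r


def cf_mul(a, b):
    """Multiply in GF((2^4)^2); a byte holds (high nibble, low nibble)."""
    ah, al = a >> 4, a & 0xF
    bh, bl = b >> 4, b & 0xF
    hh = gf16_mul(ah, bh)
    # Uses x^2 = x + LAMBDA to reduce the degree-2 term.
    high = hh ^ gf16_mul(ah, bl) ^ gf16_mul(al, bh)
    low = gf16_mul(hh, LAMBDA) ^ gf16_mul(al, bl)
    return (high << 4) | low


def cf_inv(a):
    """Inverse in GF((2^4)^2) built from GF(2^4) operations only."""
    ah, al = a >> 4, a & 0xF
    # d = ah^2 * LAMBDA + al * (ah + al); nonzero whenever a is nonzero
    d = gf16_mul(gf16_mul(ah, ah), LAMBDA) ^ gf16_mul(al, ah ^ al)
    d_inv = gf16_inv(d)
    return (gf16_mul(ah, d_inv) << 4) | gf16_mul(ah ^ al, d_inv)


# x^2 + x + LAMBDA must have no root in GF(2^4) to define a field.
assert all(gf16_mul(y, y) ^ y ^ LAMBDA != 0 for y in range(16))
```

In this sketch the only inversion needed is the 4-bit one in `gf16_inv`, which is the source of the area saving attributed to the composite field approach; everything else reduces to GF(2⁴) multiplications and XORs.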
Next, considering the S-Box design methodologies reported so far, only Refs. [16], [17], [36] evaluated the performance of the S-Box using the full-custom design technique. The advantage of full-custom design in state-of-the-art CMOS processes is that all transistors can be scaled down with the process without degrading overall performance, and in most cases with increased speed. This leads to a smaller chip area and lower power consumption. Another design methodology is to reduce power consumption by using an advanced process technology that supports a very low supply voltage. This approach also leads to a reduction in die area.
S-Box architectures, especially those based on the composite field approach, use the XOR gate as the fundamental logic function along with AND gates. Consequently, enhancing the performance of the XOR gates can significantly improve the critical path performance and die area of the S-Box design. In this paper, we present a low-power design methodology for the S-Box/Inv S-Box which includes minimizing the overall circuit size and critical path delay by implementing a new XOR gate, scaling down the supply voltage and the transistor size, and choosing an advanced technology for optimized CMOS full-custom design. Our approach of optimized full-custom S-Box/Inv S-Box implementation in low cost isomorphic composite field arithmetic using low power, minimal transistor count XOR gates has not been considered before in the context of AES implementations. To the best of the authors' knowledge, most reported works use standard static CMOS XOR logic gates requiring 12 transistors, resulting in a larger overall silicon area in spite of any architectural optimization. In addition, a minimized implementation of InvSubBytes for the Inv S-Box by sharing S-Box resources on the same chip was not considered in many previously reported works.
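The resource sharing between the S-Box and the Inv S-Box can be illustrated with a small functional model. The sketch below is not the paper's circuit: it computes the GF(2⁸) inverse by plain exponentiation rather than composite field logic, and the function names and the `decrypt` flag are hypothetical. It only shows that a single inversion block can serve both directions, with the standard AES affine transformation applied after the inverse for SubBytes and its inverse applied before the inverse for InvSubBytes.

```python
AES_POLY = 0x11B  # x^8 + x^4 + x^3 + x + 1, the AES field polynomial


def gf256_mul(a, b):
    """Multiply two GF(2^8) elements modulo the AES polynomial."""
    r = 0
    for _ in range(8):
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= AES_POLY
    return r


def gf256_inv(a):
    """Shared multiplicative inverse, computed as a^254 (maps 0 to 0)."""
    r = 1
    for _ in range(254):
        r = gf256_mul(r, a)
    return r


def rotl8(x, n):
    """Rotate an 8-bit value left by n positions."""
    return ((x << n) | (x >> (8 - n))) & 0xFF


def affine(x):
    """AES affine transformation used by SubBytes."""
    return x ^ rotl8(x, 1) ^ rotl8(x, 2) ^ rotl8(x, 3) ^ rotl8(x, 4) ^ 0x63


def inv_affine(x):
    """Inverse affine transformation used by InvSubBytes."""
    return rotl8(x, 1) ^ rotl8(x, 3) ^ rotl8(x, 6) ^ 0x05


def sbox_shared(x, decrypt=False):
    """One datapath for both directions; the flag plays the role of a mux select."""
    pre = inv_affine(x) if decrypt else x   # input-side selection
    inv = gf256_inv(pre)                    # the block both directions share
    return inv if decrypt else affine(inv)  # output-side selection
```

For example, `sbox_shared(0x53)` evaluates to `0xED`, and `sbox_shared(0xED, decrypt=True)` recovers `0x53`.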
Section snippets
AES algorithm and S-Box implementation preliminaries
AES is a symmetric encryption algorithm which processes a fixed 128-bit data block with a key length of 128, 192 or 256 bits. The data block is mapped into a 4×4 array of byte elements called the State matrix. Each byte in the State is considered an element in GF(2⁸) and denoted by Sij. AES is also an iterative algorithm which runs for 10, 12 or 14 rounds depending on the key length. AES contains four different data transformations: SubBytes,
Design methodology and proposed S-Box/Inv S-Box architecture
The proposed S-Box/Inv S-Box architecture employs combinational logic using composite field arithmetic based on Ref. [3] and optimized in Ref. [32] with a different choice of the polynomial coefficients and the implementation of the constant multiplication with λ. The S-Box is implemented using XOR circuits, multiplexers and AND gates. The optimization of the low voltage and low power composite field S-Box implementation has been further enhanced in this paper by using a new six-transistor XOR
Novel XOR gate for low power CMOS Galois field arithmetic
From the above Galois Field arithmetic for the S-Box and the corresponding Figs. 3–9, it is evident that the implementation of the S-Box/Inv S-Box requires a large number of XOR operations, whose efficient and low power implementation can result in a substantially improved CMOS S-Box hardware design.
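To make the dependence on XOR gates concrete, the sketch below writes a GF(2⁴) multiplier as explicit bit-level equations, where every `&` corresponds to a 2-input AND gate and every `^` to a 2-input XOR gate. The irreducible polynomial x⁴ + x + 1 and the function names are assumptions for illustration; the equations are not copied from the paper's figures, and no gate sharing or other optimization is attempted.

```python
def gf16_mul_gates(a, b):
    """GF(2^4) multiplication as AND/XOR bit equations (assumes x^4 + x + 1)."""
    a0, a1, a2, a3 = a & 1, (a >> 1) & 1, (a >> 2) & 1, (a >> 3) & 1
    b0, b1, b2, b3 = b & 1, (b >> 1) & 1, (b >> 2) & 1, (b >> 3) & 1

    # Unreduced polynomial product coefficients c0..c6 (16 ANDs, 9 XORs).
    c0 = a0 & b0
    c1 = (a0 & b1) ^ (a1 & b0)
    c2 = (a0 & b2) ^ (a1 & b1) ^ (a2 & b0)
    c3 = (a0 & b3) ^ (a1 & b2) ^ (a2 & b1) ^ (a3 & b0)
    c4 = (a1 & b3) ^ (a2 & b2) ^ (a3 & b1)
    c5 = (a2 & b3) ^ (a3 & b2)
    c6 = a3 & b3

    # Reduction with x^4 = x + 1, x^5 = x^2 + x, x^6 = x^3 + x^2 (6 XORs).
    q0 = c0 ^ c4
    q1 = c1 ^ c4 ^ c5
    q2 = c2 ^ c5 ^ c6
    q3 = c3 ^ c6
    return q0 | (q1 << 1) | (q2 << 2) | (q3 << 3)


def gf16_mul_ref(a, b):
    """Reference shift-and-add multiplier used to check the bit equations."""
    r = 0
    for _ in range(4):
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x10:
            a ^= 0b10011
    return r
```

In this unoptimized form a single GF(2⁴) multiplier already uses 16 AND and 15 XOR operations, and a composite field inverter contains several such multipliers, which is why the cost of each XOR gate weighs heavily on the total area and power.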
Compact S-Box/Inv S-Box chip and comparison with other designs
The hardware architecture in Fig. 2 is implemented to perform both encryption and decryption with the S-Box and Inv S-Box sharing the same hardware. It is an improved modification of the architecture in Ref. [32], with the inclusion of the inverse S-Box, which was not implemented in Ref. [32]. This modification enables the implementation of Inverse SubBytes for decryption by reusing the same S-Box resources. Using the novel low-power and low-area XOR gate of the previous section, a circuit
Conclusion
This paper presents a full custom hardware implementation of low power AES S-Box/Inv S-Box architecture in the 65 nm CMOS process employing circuit level optimization. The design demonstrated a new approach to minimize silicon-area of S-Box design by using a new 2-input XOR gate for low-power composite field arithmetic in order to reduce the power dissipation and delay for the overall circuit. The results indicate that our design is suitable for applications which require small area and low
Acknowledgment
The authors wish to acknowledge the anonymous reviewers for their comments, which helped in enhancing the quality of the paper. Acknowledgment is also due to Dr. Shaun Cooper of the Institute of Information and Mathematical Sciences for discussions on Galois Field arithmetic.
References (46)
- et al. Low-power clock-less hardware implementation of the Rijndael S-box for wireless sensor networks. The Journal of China Universities of Posts and Telecommunications (2007)
- et al. The Design of Rijndael (2002)
- et al. Area, delay, and power characteristics of standard-cell implementations of the AES S-Box. Journal of Signal Processing Systems (2008)
- et al. A compact Rijndael hardware architecture with S-Box optimization, ASIACRYPT 2001. Lecture Notes in Computer Science (2001)
- et al. Architectures and VLSI implementations of the AES-proposal Rijndael. IEEE Transactions on Computers (2002)
- et al. High-speed VLSI architectures for the AES algorithm. IEEE Transactions on VLSI Systems (2004)
- S. Morioka, A. Satoh, An optimized S-box circuit architecture for low power AES design, in: Proceedings of the Workshop...
- Graph-Based Algorithms for Boolean Function Manipulation. IEEE Transactions on Computers (1986)
- et al. A 10-Gbps Full-AES crypto design with a twisted BDD S-Box architecture. IEEE Transactions on VLSI Systems (2004)
- N. Ahmad, R. Hasan, W.M. Jubadi, Design of AES S-box using combinational logic optimization, in: Proceedings of the...
- Compact S-Box for AES, Workshop on Cryptographic Hardware and Embedded Systems 2005 (CHES 2005). Lecture Notes in Computer Science
- A high security and low-power AES S-Box full-custom design for wireless sensor network. Proceedings of the International Conference on Wireless Communications, Networking and Mobile Computing
- Design and analysis of low-power 10-transistor full adders using novel XOR-XNOR gates. IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing
- Performance analysis of low-power 1-bit CMOS full adder cells. IEEE Transactions on VLSI Systems
- Low-voltage low-power CMOS full adder. IET Proceedings on Circuits, Devices and Systems
- New XOR/XNOR and full adder circuits for low voltage, low power applications. Microelectronics Journal
Nabihah Ahmad received the B.S. degree in electrical, electronic and system engineering from Universiti Kebangsaan Malaysia (UKM) and the M.S. degree in electronic engineering from Universiti Tun Hussein Onn Malaysia (UTHM) in 2002 and 2006, respectively. She is currently a Ph.D. candidate with the Center for Research in Analog and VLSI Microsystem Design at the School of Engineering and Advanced Technology, Massey University, New Zealand. Her research interests include low power VLSI circuit design, cryptography algorithms, and architectures for low power digital systems.
S. M. Rezaul Hasan received his Ph.D. in Electronics Engineering from the University of California Los Angeles (UCLA) in 1985. From 1983 to 1986 he was a VLSI design engineer at Xerox Microelectronics Center in El Segundo, CA, where he worked in the design of CMOS VLSI microprocessors. In 1986 he moved to the Asia-Pacific region and served several institutions including Nanyang Technological University, Singapore (1986–1988), Curtin University of Technology, Perth, Western Australia (1990–1991) and University Sains Malaysia, Perak, Malaysia (1992–2000). At University Sains Malaysia he held the position of Associate Professor and was the coordinator of the Analog and VLSI research laboratory. He spent the next four years (2000–2004) in the West Asia-Gulf region where he served as an Associate Professor of Microelectronics, Integrated Circuit Design and VLSI Design in the Department of Electrical and Computer Engineering at the University of Sharjah, Sharjah, United Arab Emirates. While in Sharjah he received the National Bank of Sharjah Award for outstanding research publication in Integrated Circuit Design. Presently he is the Director of the Center for Research in Analog and VLSI microsystems dEsign (CRAVE) at Massey University, Auckland, New Zealand. He is also a senior faculty member within the School of Engineering and Advanced Technology (SEAT) in Electronics and Computer Engineering, teaching courses in Advanced Microelectronics and Integrated Circuit Design. He has published over 138 papers in international journals and conferences in the areas of Analog, Digital, RF and Mixed-Signal Integrated Circuit Design and VLSI Design. Dr. Hasan has also served as a consultant for many electronics companies. His present areas of interest include Analog and RF Integrated Circuit and Microsystem Design, VLSI signal processing, CMOS sensors, CMOS Bioelectronics and Biological (gene-protein) Circuit Design.
He is a senior member of the IEEE and an editor of the Hindawi journal Active and Passive Electronic Components.