Area-efficient fully digital memory using minimum height standard cells for near-threshold voltage computing
Introduction
The Internet of Things (IoT) has emerged as a new concept in which billions of computers and sensors are interconnected with each other, enabling autonomous exchange of information. In these networks, devices exchange multimedia data, which are much more complex than traditional text-based data. IoT devices are also required to operate correctly for several years on limited battery capacity, so they require SoCs with both high computational power and high energy efficiency. On-chip memories are the primary bottlenecks in these SoCs, so achieving both high throughput and high energy efficiency in these memories is a must.
As a solution, techniques that adaptively scale the supply voltage or the transistors’ back-gate bias have been widely studied over the past 15 years [1], [2], [3]. Simultaneous tuning of the supply voltage and the back-gate bias reduces the energy per operation of a circuit without degrading its speed. For IoT applications, these techniques typically result in near-threshold voltage operation, where the supply voltage is near the threshold voltage. This is because the optimum supply voltage that minimizes the energy per operation of a circuit lies in the near-threshold region [4]. In near-threshold operation, however, it is hard for traditional 6T SRAM macros to guarantee stable operation, since the static noise margin (SNM) degrades as the supply voltage decreases. The major options for near-threshold on-chip memories are: 1) specially designed SRAM macros, and 2) standard-cell based memories (SCMs). SRAM macros typically require full-custom circuit design to guarantee their operation, which entails considerable design effort. As an alternative to SRAM macros, SCMs have been widely studied over the past decade [5], [6], [7], [8]. Since SCMs use only standard-cells, their design effort can be reduced to the level of fully automated cell-based design while keeping their stability even in sub-/near-threshold voltage operation. However, a major drawback of SCMs is that they typically occupy several times the area of SRAM macros. To solve this problem, this paper proposes an area-efficient SCM structure. One of the key ideas is to use area-optimized standard-cells which have the minimum height needed to construct complementary CMOS logic. Unlike several previous area-efficient SCM structures, the proposed SCM is also designed to reduce dynamic energy consumption.
This paper also presents an energy-oriented SCM structure which uses a low-activity readout structure with signal gating instead of a bit-line-based structure, achieving both high area efficiency and high energy efficiency while keeping its operating speed sufficient for IoT applications.
This paper is an extension of our previous work [9], which proposed area-efficient minimum height standard-cells (MHSCs). The area of SCMs can be effectively reduced by shrinking the standard-cells’ height to the minimum needed to construct complementary CMOS logic. SCMs with the MHSC library are designed in a 65-nm FD-SOI process technology and evaluated by post-layout simulation. In this paper, we give a detailed description of the proposed area-optimized standard-cells and newly describe a guideline for designing an MHSC library. We clarify the dominant factors that determine the physical layout of the MHSCs and, based on this clarification, present a systematic design guideline. The stability analysis of MHSCs is another key enhancement of this paper. The MHSCs are vulnerable to process variations since they have smaller transistors than conventional standard-cells. Therefore, the stability of the MHSCs is analyzed to guarantee the yield of the proposed SCM. We use the analytical stability model proposed in [10], [11], which enables us to determine the minimum gate width that guarantees this yield. Whereas [10], [11] use the analytical model to estimate the yield of latches for different supply voltages, this paper uses it to find the minimum height of a latch that guarantees stable operation at a given low operating voltage.
The rest of this paper is organized as follows. Section 2 describes related work and the contributions of this work. The proposed standard-cell structure is presented in Section 3. Then, an energy-efficient SCM structure is presented in Section 4. The proposed SCM is implemented in a 65-nm process technology and evaluated in Section 5. Section 6 concludes this paper.
Section snippets
Related work and our contribution
For over a decade, standard-cell based memories (SCMs) have been widely studied as an alternative to conventional full-custom SRAMs [5], [6], [7], [8]. One of the advantages of SCMs is their stability in low-voltage operation. In low-voltage operation, bit cells are the circuits most vulnerable to within-die (WID) variation. The conventional SRAM bit cell can fail in i) read operation, ii) write operation, or iii) hold operation due to WID variation. The readout or write failures are major
Concept
A conventional standard-cell library includes complex logic gates such as FADD, XOR and DFF. In order to keep these gates routable, a conventional standard-cell has a height of 6, 9 or 12 wire tracks. In SCM designs, however, those complex logic gates are not required since SCMs only have simple logic functions. For example, a typical SCM has the following functions: i) reading values of the storage elements specified by address signals, ii) writing some values to the storage elements specified by
Energy-efficient memory architecture
We consider a dual-port SCM with one read port and one write port which operates in a single clock cycle. We assume that the SCM has R words of C bits each, with an address width of m. A block diagram of the proposed SCM is depicted in Fig. 4. As described in Section 3, latch cells are used as bit cells. In readout operation, the outputs of the latch cells are selected at the R-to-1 MUX by one-hot signals from the read address decoder. In write operation, latch cells labeled “Write
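The read and write paths described above can be illustrated with a small behavioral model. This is only a minimal sketch under stated assumptions, not the circuit proposed in the paper: the class name `StandardCellMemory`, its method names, and the relation R = 2^m are hypothetical choices made for illustration, and clocking, gate-level structure and signal gating are not modeled.

```python
class StandardCellMemory:
    """Behavioral sketch of a latch-based memory with one read and one write port.

    Assumption (not from the source): the number of words is R = 2**m.
    """

    def __init__(self, address_width, bits_per_word):
        self.rows = 1 << address_width      # R words (assumed to be 2**m)
        self.cols = bits_per_word           # C bits per word
        # One latch (modeled as 0 or 1) per bit cell.
        self.latches = [[0] * self.cols for _ in range(self.rows)]

    def decode(self, address):
        """Address decoder: return an R-entry one-hot select vector."""
        if not 0 <= address < self.rows:
            raise ValueError("address out of range")
        return [1 if row == address else 0 for row in range(self.rows)]

    def write(self, address, word):
        """Write port: only the row enabled by the one-hot signal is updated."""
        if not 0 <= word < (1 << self.cols):
            raise ValueError("word does not fit in C bits")
        select = self.decode(address)
        bits = [(word >> col) & 1 for col in range(self.cols)]
        for row, enable in enumerate(select):
            if enable:
                self.latches[row] = bits

    def read(self, address):
        """Read port: an R-to-1 MUX per bit, written as an AND-OR over the rows."""
        select = self.decode(address)
        word = 0
        for col in range(self.cols):
            bit = 0
            for row in range(self.rows):
                bit |= self.latches[row][col] & select[row]
            word |= bit << col
        return word


scm = StandardCellMemory(address_width=3, bits_per_word=8)
scm.write(5, 0xA7)
print(hex(scm.read(5)))
```

Because exactly one select line is high, every unselected latch output is masked to 0 before the OR, which is the functional behavior of an R-to-1 MUX driven by one-hot signals.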
Implementation of the proposed SCM in a 65-nm FD-SOI process technology
In this section, the proposed MHSCM is implemented in a 65-nm FD-SOI process technology as a case study. We first analyze the stability of standard-cells to determine the yield-driven width discussed in Section 3. Based on the design method in Section 3, we design an MHSC library in the 65-nm FD-SOI process technology. After that, we design MHSCMs with several configurations using the MHSC library and evaluate their performance. The performance is compared with the prior-art SCMs and SRAMs designed
Conclusion
This paper proposes an energy- and area-efficient SCM structure aimed at near-threshold operation. Minimum height standard-cells with simplified latches are designed to address the area overhead of SCMs. An energy-aware readout/write scheme is then presented. By utilizing an AOI22-NAND2-NOR2 readout scheme with signal gating, the proposed SCM reduces dynamic energy consumption. Evaluation results using a 65-nm FD-SOI process technology show that the proposed SCM with a 4 kb capacity achieves the
Acknowledgments
This work has been partly supported by KAKENHI Grant-in-Aid for Scientific Research 16H01713, 26280013 and 16J08694 from JSPS. This work is also supported by VLSI Design and Education Center (VDEC), the University of Tokyo in collaboration with Cadence Design Systems, Inc. and Synopsys, Inc.
Jun Shiomi received his B.E. degree in Electronic Engineering in 2014, and the M.E. degree in Communications and Computer Engineering in 2016 both from Kyoto University, Kyoto, Japan. He is currently pursuing a Ph.D. degree in Communications and Computer Engineering at Kyoto University. He is a Research Fellow of the Japan Society for the Promotion of Science. His research interests include modeling and computer-aided design for low power and low voltage system-on-chips.
References (18)
- et al., Joint dynamic voltage scaling and adaptive body biasing for heterogeneous distributed real-time embedded systems, IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. (2005)
- S. Martin, K. Flautner, T. Mudge, D. Blaauw, Combined dynamic voltage scaling and adaptive body biasing for lower power...
- A. Basu, S.-C. Lin, V. Wason, A. Mehrotra, K. Banerjee, Simultaneous optimization of supply and threshold voltages for...
- S. Jain, S. Khare, S. Yada, V. Ambili, P. Salihundam, S. Ramani, S. Muthukumar, M. Srinivasan, A. Kumar, S. Gb, R....
- et al., A 180-mV subthreshold FFT processor using a minimum energy design methodology, IEEE J. Solid-State Circuits (2005)
- et al., Benchmarking of standard-cell based memories in the sub-VT domain in 65-nm CMOS technology, IEEE Trans. Emerg. Sel. Top. Circuits Syst. (2011)
- A. Teman, D. Rossi, P. Meinerzhagen, L. Benini, A. Burg, Controlled placement of standard cell memory arrays for high...
- O. Andersson, B. Mohammadi, P. Meinerzhagen, J. Rodrigues, A 35 fJ/bit-access Sub-VT memory using a dual-bit...
- J. Shiomi, T. Ishihara, H. Onodera, Fully digital on-chip memory using minimum height standard cells for near-threshold...
Cited by (5)
- Approximation-Based System Implementation for Real-Time Minimum Energy Point Tracking over a Wide Operating Performance Region, IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences (2023)
- A Standard Cell Memory Based on 2T Gain Cell DRAM for Memory-Centric Accelerator Design, International System on Chip Conference (2023)
- Zero-Aware Fine-Grained Power Gating for Standard-Cell Memories in Voltage-Scaled Circuits, International System on Chip Conference (2022)
- Minimum energy computing via supply and threshold voltage scaling, Multi-Processor System-on-Chip 1: Architectures (2021)
- On-chip cache architecture exploiting hybrid memory structures for near-threshold computing, IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences (2019)
Tohru Ishihara received his B.E., M.E., and Dr.E. degrees in computer science from Kyushu University in 1995, 1997 and 2000, respectively. From 1997 to 2000, he was a Research Fellow of the Japan Society for the Promotion of Science. For the next three years he worked as a Research Associate in the VLSI Design and Education Center, the University of Tokyo. From 2003 to 2005, he was with Fujitsu Laboratories of America, Inc. as a member of research staff. From 2005 to 2011, he was with System LSI Research Center, Kyushu University as an Associate Professor. In 2011, he joined Kyoto University, where he is currently an Associate Professor in the Department of Communications and Computer Engineering. His research interests include low power design and methodologies for embedded systems.
Hidetoshi Onodera received the B.E., M.E., and Dr. Eng. degrees in Electronic Engineering from Kyoto University, Kyoto, Japan, in 1978, 1980, and 1984, respectively. He joined the Department of Electronics, Kyoto University, in 1983, and is currently a Professor in the Department of Communications and Computer Engineering, Graduate School of Informatics, Kyoto University. His research interests include design technologies for Digital, Analog, and RF LSIs, with particular emphasis on low-power design, design for manufacturability, and design for dependability. Dr. Onodera served as the Program Chair and General Chair of ICCAD and ASP-DAC. He was the Chairman of the IPSJ SIG-SLDM (System LSI Design Methodology), the IEICE Technical Group on VLSI Design Technologies, the IEEE SSCS Kansai Chapter, and the IEEE CASS Kansai Chapter. He has served as the Editor-in-Chief of IEICE Transactions on Electronics and IPSJ Transactions on System LSI Design Methodology.