Abstract
Embedded memory remains a major bottleneck in current integrated circuit design in terms of silicon area, power dissipation, and performance; however, static random access memories (SRAMs) are almost exclusively supplied by a small number of vendors through memory generators, targeted at rather generic design specifications. As an alternative, standard cell memories (SCMs) can be defined, synthesized, and placed and routed as an integral part of a given digital system, providing complete design flexibility, good energy efficiency, low-voltage operation, and even area efficiency for small memory blocks. Yet implementing an SCM block with a standard digital flow often fails to exploit the distinct and regular structure of such an array, leaving room for optimization. In this article, we present a design methodology for optimizing the physical implementation of SCM macros as part of the standard design flow. This methodology introduces controlled placement, leading to a structured, noncongested layout with close to 100% placement utilization, resulting in a smaller silicon footprint, reduced wire length, and lower power consumption compared to SCMs without controlled placement. The methodology is demonstrated on SCM macros of various sizes and aspect ratios in a state-of-the-art 28 nm fully depleted silicon-on-insulator technology, and compared with equivalent macros designed with the noncontrolled, standard flow, as well as with foundry-supplied SRAM macros. The controlled SCMs provide an average 25% reduction in area compared to noncontrolled implementations while achieving a smaller size than SRAM macros of up to 1 Kbyte. Power and performance comparisons of controlled SCM blocks of a commonly found 256 × 32 (1 Kbyte) memory with foundry-provided SRAMs show greater than 65% and 10% reduction in read and write power, respectively, while providing faster access than their SRAM counterparts, despite being of an aspect ratio that is typically unfavorable for SCMs.
In addition, the SCM blocks function correctly with a supply voltage as low as 0.3 V, well below the lower limit of even the SRAM macros optimized for low-voltage operation. The controlled placement methodology is applied within a full-chip physical implementation flow of an OpenRISC-based test chip, providing more than 50% power reduction compared to equivalently sized compiled SRAMs under a benchmark application.
Index Terms
- Power, Area, and Performance Optimization of Standard Cell Memory Arrays Through Controlled Placement