DOI: 10.1145/3307650.3322233
research-article

Designing vertical processors in monolithic 3D

Published: 22 June 2019

Abstract

A processor laid out vertically in stacked layers can benefit from reduced wire delays, low energy consumption, and a small footprint. Such a design can be enabled by Monolithic 3D (M3D), a technology that provides short wire lengths, good thermal properties, and high integration. In current M3D technology, due to manufacturing constraints, the layers in the stack are asymmetric: the bottom-most one has a relatively higher performance.
In this paper, we examine how to partition a processor for M3D. We partition logic and storage structures into two layers, taking into account that the top layer has lower-performance transistors. In logic structures, we place the critical paths in the bottom layer. In storage structures, we partition the hardware unequally, assigning to the top layer fewer ports with larger access transistors, or a shorter bitcell subarray with larger bitcells. We find that, with conservative assumptions on M3D technology, an M3D core executes applications on average 25% faster than a 2D core, while consuming 39% less energy. With aggressive technology assumptions, the M3D core performs even better: it is on average 38% faster than a 2D core and consumes 41% less energy. Further, under a similar power budget, an M3D multicore can use twice as many cores as a 2D multicore, executing applications on average 92% faster with 39% less energy. Finally, an M3D core is thermally efficient.
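
As a rough reading aid, the reported averages can be converted into ratios relative to the 2D baseline. The sketch below assumes that "X% faster" denotes a speedup of 1 + X/100 (so execution time scales by the inverse) and that "Y% less energy" denotes an energy ratio of 1 - Y/100. The function name `relative_metrics`, the dictionary labels, and the derived energy-delay product (EDP) are illustrative assumptions, not quantities reported in the abstract. Averages over applications also do not compose exactly this way, so the derived numbers are only indicative.

```python
def relative_metrics(speedup_pct, energy_saving_pct):
    """Convert 'X% faster, Y% less energy' into ratios relative to a 2D core.

    Assumption: 'X% faster' is read as a speedup of (1 + X/100).
    """
    time_ratio = 1.0 / (1.0 + speedup_pct / 100.0)
    energy_ratio = 1.0 - energy_saving_pct / 100.0
    return {
        "time": time_ratio,
        "energy": energy_ratio,
        # Energy-delay product: derived here, not reported in the abstract.
        "edp": time_ratio * energy_ratio,
    }


# (speedup %, energy saving %) as stated in the abstract.
reported = {
    "M3D core, conservative": (25, 39),
    "M3D core, aggressive": (38, 41),
    "M3D multicore, 2x cores": (92, 39),
}

results = {name: relative_metrics(*pair) for name, pair in reported.items()}

for name, m in results.items():
    print(f"{name}: time x{m['time']:.3f}, "
          f"energy x{m['energy']:.3f}, EDP x{m['edp']:.3f}")
```

Under these assumptions, the conservative M3D core would finish in roughly four fifths of the 2D execution time; other readings of "faster" would give different ratios.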


Published In

ISCA '19: Proceedings of the 46th International Symposium on Computer Architecture
June 2019
849 pages
ISBN:9781450366694
DOI:10.1145/3307650

Sponsors

In-Cooperation

  • IEEE-CS\DATC: IEEE Computer Society

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. 3D integration
  2. monolithic 3D
  3. processor architecture

Qualifiers

  • Research-article

Conference

ISCA '19

Acceptance Rates

ISCA '19 Paper Acceptance Rate 62 of 365 submissions, 17%;
Overall Acceptance Rate 543 of 3,203 submissions, 17%
