DOI: 10.1145/1973009.1973028
research-article

Design and management of 3D-stacked NUCA cache for chip multiprocessors

Published: 02 May 2011

ABSTRACT

Power and delay caused by long on-chip interconnections are becoming major issues in chip multiprocessor design. Both network-on-chip (NoC) and three-dimensional integration are promising ways to mitigate the interconnection problem. In this paper, we explore the design of a 3D-stacked non-uniform cache architecture (NUCA) with an on-chip network. In addition, this paper investigates the problem of partitioning a shared L2 cache among multiple concurrently executing applications in order to improve system performance in terms of instructions per cycle. The proposed design is evaluated in an integrated power, performance, and temperature simulator. Experimental results show that the proposed method improves system performance by 23.3% and reduces energy consumption by 17.9% for a 16-core processor system compared to a conventional design.
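
To make the shared-cache partitioning problem concrete, the following is a minimal sketch and not the method proposed in the paper. It assumes a hypothetical per-application miss curve (`miss_curves[a][w]` is the miss count of application `a` when it owns `w` cache ways) and hands out ways greedily by marginal gain. The function name, the miss-curve representation, and the way-granularity model are illustrative assumptions; the sketch also ignores the non-uniform bank latencies that a NUCA design would additionally have to account for.

```python
def partition_ways(miss_curves, total_ways):
    """Greedily split `total_ways` cache ways among applications.

    miss_curves[a][w] is the (hypothetical) miss count of application `a`
    when it owns `w` ways; every curve needs total_ways + 1 entries.
    Returns a list with the number of ways assigned to each application.
    """
    n = len(miss_curves)
    if total_ways < n:
        raise ValueError("need at least one way per application")

    # Assumption: every application is guaranteed at least one way.
    alloc = [1] * n

    for _ in range(total_ways - n):
        # Marginal gain: misses avoided by granting one more way.
        gains = [
            miss_curves[a][alloc[a]] - miss_curves[a][alloc[a] + 1]
            for a in range(n)
        ]
        # Give the next way to the application that benefits most
        # (ties go to the lowest-numbered application).
        best = max(range(n), key=lambda a: gains[a])
        alloc[best] += 1

    return alloc


if __name__ == "__main__":
    # Two made-up applications sharing a 4-way cache.
    curves = [
        [100, 60, 30, 20, 15],  # cache-sensitive application
        [50, 45, 43, 42, 42],   # cache-insensitive application
    ]
    print(partition_ways(curves, 4))
```

A greedy allocation of this kind is only guaranteed to minimize total misses when each curve shows diminishing returns; for other curve shapes it is a heuristic.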

    • Published in

      GLSVLSI '11: Proceedings of the 21st edition of the Great Lakes Symposium on VLSI
      May 2011
      496 pages
      ISBN: 9781450306676
      DOI: 10.1145/1973009

      Copyright © 2011 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      Overall Acceptance Rate: 312 of 1,156 submissions, 27%
