DOI: 10.1145/871506.871569
Article

Reducing data cache energy consumption via cached load/store queue

Published: 25 August 2003

ABSTRACT

High-performance processors use a large set-associative L1 data cache with multiple ports. As clock speeds and cache sizes increase, such a cache consumes a significant percentage of the total processor energy. This paper proposes a method of saving energy by reducing the number of data cache accesses. It does so by modifying the Load/Store Queue (LSQ) design to allow "caching" of previously accessed data values on both loads and stores after the corresponding memory access instruction has been committed. A 32-entry modified LSQ design allows an average of 38.5% of the loads in the SpecINT95 benchmarks and 18.9% in the SpecFP95 benchmarks to get their data from the LSQ. The reduction in the number of L1 cache accesses results in up to a 40% reduction in L1 data cache energy consumption and up to a 16% improvement in the energy-delay product, while requiring almost no additional hardware or complex control logic.
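The idea in the abstract can be illustrated with a toy model. The sketch below is not the paper's design: the class name `CachedLSQ`, the one-entry-per-address layout, the FIFO eviction, and the word-granular addresses are all assumptions made for illustration. It shows only how keeping address/value pairs in the queue after commit lets later loads skip the L1 data cache.

```python
from collections import OrderedDict


class CachedLSQ:
    """Toy model (hypothetical): entries keep address -> value after commit."""

    def __init__(self, entries=32):
        self.entries = entries
        self.queue = OrderedDict()  # address -> value, oldest entry first
        self.lsq_hits = 0           # loads served from the queue
        self.cache_accesses = 0     # accesses that reach the L1 data cache

    def _insert(self, addr, value):
        # Assumption: one entry per address, FIFO eviction when full.
        if addr in self.queue:
            del self.queue[addr]
        elif len(self.queue) >= self.entries:
            self.queue.popitem(last=False)  # drop the oldest entry
        self.queue[addr] = value

    def load(self, addr, memory):
        if addr in self.queue:
            # Value is still held by a committed entry: no cache access.
            self.lsq_hits += 1
            return self.queue[addr]
        self.cache_accesses += 1
        value = memory.get(addr, 0)
        self._insert(addr, value)
        return value

    def store(self, addr, value, memory):
        # Assumption: a store still writes the cache, but its value stays
        # in the queue so that later loads can reuse it.
        self.cache_accesses += 1
        memory[addr] = value
        self._insert(addr, value)


memory = {0x100: 7}
lsq = CachedLSQ(entries=2)
a = lsq.load(0x100, memory)   # not in the queue: goes to the cache
b = lsq.load(0x100, memory)   # reuses the value kept by the earlier load
lsq.store(0x104, 9, memory)   # store value is kept in the queue
c = lsq.load(0x104, memory)   # reuses the value kept by the store
```

In this model, the fraction of loads counted in `lsq_hits` plays the role of the 38.5% and 18.9% figures reported in the abstract. A real LSQ would also have to handle access sizes, partial overlaps, and coherence, which the sketch ignores.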


    • Published in

      ISLPED '03: Proceedings of the 2003 international symposium on Low power electronics and design
      August 2003
      502 pages
      ISBN:158113682X
      DOI:10.1145/871506

      Copyright © 2003 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Acceptance Rates

      ISLPED '03 paper acceptance rate: 90 of 221 submissions, 41%. Overall acceptance rate: 398 of 1,159 submissions, 34%.
