research-article

Adaptive loop caching using lightweight runtime control flow analysis

Authors:
Marisha Rawlins, University of Florida, FL, USA
Ann Gordon-Ross, University of Florida, FL, USA

ACM Transactions on Embedded Computing Systems, Volume 12, Issue 1s, Article No. 55, pp. 1–23, https://doi.org/10.1145/2435227.2435251

Published: 29 March 2013

Abstract

Loop caches provide an effective method for decreasing memory hierarchy energy consumption by storing frequently executed code (critical regions) in a more energy-efficient structure than the level one cache. However, due to code structure restrictions or costly design-time pre-analysis efforts, previous loop cache designs are not suitable for all applications and system scenarios. We present an adaptive loop cache that is amenable to a wider range of system scenarios and can provide an additional 20% average instruction cache energy savings (with individual benchmark energy savings as high as 69%) compared to the next best loop cache, the preloaded loop cache.

Index Terms

Adaptive loop caching using lightweight runtime control flow analysis
1. Hardware
  1. Integrated circuits
    1. Semiconductor memory
      1. Dynamic memory

Published in

ACM Transactions on Embedded Computing Systems Volume 12, Issue 1s
Special section on ESTIMedia'12, LCTES'11, rigorous embedded systems design, and multiprocessor system-on-chip for cyber-physical systems
March 2013
701 pages
ISSN:1539-9087
EISSN:1558-3465
DOI:10.1145/2435227

Copyright © 2013 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
Publisher
Association for Computing Machinery
New York, NY, United States

Journal Family
ACM Journals for the Design of Smart and Connected Systems
Publication History
- Published: 29 March 2013
- Accepted: 1 May 2011
- Revised: 1 January 2011
- Received: 1 September 2010
Author Tags
Loop cache
architecture tuning
embedded systems
filter cache
low energy
Qualifiers
- research-article
- Research
- Refereed
Article Metrics
- Total Citations: 1
- Total Downloads: 134
- Downloads (Last 12 months): 5
- Downloads (Last 6 weeks): 0