- 1. D. F. Bacon, S. L. Graham, and O. J. Sharp. Compiler transformations for high-performance computing. ACM Computing Surveys, 26(4):345-420, Dec. 1994.
- 2. R. Berrendorf and H. Ziegler. PCL: The Performance Counter Library: A Common Interface to Access Hardware Performance Counters on Microprocessors (Version 1.2). Technical Report FZJ-ZAM-IB-9816, 1998/99. Available at http://www.fz-juelich.de/zam/PCL/.
- 3. S. Chatterjee and S. Sen. Cache-efficient matrix transposition. In Proceedings of the Sixth IEEE International Symposium on High-Performance Computer Architecture, pages 195-205, 2000.
- 4. S. Coleman and K. S. McKinley. Tile size selection using cache organization and data layout. ACM SIGPLAN Notices, 30(6):279-290, June 1995.
- 5. D. Gannon, W. Jalby, and K. Gallivan. Strategies for cache and local memory management by global program transformation. Journal of Parallel and Distributed Computing, 5(5):587-616, Oct. 1988.
- 6. S. Ghosh, M. Martonosi, and S. Malik. Cache miss equations: A compiler framework for analyzing and tuning memory behavior. ACM Transactions on Programming Languages and Systems, 21(4):703-746, Nov. 1999.
- 7. M. Kandemir, J. Ramanujam, and A. Choudhary. Improving cache locality by a combination of loop and data transformations. IEEE Transactions on Computers, 48(2), 1999.
- 8. I. Kodukula, K. Pingali, R. Cox, and D. Maydan. An experimental evaluation of tiling and shackling for memory hierarchy management. In Proceedings of the ACM International Conference on Supercomputing, pages 482-490, 1999.
- 9. C. Leopold. Arranging statements and data of program instances for locality. Future Generation Computer Systems, 14:293-311, 1998.
- 10. C. Leopold. Generating structured program instances with a high degree of locality. In Proceedings of the 8th Euromicro Workshop on Parallel and Distributed Processing, pages 267-274. IEEE Computer Society Press, 2000.
- 11. K. S. McKinley, S. Carr, and C.-W. Tseng. Improving data locality with loop transformations. ACM Transactions on Programming Languages and Systems, 18(4):424-453, July 1996.
- 12. S. S. Muchnick. Advanced Compiler Design and Implementation. Morgan Kaufmann Publishers, 1997.
- 13. N. Mukhopadhyay. On the Effectiveness of Feedback-Guided Parallelization. PhD thesis, University of Manchester, 1999.
- 14. G. Rivera and C.-W. Tseng. A comparison of compiler tiling algorithms. In Proceedings of the International Conference on Compiler Construction, pages 168-182. Springer LNCS 1575, 1999.
- 15. G. Rivera and C.-W. Tseng. Locality optimizations for multi-level caches. In Proceedings of SC'99, 1999. Available at http://w3.csc.ucm.es/Otros/sc99/techpap.htm.
- 16. O. Temam, E. D. Granston, and W. Jalby. To copy or not to copy: A compile-time technique for assessing when data copying should be used to eliminate cache conflicts. In Proceedings of IEEE Supercomputing'93. IEEE Computer Society Press, 1993.
- 17. M. E. Wolf and M. S. Lam. A data locality optimizing algorithm. ACM SIGPLAN Notices, 26(6):30-44, June 1991.
- 18. M. J. Wolfe. High Performance Compilers for Parallel Computing. Addison-Wesley, 1996.
Index Terms
- Exploiting non-uniform reuse for cache optimization