Reducing Cache Conflicts by Multi-Level Cache Partitioning and Array Elements Mapping

Chang, Chih-Yung; Sheu, Jang-Ping; Chen, Hsi-Chiuen

doi:10.1023/A:1014982819342


Published: June 2002

Volume 22, pages 197–219, (2002)

The Journal of Supercomputing



Abstract

This article presents an algorithm that reduces cache conflicts and improves cache locality. The proposed algorithm analyzes the locality reference space of each reference pattern, partitions the multi-level cache into several parts of different sizes, and then maps array data onto the scheduled cache positions to eliminate cache conflicts. A greedy method for rearranging array variables in declaration statements is also developed to reduce the memory overhead of mapping arrays onto a partitioned cache. In addition, loop tiling is combined with the proposed schemes to exploit opportunities for both temporal and spatial reuse. Atom is used to build a simulation of a direct-mapped cache, which demonstrates that the approach is effective at reducing the number of cache conflicts and exploiting cache locality. Experimental results show that applying the cache partitioning scheme can greatly reduce cache conflicts and thus save program execution time in both single-level and multi-level cache hierarchies.
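
The kind of conflict the abstract refers to can be illustrated with a small simulation. The sketch below is not the authors' algorithm or their Atom-based simulator; it is a minimal direct-mapped cache model under assumed parameters (a 1024-byte cache, 32-byte lines, 8-byte array elements, and the helper names `count_misses` and `interleaved_trace`, all hypothetical). It compares two arrays whose base addresses collide in the cache with the same arrays after one base is shifted so that each array occupies its own half of the cache.

```python
def count_misses(addresses, cache_bytes=1024, line_bytes=32):
    """Simulate a direct-mapped cache and return the number of misses."""
    num_lines = cache_bytes // line_bytes
    tags = [None] * num_lines          # one stored tag per cache line
    misses = 0
    for addr in addresses:
        block = addr // line_bytes     # memory block number
        index = block % num_lines      # the only line this block may occupy
        tag = block // num_lines       # distinguishes blocks sharing a line
        if tags[index] != tag:
            misses += 1
            tags[index] = tag          # evict whatever was there
    return misses


def interleaved_trace(base_a, base_b, n, elem_bytes=8):
    """Addresses touched by a loop that reads a[i] and b[i] in turn."""
    trace = []
    for i in range(n):
        trace.append(base_a + i * elem_bytes)
        trace.append(base_b + i * elem_bytes)
    return trace


CACHE = 1024   # assumed cache size in bytes
N = 64         # elements per array: 64 * 8 = 512 bytes, half the cache

# Conflicting layout: the bases differ by exactly the cache size, so
# a[i] and b[i] always map to the same cache line and evict each other.
conflicting = count_misses(interleaved_trace(0, CACHE, N), CACHE)

# Partitioned layout: b is shifted by half the cache, so a uses the
# lower half of the cache lines and b uses the upper half.
partitioned = count_misses(interleaved_trace(0, CACHE + CACHE // 2, N), CACHE)

print(conflicting, partitioned)
```

With these assumed parameters every access misses in the conflicting layout, while the shifted layout leaves only the compulsory miss for each cache line that is touched. The shift costs 512 bytes of padding in memory, which is the kind of overhead the greedy rearrangement of array declarations is meant to reduce.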



Author information

Authors and Affiliations

Department of Computer and Information Science, Aletheia University, 32 Chen-Li St., Tamsui, Taipei, Taiwan
Chih-Yung Chang, Jang-Ping Sheu & Hsi-Chiuen Chen


About this article

Cite this article

Chang, CY., Sheu, JP. & Chen, HC. Reducing Cache Conflicts by Multi-Level Cache Partitioning and Array Elements Mapping. The Journal of Supercomputing 22, 197–219 (2002). https://doi.org/10.1023/A:1014982819342


Issue Date: June 2002
DOI: https://doi.org/10.1023/A:1014982819342
