Reducing Cache Conflicts by Multi-Level Cache Partitioning and Array Elements Mapping

The Journal of Supercomputing

Abstract

This article presents an algorithm that reduces cache conflicts and improves cache locality. The proposed algorithm analyzes the locality reference space of each reference pattern, partitions the multi-level cache into several parts of different sizes, and then maps array data onto the scheduled cache positions to eliminate cache conflicts. A greedy method for rearranging array variables in declaration statements is also developed to reduce the memory overhead of mapping arrays onto a partitioned cache. In addition, loop tiling is combined with the proposed schemes to exploit opportunities for both temporal and spatial reuse. Atom is used to build a simulator of a direct-mapped cache, which demonstrates that the approach is effective at reducing the number of cache conflicts and exploiting cache locality. Experimental results show that applying the cache partitioning scheme can greatly reduce cache conflicts and thus save program execution time in both single-level and multi-level cache hierarchies.
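The idea of mapping arrays onto separate cache partitions can be illustrated with a small simulation. The sketch below is not the authors' algorithm or their Atom-based simulator; it is a minimal direct-mapped cache model under assumed parameters (a 1 KB cache, 32-byte lines, 8-byte elements, and the hypothetical helper names `count_misses` and `interleaved_trace`). It compares two arrays whose base addresses collide in the cache with the same arrays after the second one is shifted into its own half of the cache.

```python
def count_misses(addresses, cache_size=1024, line_size=32):
    """Simulate a direct-mapped cache and return the number of misses."""
    num_lines = cache_size // line_size
    tags = [None] * num_lines           # tag currently held by each cache line
    misses = 0
    for addr in addresses:
        block = addr // line_size       # memory block containing this address
        index = block % num_lines       # the one cache line this block may use
        tag = block // num_lines
        if tags[index] != tag:          # line is empty or holds another block
            misses += 1
            tags[index] = tag
    return misses


def interleaved_trace(base_a, base_b, n, elem_size=8):
    """Byte addresses touched by: for i in range(n): use a[i], then b[i]."""
    trace = []
    for i in range(n):
        trace.append(base_a + i * elem_size)
        trace.append(base_b + i * elem_size)
    return trace


# Assumed, illustrative parameters (not taken from the article).
CACHE_SIZE = 1024   # bytes
LINE_SIZE = 32      # bytes
N = 64              # elements per array: 64 * 8 = 512 bytes, half the cache

# Layout 1: b starts exactly one cache size after a, so a[i] and b[i]
# always map to the same cache line and evict each other.
conflicting = count_misses(
    interleaved_trace(0, CACHE_SIZE, N), CACHE_SIZE, LINE_SIZE)

# Layout 2: b is shifted by an extra half cache size, so a occupies the
# lower half of the cache and b the upper half (two partitions).
partitioned = count_misses(
    interleaved_trace(0, CACHE_SIZE + CACHE_SIZE // 2, N), CACHE_SIZE, LINE_SIZE)

print(conflicting, partitioned)
```

In this toy setting the shifted layout leaves only the compulsory misses (one per cache line touched), while the colliding layout misses on every access. The cost is the unused half-cache-sized gap in memory before the second array, which is the kind of memory overhead the abstract's greedy rearrangement of array declarations is meant to reduce.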




Cite this article

Chang, CY., Sheu, JP. & Chen, HC. Reducing Cache Conflicts by Multi-Level Cache Partitioning and Array Elements Mapping. The Journal of Supercomputing 22, 197–219 (2002). https://doi.org/10.1023/A:1014982819342

  • DOI: https://doi.org/10.1023/A:1014982819342
