Abstract
This paper presents a systematic approach that integrates compiler optimization of data layout with traditional loop transformations to reduce cache coherence overhead. A formal model based on an interference graph, an overview of the optimization algorithms, and an example are given. Excerpts from an empirical evaluation of the complexity of the compiler analysis, and from a simulation study of the resulting reductions in bus traffic and execution time, are also presented. Additional details appear in [7].
References
Chang, L.Y., and Dietz, H.G., Data Layout and Loop Restructuring for Paged Memory Systems, TR-EE 90-43, Purdue University, 1990.
Deo, N., Graph Theory with Applications to Engineering and Computer Science, Prentice-Hall, 1974, pp. 314.
Fang, Z., Cache or Local Memory Thrashing and Compiler Strategy in Parallel Processing System, ICPP, 1990, vol. II, pp. 271–275.
Eggers, S.J., and Katz, R.H., A Characterization of Sharing in Parallel Programs and its Application to Coherency Protocol Evaluation, Proceedings of the 15th Annual International Symposium on Computer Architecture, Honolulu, HI (May 1988), pp. 373–383.
Eggers, S.J., and Katz, R.H., The Effect of Sharing on the Cache and Bus Performance of Parallel Programs, Proceedings of the 3rd International Conference on Architectural Support for Programming Languages and Operating Systems, Boston, MA (April 1989).
Hennessy, J.L., and Patterson, D.A., Computer Architecture: A Quantitative Approach, Morgan Kaufmann Publishers, Inc. 1990.
Ju, Y.J., Compiler Data Layout and Code Transformation for Reducing Cache Coherence Overhead, Ph.D. dissertation, School of Electrical Engineering, Purdue University, 1991.
Katz, R.H., Eggers, S.J., Wood, D., Perkins, C.L., and Sheldon, R., Implementing a Cache Consistency Protocol, Proceedings of the 12th Annual International Symposium on Computer Architecture, 13, 3 (June 1985), pp. 276–283.
Mace, M.E., Memory Storage Patterns in Parallel Processing, Kluwer Academic Publishers, 1987.
Owicki, S., and Agarwal, A., Evaluating the Performance of Software Cache Coherency, ASPLOS III, April 1989.
Padua, D.A., and Wolfe, M., More Iteration Space Tiling, International Conference on Supercomputing, Reno, Nevada, November 1989, 655–664.
Torrellas, J., Lam, M.S., and Hennessy, J.L., Shared Data Placement Optimization to Reduce Multiprocessor Cache Miss Rates, ICPP, 1990, vol. II, pp. 266–270.
Weber, W., and Gupta, A., Analysis of Cache Invalidation Patterns in Multiprocessors, 3rd International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-III), April 1989, pp. 243–256.
Wolfe, M.J., Optimizing Supercompilers for Supercomputers, Ph.D. Thesis, Univ. of Illinois, October 1982.
Copyright information
© 1992 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ju, Y.J., Dietz, H. (1992). Reduction of cache coherence overhead by compiler data layout and loop transformation. In: Banerjee, U., Gelernter, D., Nicolau, A., Padua, D. (eds) Languages and Compilers for Parallel Computing. LCPC 1991. Lecture Notes in Computer Science, vol 589. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0038675
DOI: https://doi.org/10.1007/BFb0038675
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-55422-6
Online ISBN: 978-3-540-47063-2
eBook Packages: Springer Book Archive