Abstract
This paper considers the use of tiling in compiling reduction statements of the C[] language. A class of statements is identified for which the tiling transformation is proven correct, and a scheme is given for transforming them into a sequence of reduction statements of a wide class. Formulas derived from a cache interference model make it possible to compute the tiling parameters accurately. The code that the C[] compiler generates for reduction statements is shown to be comparable in efficiency with specially designed subroutines, and often better.
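To make the idea of tiling a reduction concrete, here is a minimal sketch in plain C rather than C[]. It is not the code the C[] compiler generates: the function names `matvec_plain` and `matvec_tiled` and the `tile` parameter are hypothetical, and the tile size is passed in by the caller instead of being computed from a cache interference model as the paper does. The reduction shown is a matrix-vector product, where each `y[i]` is a sum over one row.

```c
#include <assert.h>
#include <stddef.h>

/* Untiled reference: y[i] = sum over j of a[i*n + j] * x[j]
   (a is an m-by-n matrix stored row by row). */
void matvec_plain(size_t m, size_t n,
                  const double *a, const double *x, double *y)
{
    for (size_t i = 0; i < m; ++i) {
        double s = 0.0;
        for (size_t j = 0; j < n; ++j)
            s += a[i * n + j] * x[j];
        y[i] = s;
    }
}

/* Tiled variant (illustrative assumption, not the paper's scheme):
   the j range is split into blocks of `tile` elements, and every row
   consumes the current block of x before the next block is touched,
   so that block can be reused while it is still in cache. */
void matvec_tiled(size_t m, size_t n, size_t tile,
                  const double *a, const double *x, double *y)
{
    assert(tile > 0);

    for (size_t i = 0; i < m; ++i)
        y[i] = 0.0;

    for (size_t jj = 0; jj < n; jj += tile) {
        /* End of the current block, clamped to n for the last block. */
        size_t jend = (n - jj < tile) ? n : jj + tile;

        for (size_t i = 0; i < m; ++i) {
            double s = y[i];            /* partial sum from earlier blocks */
            for (size_t j = jj; j < jend; ++j)
                s += a[i * n + j] * x[j];
            y[i] = s;
        }
    }
}
```

Both functions add the terms of each row in the same order, so the tiled version changes only the memory access pattern. Whether it runs faster depends on the cache geometry and the chosen `tile`, which is the parameter the paper's formulas are meant to determine.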
Cite this article
Kalinov, A.Y., Lastovetsky, A.L., Ledovskikh, I.N. et al. Compilation of Vector Statements of C[] Language for Architectures with Multilevel Memory Hierarchy. Programming and Computer Software 27, 111–122 (2001). https://doi.org/10.1023/A:1010957814813
DOI: https://doi.org/10.1023/A:1010957814813