Abstract
Although Imagine provides an efficient memory hierarchy, straightforward programming of scientific applications does not match that hierarchy and thereby constrains the performance of stream applications. In this paper, we explore a novel matrix-based programming optimization that improves memory hierarchy performance so that it can sustain the operands needed for highly parallel computation. Specifically, we formulate the problem on the Data&Computation Matrix (D&C Matrix), which we propose to abstract the relationship between streams and kernels, and we present the key techniques for improving multilevel bandwidth utilization based on this matrix. Experimental evaluation on five representative scientific applications shows that the stream programs produced by our optimization effectively enhance locality in the LRF and SRF, improve the capacity utilization of the LRF and SRF, make the best use of SPs and SBs, and avoid index stream overhead.
This work was supported by the National High Technology Development 863 Program of China under Grant No. 2004AA1Z2210.
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yang, X., Du, J., Yan, X., Deng, Y. (2006). Matrix-Based Programming Optimization for Improving Memory Hierarchy Performance on Imagine. In: Guo, M., Yang, L.T., Di Martino, B., Zima, H.P., Dongarra, J., Tang, F. (eds) Parallel and Distributed Processing and Applications. ISPA 2006. Lecture Notes in Computer Science, vol 4330. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11946441_71
DOI: https://doi.org/10.1007/11946441_71
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68067-3
Online ISBN: 978-3-540-68070-3
eBook Packages: Computer Science (R0)