Matrix-Based Programming Optimization for Improving Memory Hierarchy Performance on Imagine

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4330)

Abstract

Although Imagine provides an efficient memory hierarchy, straightforward programming of scientific applications does not match that hierarchy and thereby constrains the performance of stream applications. In this paper, we explore a novel matrix-based programming optimization that improves memory hierarchy performance so that it can sustain the operands needed for highly parallel computation. Our specific contributions are twofold: we formulate the problem on the Data&Computation Matrix (D&C Matrix), which we propose to abstract the relationship between streams and kernels, and we present the key techniques for improving multilevel bandwidth utilization based on this matrix. An experimental evaluation on five representative scientific applications shows that the stream programs yielded by our optimization effectively enhance locality in the LRF and SRF, improve the capacity utilization of the LRF and SRF, make the best use of SPs and SBs, and avoid index stream overhead.

This work was supported by the National High Technology Development 863 Program of China under Grant No. 2004AA1Z2210.

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Yang, X., Du, J., Yan, X., Deng, Y. (2006). Matrix-Based Programming Optimization for Improving Memory Hierarchy Performance on Imagine. In: Guo, M., Yang, L.T., Di Martino, B., Zima, H.P., Dongarra, J., Tang, F. (eds) Parallel and Distributed Processing and Applications. ISPA 2006. Lecture Notes in Computer Science, vol 4330. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11946441_71

  • DOI: https://doi.org/10.1007/11946441_71

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-68067-3

  • Online ISBN: 978-3-540-68070-3

  • eBook Packages: Computer Science, Computer Science (R0)
