A methodology for generating efficient disk-based algorithms from tensor product formulas

  • Conference paper
Languages and Compilers for Parallel Computing (LCPC 1993)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 768)

Abstract

In this paper, we address the issue of automatic generation of disk-based algorithms from tensor product formulas. Disk-based algorithms are required in scientific applications which work with large data sets that do not fit entirely into main memory. Tensor products have been used for designing and implementing block recursive algorithms on shared-memory, vector and distributed-memory multiprocessors. We extend this theory to generate disk-based code from tensor product formulas. The methodology is based on generating algebraically equivalent tensor product formulas which have better disk performance. We demonstrate this methodology by generating disk-based code for the fast Fourier transform.
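
As a rough illustration of what "algebraically equivalent tensor product formulas" can mean for data access, the following sketch (not taken from the paper) applies A ⊗ I_m to a vector in two ways: directly, which reads the data at stride m, and through the standard identity A ⊗ I_m = P⁻¹ (I_m ⊗ A) P, where P is a stride permutation, so that A is applied to contiguous blocks instead. The function names, the in-memory Python lists standing in for disk-resident data, and the choice of identity are all assumptions made for illustration; the paper's actual code generation scheme is not reproduced here.

```python
def matvec(A, v):
    """Dense matrix-vector product for a small square matrix A (list of rows)."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]


def stride_perm(x, s):
    """Gather x at stride s: x[0::s] followed by x[1::s], and so on."""
    out = []
    for j in range(s):
        out.extend(x[j::s])
    return out


def apply_strided(A, x, m):
    """Compute (A tensor I_m) x directly.

    Each application of A touches n elements that lie m apart,
    which is the access pattern that tends to be costly on disk.
    """
    y = [0] * len(x)
    for j in range(m):
        y[j::m] = matvec(A, x[j::m])
    return y


def apply_blocked(A, x, m):
    """Compute (I_m tensor A) x: A is applied to m contiguous blocks of length n."""
    n = len(A)
    y = []
    for b in range(m):
        y.extend(matvec(A, x[b * n:(b + 1) * n]))
    return y


def apply_via_permutation(A, x, m):
    """Compute (A tensor I_m) x as P^{-1} (I_m tensor A) P x.

    P gathers x at stride m; its inverse gathers at stride n = len(A).
    The arithmetic is identical to apply_strided, but every application
    of A now works on a contiguous block.
    """
    n = len(A)
    return stride_perm(apply_blocked(A, stride_perm(x, m), m), n)
```

Both routes give the same result for any square A and any vector of length n·m; they differ only in where the non-contiguous data movement happens. In the strided form it is interleaved with the computation, while in the permuted form it is isolated in two explicit permutation passes, which is the kind of trade-off a formula-rewriting approach can reason about.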

This work was supported in part by DARPA, order number 7898, monitored by NIST under grant number 60NANB1D1151, and by DARPA, order number 7899, monitored by NIST under grant number 60NANB1D1150.

Editor information

Utpal Banerjee, David Gelernter, Alex Nicolau, David Padua

Copyright information

© 1994 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kaushik, S.D., Huang, C.H., Johnson, R.W., Sadayappan, P. (1994). A methodology for generating efficient disk-based algorithms from tensor product formulas. In: Banerjee, U., Gelernter, D., Nicolau, A., Padua, D. (eds) Languages and Compilers for Parallel Computing. LCPC 1993. Lecture Notes in Computer Science, vol 768. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-57659-2_21

  • DOI: https://doi.org/10.1007/3-540-57659-2_21

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-57659-4

  • Online ISBN: 978-3-540-48308-3

  • eBook Packages: Springer Book Archive
