Abstract
I/O libraries such as PANDA and DRA use blocked layouts for efficient access to disk-resident multi-dimensional arrays, with the shape of the blocks chosen to match the expected access pattern of the array. Sometimes different applications, or different phases of the same application, have very different access patterns for an array. In such situations, the array's blocked layout must be transformed for efficient access. In this paper, we describe a new approach to the layout transformation problem and demonstrate its effectiveness in the context of the Disk Resident Arrays (DRA) library. The approach handles both re-blocking and permutation of dimensions. Experimental results demonstrate its performance benefit over currently available mechanisms.
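To make the two operations named in the abstract concrete, the following is a minimal in-memory sketch of what re-blocking and dimension permutation mean for a blocked layout. It is not the algorithm described in the paper and it does not use the DRA API: the function names `block_offset` and `transform_layout` are hypothetical, the array is held in a flat Python list rather than on disk, and every array extent is assumed to be an exact multiple of the block extent.

```python
from itertools import product


def block_offset(index, shape, block):
    """Map an n-D element index to its offset in a blocked layout.

    Assumed layout (for illustration only): blocks are stored one after
    another in row-major block order, and the elements inside each block
    are row-major as well. Every extent in `shape` must be a multiple of
    the matching extent in `block`.
    """
    grid = [s // b for s, b in zip(shape, block)]  # blocks per dimension
    block_id = 0
    inner = 0
    for i, g, b in zip(index, grid, block):
        block_id = block_id * g + i // b  # which block holds the element
        inner = inner * b + i % b         # position inside that block
    block_size = 1
    for b in block:
        block_size *= b
    return block_id * block_size + inner


def transform_layout(data, shape, src_block, perm, dst_block):
    """Naive element-by-element re-blocking plus dimension permutation.

    Destination dimension k corresponds to source dimension perm[k].
    An out-of-core method would move whole blocks between disk and
    memory instead of touching single elements as this sketch does.
    """
    dst_shape = [shape[p] for p in perm]
    out = [None] * len(data)
    for idx in product(*(range(s) for s in shape)):
        dst_idx = [idx[p] for p in perm]
        src_pos = block_offset(idx, shape, src_block)
        dst_pos = block_offset(dst_idx, dst_shape, dst_block)
        out[dst_pos] = data[src_pos]
    return out
```

For example, calling `transform_layout(data, (4, 4), (2, 2), (1, 0), (1, 4))` on a 4x4 array stored in 2x2 blocks returns the transposed array stored in 1x4 blocks, which for this shape is plain row-major order. The per-element loop is chosen for clarity only; it says nothing about the I/O cost that the paper's approach is designed to reduce.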
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Krishnamoorthy, S., Baumgartner, G., Lam, CC., Nieplocha, J., Sadayappan, P. (2004). Efficient Layout Transformation for Disk-Based Multidimensional Arrays. In: Bougé, L., Prasanna, V.K. (eds) High Performance Computing - HiPC 2004. HiPC 2004. Lecture Notes in Computer Science, vol 3296. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30474-6_42
DOI: https://doi.org/10.1007/978-3-540-30474-6_42
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-24129-4
Online ISBN: 978-3-540-30474-6
eBook Packages: Computer Science (R0)