Abstract
Effective data distribution and parallelization of computations involving irregular data structures are challenging tasks. We address these twin problems in the context of computations involving block-sparse matrices. The programming model provides a global view of a distributed block-sparse matrix. Abstractions are provided for the user to express the parallel tasks in the computation. The tasks are mapped onto processors to ensure load balance and locality. The abstractions are based on the Aggregate Remote Memory Copy Interface (ARMCI), and are interoperable with the Global Arrays programming suite and MPI. Results are presented that demonstrate the utility of the approach.
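To make the idea of mapping tasks onto processors for load balance and locality concrete, the following is a minimal sketch in Python. It is not the method of the paper: the function name `map_tasks`, the task tuple layout, the `block_owner` dictionary, and the `remote_penalty` weight are all hypothetical, and a simple greedy heuristic stands in for whatever partitioning scheme the authors actually use.

```python
def map_tasks(tasks, block_owner, num_procs, remote_penalty=1.0):
    """Greedily assign tasks to processors (illustrative assumption only).

    tasks:       list of (task_id, cost, blocks), where blocks is a list of
                 (row, col) coordinates of the matrix blocks the task touches
    block_owner: dict mapping a block coordinate to the processor holding it
    Returns (assignment, load): task_id -> processor, and per-processor cost.
    """
    load = [0.0] * num_procs
    assignment = {}
    # Place the most expensive tasks first so they can be balanced early.
    for task_id, cost, blocks in sorted(tasks, key=lambda t: -t[1]):
        def score(p):
            # Count blocks that would have to be fetched from another processor.
            remote = sum(1 for b in blocks if block_owner.get(b) != p)
            # Lower is better: current load plus a penalty per remote block.
            return load[p] + remote_penalty * remote
        best = min(range(num_procs), key=score)
        assignment[task_id] = best
        load[best] += cost
    return assignment, load
```

Raising `remote_penalty` in this sketch favors locality over balance, while setting it to zero reduces the heuristic to plain least-loaded placement.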
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Krishnamoorthy, S., Nieplocha, J., Sadayappan, P. (2005). Data and Computation Abstractions for Dynamic and Irregular Computations. In: Bader, D.A., Parashar, M., Sridhar, V., Prasanna, V.K. (eds) High Performance Computing – HiPC 2005. HiPC 2005. Lecture Notes in Computer Science, vol 3769. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11602569_29
DOI: https://doi.org/10.1007/11602569_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-30936-9
Online ISBN: 978-3-540-32427-0
eBook Packages: Computer Science, Computer Science (R0)