Abstract
A data distribution scheme for sparse arrays on a distributed memory multicomputer is, in general, composed of three phases: data partition, data distribution, and data compression. To implement a data distribution scheme, many methods proposed in the literature first perform the data partition phase, then the data distribution phase, followed by the data compression phase. We call a data distribution scheme with this order a Send Followed Compress (SFC) scheme. In this paper, we propose two other data distribution schemes, Compress Followed Send (CFS) and Encoding-Decoding (ED), for sparse array distribution. In the CFS scheme, the data compression phase is performed before the data distribution phase. In the ED scheme, the data compression phase is divided into two steps, encoding and decoding, performed before and after the data distribution phase, respectively. To evaluate the CFS and ED schemes, we compare them with the SFC scheme. In the data partition phase, the row partition, the column partition, and the 2D mesh partition methods, with and without load balancing, are used for all three schemes. In the data compression phase, the CRS/CCS methods are used to compress sparse local arrays for the SFC and CFS schemes, while the encoding/decoding steps are used for the ED scheme. Both theoretical analysis and experimental tests were conducted. In the theoretical analysis, we analyze the SFC, CFS, and ED schemes in terms of the data distribution time and the data compression time. In the experimental tests, we implemented the three schemes on an IBM SP2 parallel machine. The experimental results show that, for most test cases, the CFS and ED schemes outperform the SFC scheme, and that the ED scheme outperforms the CFS scheme in all test cases.
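The CRS (Compressed Row Storage) method mentioned above, used by the SFC and CFS schemes to compress sparse local arrays, can be sketched as follows. This is a minimal illustrative implementation, not the paper's code; the function names and the plain-list layout are assumptions. CCS is the same idea applied column-wise.

```python
def crs_compress(array):
    """Compress a dense 2D list into CRS form: (values, col_indices, row_ptr).

    values      -- the nonzero elements, in row-major order
    col_indices -- the column index of each nonzero element
    row_ptr     -- row_ptr[i]..row_ptr[i+1] delimits the nonzeros of row i
    """
    values, col_indices, row_ptr = [], [], [0]
    for row in array:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_indices.append(j)
        row_ptr.append(len(values))
    return values, col_indices, row_ptr


def crs_decompress(values, col_indices, row_ptr, num_cols):
    """Rebuild the dense 2D list from its CRS representation."""
    array = []
    for i in range(len(row_ptr) - 1):
        row = [0] * num_cols
        for k in range(row_ptr[i], row_ptr[i + 1]):
            row[col_indices[k]] = values[k]
        array.append(row)
    return array
```

In SFC terms, each processor would apply `crs_compress` to its local array after receiving it; in CFS, the source processor compresses first and sends only the three CRS arrays, which is why CFS can reduce the data distribution time when the array is sparse.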
Lin, CY., Chung, YC. Data distribution schemes of sparse arrays on distributed memory multicomputers. J Supercomput 41, 63–87 (2007). https://doi.org/10.1007/s11227-007-0104-x