Abstract
Graph scheduling has been shown to be effective for solving irregular problems represented as directed acyclic graphs (DAGs) on distributed-memory systems. Many scientific applications can also be modeled as iterative task graphs (ITGs). In this paper, we model the successive over-relaxation (SOR) computation for solving sparse matrix systems in terms of ITGs and address the optimization issues that arise in scheduling ITGs when communication overhead is nonzero. We present an approach that combines software pipelining with graph scheduling. We demonstrate the effectiveness of this approach in mapping the SOR computation and compare it with the multi-coloring method.
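To make the ITG view of SOR concrete, the following is a minimal sketch, not the paper's algorithm. It assumes one task per matrix row, a dict-of-dicts sparse storage, and records only flow dependences (a distributed-memory code that copies values on send would not need the anti-dependences of an in-place update). The names `sor_sweep` and `itg_edges` are hypothetical and chosen for illustration.

```python
def sor_sweep(A, b, x, omega):
    """One in-place SOR sweep; A is sparse, stored as {row: {col: value}}."""
    for i in sorted(A):
        row = A[i]
        # Off-diagonal contribution: entries with j < i were already updated
        # in this sweep, entries with j > i still hold the previous sweep.
        sigma = sum(v * x[j] for j, v in row.items() if j != i)
        x[i] = (1.0 - omega) * x[i] + omega * (b[i] - sigma) / row[i]
    return x


def itg_edges(A):
    """Flow-dependence edges (src, dst, distance) between row-update tasks.

    distance 0: dst reads the value src produced in the same sweep (j < i).
    distance 1: dst reads the value src produced in the previous sweep (j > i).
    """
    edges = []
    for i in sorted(A):
        for j in sorted(A[i]):
            if j != i:
                edges.append((j, i, 0 if j < i else 1))
    return edges


# Small tridiagonal system whose exact solution is x = [1, 1, 1].
A = {0: {0: 4.0, 1: -1.0},
     1: {0: -1.0, 1: 4.0, 2: -1.0},
     2: {1: -1.0, 2: 4.0}}
b = [3.0, 2.0, 3.0]
x = [0.0, 0.0, 0.0]
for _ in range(50):
    sor_sweep(A, b, x, omega=1.2)
edges = itg_edges(A)
```

The distance-0 edges form the DAG within a single sweep, while the distance-1 edges are the loop-carried dependences that make the graph iterative. Overlapping consecutive sweeps along those distance-1 edges is the kind of opportunity that software pipelining targets.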
Copyright information
© 1995 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Fu, C., Yang, T., Gerasoulis, A. (1995). Integrating software pipelining and graph scheduling for iterative scientific computations. In: Ferreira, A., Rolim, J. (eds) Parallel Algorithms for Irregularly Structured Problems. IRREGULAR 1995. Lecture Notes in Computer Science, vol 980. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-60321-2_11
DOI: https://doi.org/10.1007/3-540-60321-2_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-60321-4
Online ISBN: 978-3-540-44915-7
eBook Packages: Springer Book Archive