Abstract
The performance of conjugate gradient (CG) algorithms for solving the system of linear equations that results from finite-differencing the neutron diffusion equation was analyzed on SIMD, MIMD, and mixed-mode parallel machines. A block preconditioner based on the incomplete Cholesky factorization was used to accelerate the conjugate gradient search. The issues involved in mapping both the unpreconditioned and preconditioned conjugate gradient algorithms onto the mixed-mode PASM prototype, the SIMD MasPar MP-1, and the MIMD Intel Paragon XP/S are discussed. On PASM, the mixed-mode implementation outperformed either SIMD or MIMD alone. Theoretical performance predictions were analyzed and compared with the experimental results on the MasPar MP-1 and the Paragon XP/S. Other issues addressed include the impact of the number of processors on execution time, the effect of the interprocessor communication network on performance, and the relationship between the number of processors and the quality of the preconditioning. Application studies such as this are necessary in the development of software tools for mapping algorithms onto either a single parallel machine or a heterogeneous suite of parallel machines.
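For readers unfamiliar with the method, the following is a minimal serial sketch of a preconditioned conjugate gradient solve for a finite-differenced diffusion problem. It is an illustration only and not the implementation studied in the article: a block-Jacobi preconditioner with exact tridiagonal solves along each grid row stands in for the block incomplete Cholesky preconditioner, the problem is a one-group, 2D, 5-point discretization of -D∇²φ + Σa·φ = S with the flux taken as zero outside the grid, and all function names and parameter values (`apply_diffusion`, `apply_block_preconditioner`, `pcg`, `n`, `d`, `sigma_a`, `h`) are assumptions made for this example.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def apply_diffusion(phi, n, d, sigma_a, h):
    """Return A*phi for the 5-point stencil of -d*laplacian + sigma_a on an n-by-n grid.
    Neighbours outside the grid are treated as zero (assumed boundary condition)."""
    c = d / (h * h)
    out = [0.0] * (n * n)
    for i in range(n):
        for j in range(n):
            k = i * n + j
            v = (4.0 * c + sigma_a) * phi[k]
            if i > 0:
                v -= c * phi[k - n]
            if i < n - 1:
                v -= c * phi[k + n]
            if j > 0:
                v -= c * phi[k - 1]
            if j < n - 1:
                v -= c * phi[k + 1]
            out[k] = v
    return out

def solve_line(diag, off, rhs):
    """Thomas algorithm for a tridiagonal system with constant diagonal and off-diagonal."""
    m = len(rhs)
    cp = [0.0] * m
    dp = [0.0] * m
    cp[0] = off / diag
    dp[0] = rhs[0] / diag
    for t in range(1, m):
        denom = diag - off * cp[t - 1]
        cp[t] = off / denom
        dp[t] = (rhs[t] - off * dp[t - 1]) / denom
    x = [0.0] * m
    x[m - 1] = dp[m - 1]
    for t in range(m - 2, -1, -1):
        x[t] = dp[t] - cp[t] * x[t + 1]
    return x

def apply_block_preconditioner(r, n, d, sigma_a, h):
    """Block-Jacobi stand-in: solve the tridiagonal diagonal block of A for each grid row.
    The blocks are independent, so each could be handled by a different processor."""
    c = d / (h * h)
    z = []
    for i in range(n):
        z.extend(solve_line(4.0 * c + sigma_a, -c, r[i * n:(i + 1) * n]))
    return z

def pcg(apply_a, apply_minv, b, tol=1e-10, max_iter=500):
    """Preconditioned CG; returns (solution, iterations used)."""
    x = [0.0] * len(b)
    b_norm = math.sqrt(dot(b, b))
    if b_norm == 0.0:
        return x, 0
    r = list(b)
    z = apply_minv(r)
    p = list(z)
    rz = dot(r, z)
    for it in range(1, max_iter + 1):
        ap = apply_a(p)
        alpha = rz / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        if math.sqrt(dot(r, r)) <= tol * b_norm:
            return x, it
        z = apply_minv(r)
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter

if __name__ == "__main__":
    n, d, sigma_a, h = 8, 1.0, 0.1, 1.0  # illustrative values only
    source = [1.0] * (n * n)
    a = lambda v: apply_diffusion(v, n, d, sigma_a, h)
    m = lambda v: apply_block_preconditioner(v, n, d, sigma_a, h)
    _, plain_iters = pcg(a, lambda v: list(v), source)
    _, block_iters = pcg(a, m, source)
    print("unpreconditioned iterations:", plain_iters)
    print("block-preconditioned iterations:", block_iters)
```

Passing the identity as `apply_minv` gives the unpreconditioned algorithm, so the same routine covers both variants named in the abstract. In a parallel mapping, the inner products in `pcg` need a global reduction across processors, while the stencil application needs only nearest-neighbour exchange; with this block-Jacobi stand-in, the preconditioner solves need no communication at all.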
Cite this article
So, J.J.E., Downar, T.J., Janardhan, R. et al. Mapping Conjugate Gradient Algorithms for Neutron Diffusion Applications onto SIMD, MIMD, and Mixed-Mode Machines. International Journal of Parallel Programming 26, 183–207 (1998). https://doi.org/10.1023/A:1018796903553
DOI: https://doi.org/10.1023/A:1018796903553