Abstract
To execute a finite element application program efficiently on a distributed memory multicomputer, we need to distribute the nodes of a finite element graph to the processors as evenly as possible while minimizing the communication cost among processors. This partitioning problem is known to be NP-complete, so many heuristics have been proposed to find satisfactory sub-optimal solutions, and many graph partitioners have been developed based on these heuristics. Among them, Jostle, Metis, and Party are considered the best graph partitioners available to date. To minimize the total cut-edges, these three partitioners generally allow a 3% to 5% load imbalance among processors. This is a tradeoff between the communication cost and the computation cost of the partitioning problem. In this paper, we propose an optimization method, the dynamic diffusion method (DDM), that removes the 3% to 5% load imbalance allowed by these three graph partitioners while minimizing the total cut-edges among the partitioned modules. To evaluate the proposed method, we compare the performance of the dynamic diffusion method with that of the directed diffusion method and the multilevel diffusion method on an IBM SP2 parallel machine. Three 2D and two 3D irregular finite element graphs are used as test samples, and each test sample is tested with 3% and 5% load imbalance. The experimental results lead to the following conclusions. (1) The dynamic diffusion method improves the partition results of these three partitioners in terms of the total cut-edges and the execution time of a Laplace solver in most test cases, while the directed diffusion method and the multilevel diffusion method may fail to do so in many cases. (2) The optimization results of the dynamic diffusion method are better than those of the directed diffusion method and the multilevel diffusion method in terms of the total cut-edges and the execution time of a Laplace solver for most test cases. (3) The dynamic diffusion method balances the load of processors for all test cases.
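As an illustration of the two quantities the abstract trades off, the following minimal Python sketch computes the total cut-edges and the load imbalance of a given partition. It is not the dynamic diffusion method itself. The function names, the small example graph, and the definition of imbalance as the maximum processor load over the average load, minus one, are assumptions made for this example; the paper may define imbalance differently.

```python
from collections import Counter


def cut_edges(edges, part):
    """Count edges whose two endpoints are assigned to different processors."""
    return sum(1 for u, v in edges if part[u] != part[v])


def load_imbalance(part, num_procs):
    """Return max_load / average_load - 1 (assumed definition; 0.05 means 5%)."""
    loads = Counter(part.values())
    average = len(part) / num_procs
    return max(loads.values()) / average - 1.0


# Hypothetical 2 x 3 grid graph with nodes 0..5 (not one of the paper's test samples).
edges = [(0, 1), (1, 2), (3, 4), (4, 5), (0, 3), (1, 4), (2, 5)]

# An unbalanced assignment: four nodes on processor 0, two on processor 1.
unbalanced = {0: 0, 1: 0, 2: 0, 3: 0, 4: 1, 5: 1}

# A balanced assignment: three nodes on each processor.
balanced = {0: 0, 1: 0, 3: 0, 2: 1, 4: 1, 5: 1}

for name, part in [("unbalanced", unbalanced), ("balanced", balanced)]:
    print(name, cut_edges(edges, part), round(load_imbalance(part, 2), 3))
```

In this toy case both assignments cut the same number of edges, so balancing the load costs no extra communication; on real irregular graphs the two goals usually conflict, which is the tradeoff the abstract describes.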
Cite this article
Yang, DL., Chung, YC., Chen, CC. et al. A Dynamic Diffusion Optimization Method for Irregular Finite Element Graph Partitioning. The Journal of Supercomputing 17, 91–110 (2000). https://doi.org/10.1023/A:1008123922971