Abstract
Unstructured mesh applications present performance complexities and subtleties that do not arise for dense structured meshes. From a programming point of view, unstructured meshes require more complex data structures and more involved handling of the interactions between mesh cells. From a performance point of view, it is harder to understand both the processing time on a single processor and the scaling characteristics on large-scale parallel systems. In this work we present a general performance model for deterministic S_N transport calculations on unstructured meshes that is also applicable to structured meshes. The model captures the key processing characteristics of the calculation and is parametric, taking as input both system performance data (latency, bandwidth, processing rate, etc.) and application data (mesh size, etc.). A single formulation of the model is used to predict the performance of two quite different implementations of the same calculation. The model is validated on two clusters (an HP AlphaServer and an Itanium-2 system), showing high prediction accuracy.
Cite this article
Mathis, M.M., Kerbyson, D.J. A General Performance Model of Structured and Unstructured Mesh Particle Transport Computations. J Supercomput 34, 181–199 (2005). https://doi.org/10.1007/s11227-005-2339-8
DOI: https://doi.org/10.1007/s11227-005-2339-8