Abstract
Irregular and dynamic memory reference patterns can cause significant performance variations for low-level algorithms in general, and for parallel algorithms in particular. We have previously shown that parallel reduction algorithms are quite input sensitive and can therefore benefit from an adaptive, reference-pattern-directed selection. In this paper we extend our previous work by detailing a systematic approach for dynamically selecting the best parallel algorithm. First, we model the characteristics of the input, i.e., its memory reference pattern, with a descriptor vector. Then, we measure the performance of several reduction algorithms for various values of the pattern descriptor. Finally, we establish a many-to-one mapping between a finite set of descriptor values and the set of algorithms, thus obtaining a performance ranking of the available algorithms with respect to a limited set of descriptor values. The actual dynamic-selection code is generated using statistical regression methods or a decision tree. We conclude with experimental results that validate our modeling and prediction techniques.
This research was supported in part by NSF CAREER Awards CCR-9624315 and CCR-9734471, NSF Grants ACI-9872126, EIA-9975018, EIA-0103742, and by the DOE ASCI ASAP program grant B347886.
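The selection scheme the abstract describes — summarize the input's memory reference pattern in a descriptor vector, then use a decision tree to map descriptor values to the reduction algorithm predicted to perform best — can be sketched as follows. This is a minimal illustration only: the descriptor fields (`connectivity`, `clustering`), the threshold values, and the algorithm names are hypothetical stand-ins, not the descriptors or decision rules from the paper.

```python
from dataclasses import dataclass


@dataclass
class PatternDescriptor:
    """Hypothetical descriptor vector characterizing a reduction's
    memory reference pattern (fields are illustrative)."""
    connectivity: float  # fraction of distinct reduction elements referenced
    clustering: float    # degree of spatial locality among references


def select_reduction_algorithm(d: PatternDescriptor) -> str:
    """Decision-tree-style many-to-one mapping from descriptor values
    to a parallel reduction algorithm (thresholds are illustrative)."""
    if d.connectivity < 0.1:
        # Few distinct elements touched: fully privatized copies are cheap.
        return "replicated_buffer"
    if d.clustering > 0.8:
        # Highly clustered references favor partitioning by output location.
        return "local_write"
    # Default: privatize only the contended elements.
    return "selective_privatization"
```

At run time such a selector would be invoked after an inspector phase computes the descriptor for the actual input, so the chosen algorithm can change from run to run (or phase to phase) as the reference pattern changes.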
References
Charmm: A program for macromolecular energy, minimization, and dynamics calculations. J. of Computational Chemistry 4(6) (1983)
Blume, W., et al.: Advanced Program Restructuring for High-Performance Computers with Polaris. IEEE Computer 29(12), 78–82 (1996)
Eigenmann, R., Hoeflinger, J., Li, Z., Padua, D.: Experience in the Automatic Parallelization of Four Perfect-Benchmark Programs. In: Banerjee, U., Nicolau, A., Gelernter, D., Padua, D.A. (eds.) LCPC 1991. LNCS, vol. 589, pp. 65–83. Springer, Heidelberg (1992)
Han, H., Tseng, C.-W.: Improving compiler and run-time support for adaptive irregular codes. In: Int. Conf. on Parallel Architectures and Compilation Techniques (October 1998)
Han, H., Tseng, C.-W.: A comparison of locality transformations for irregular codes. In: Dwarkadas, S. (ed.) LCR 2000. LNCS, vol. 1915, pp. 70–84. Springer, Heidelberg (2000)
Jain, R.: The Art of Computer Systems Performance Analysis. John Wiley & Sons, Inc., Chichester (1991)
Kruskal, C.: Efficient parallel algorithms for graph problems. In: Proc. of the 1986 Int. Conf. on Parallel Processing, August 1986, pp. 869–876 (1986)
Leighton, F.T.: Introduction to Parallel Algorithms and Architectures: Arrays, Trees, Hypercubes. Morgan Kaufmann, San Francisco (1992)
Lin, Y., Padua, D.: On the automatic parallelization of sprase and irregular fortran programs. In: Proc. of the Workshop on Languages, Compilers and Run-time Systems for Scalable Computers, Pittsburgh, PA, May 1998, pp. 41–56 (1998)
Frisch, M.J., et al.: Gaussian 94, Revision B.1. Gaussian, Inc., Pittsburgh (1995)
Mitchell, T.: Machine Learning. MIT Press/The McGraw-Hill Companies, Inc. (1997)
Nagel, L.: SPICE2: A Computer Program to Simulate Semiconductor Circuits. PhD thesis, Univ. of California (May 1975)
Pottenger, W.M.: Theory, Techniques, and Experiments in Solving Recurrences in Computer Programs. PhD thesis, CSRD, Univ. of Illinois at Urbana-Champaign (May 1997)
Quinlan, R.: C4.5 Release 8, http://www.cse.unsw.edu.au/quinlan/
Whirley, R.G., Engelmann, B.: DYNA3D: A Nonlinear, Explicit. In: Three-Dimensional Finite Element Code For Solid and Structural Mechanics, November 1993, Lawrence Livermore National Lab. (1993)
Wu, J., Saltz, J., Hiranandani, S., Berryman, H.: Runtime compilation methods for multicomputers. In: Schwetman, H.D. (ed.) Proc. of the 1991 Int. Conf. on Parallel Processing, vol. II - Software, pp. 26–30. CRC Press, Inc., Boca Raton (1991)
Yu, H., Rauchwerger, L.: Adaptive reduction parallelization. In: Proc. of the 14th ACM Int.Conf. on Supercomputing, Santa Fe, NM (May 2000)
Yu, H., Rauchwerger, L.: Run-time parallelization overhead reduction techniques. In: Proc. of the 9th Int. Conf. on Compiler Construction, CC 2000, Berlin, Germany. LNCS, vol. 1781. Springer, Heidelberg (2000)
Zima, H.: Supercompilers for Parallel and Vector Computers. ACM Press, New York (1991)
© 2005 Springer-Verlag Berlin Heidelberg
Cite this paper
Yu, H., Dang, F., Rauchwerger, L. (2005). Parallel Reductions: An Application of Adaptive Algorithm Selection. In: Pugh, B., Tseng, CW. (eds) Languages and Compilers for Parallel Computing. LCPC 2002. Lecture Notes in Computer Science, vol 2481. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11596110_13
Print ISBN: 978-3-540-30781-5
Online ISBN: 978-3-540-31612-1