Abstract
Task granularity and functional parallelism are fundamental issues in the optimization of parallel programs. Appropriate granularity for exploitation of parallelism is affected by characteristics of both the program and the execution environment. In this paper we demonstrate the efficacy of dynamic granularity control. The scheme we propose uses dynamic runtime information to select the task size of exploited parallelism at various stages of the execution of a program. We also demonstrate that functional parallelism can be an important factor in improving the performance of parallel programs, both in the presence and absence of loop-level parallelism. Functional parallelism can increase the amount of large-grain parallelism as well as provide finer-grain parallelism that leads to better load balance. Analytical models and benchmark results quantify the impact of granularity control and functional parallelism. The underlying implementation for this research is a low-overhead threads model based on user-level scheduling.
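The core idea in the abstract, choosing the task size of exploited parallelism from runtime information rather than fixing it at compile time, can be illustrated with a small sketch. This is not the authors' implementation (which is built on a low-overhead user-level threads model); it is a minimal assumed example in which the chunk size per task is computed at runtime from the problem size and worker count, with a minimum grain to keep per-task overhead amortized.

```python
import os
from concurrent.futures import ThreadPoolExecutor

def dynamic_chunks(n, workers, min_grain=1024):
    # Pick a task size at runtime: create enough chunks to keep all
    # workers busy, but never smaller than min_grain elements per task,
    # so scheduling overhead stays amortized.
    grain = max(min_grain, n // (workers * 4))
    return [(i, min(i + grain, n)) for i in range(0, n, grain)]

def parallel_sum(data):
    workers = os.cpu_count() or 4
    spans = dynamic_chunks(len(data), workers)
    if len(spans) == 1:
        # Problem too small for the chosen grain: run serially and
        # skip the thread pool entirely.
        return sum(data)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(lambda s: sum(data[s[0]:s[1]]), spans)
    return sum(parts)
```

The cutoff logic (fall back to serial execution when the work is too small to justify any tasks) is the simplest form of the granularity decision the paper studies at multiple stages of execution.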
This work was supported by the Office of Naval Research under grant N00014-94-1-0234. Computational facilities were provided by the National Center for Supercomputing Applications. José Moreira was at the University of Illinois during the development of this research.
© 1996 Springer-Verlag Berlin Heidelberg
Moreira, J.E., Schouten, D., Polychronopoulos, C. (1996). The performance impact of granularity control and functional parallelism. In: Huang, CH., Sadayappan, P., Banerjee, U., Gelernter, D., Nicolau, A., Padua, D. (eds) Languages and Compilers for Parallel Computing. LCPC 1995. Lecture Notes in Computer Science, vol 1033. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0014225
DOI: https://doi.org/10.1007/BFb0014225
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-60765-6
Online ISBN: 978-3-540-49446-1