Abstract
To support high-level coordination, parallel functional languages need effective and automatic work distribution mechanisms. Many implementations distribute potential work, i.e. sparks or closures, but there is good evidence that the performance of certain classes of program can be improved if current work, or threads, are also distributed. Migrating a thread incurs significant execution cost and requires careful scheduling and an elaborate implementation.
This paper describes the design, implementation and performance of thread migration in the GUM runtime system underlying Glasgow parallel Haskell (GpH). Measurements of nontrivial programs on a high-latency cluster architecture show that thread migration can improve the performance of data-parallel and divide-and-conquer programs with low processor utilisation. Thread migration also reduces the variation in performance results obtained in separate executions of a program. Moreover, migration does not incur significant overheads if there are no migratable threads, or on a single processor. However, for programs that already exhibit good processor utilisation, migration may increase performance variability and very occasionally reduce performance.
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Du Bois, A.R., Loidl, HW., Trinder, P. (2003). Thread Migration in a Parallel Graph Reducer. In: Peña, R., Arts, T. (eds) Implementation of Functional Languages. IFL 2002. Lecture Notes in Computer Science, vol 2670. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44854-3_13
DOI: https://doi.org/10.1007/3-540-44854-3_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40190-2
Online ISBN: 978-3-540-44854-9