Thread Migration in a Parallel Graph Reducer

Du Bois, André Rauber; Loidl, Hans-Wolfgang; Trinder, Phil

doi:10.1007/3-540-44854-3_13

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2670)

Included in the following conference series:

Symposium on Implementation and Application of Functional Languages

Abstract

To support high-level coordination, parallel functional languages need effective and automatic work distribution mechanisms. Many implementations distribute potential work, i.e. sparks or closures, but there is good evidence that the performance of certain classes of program can be improved if current work, i.e. threads, is also distributed. Migrating a thread incurs significant execution cost and requires careful scheduling and an elaborate implementation.

This paper describes the design, implementation and performance of thread migration in the GUM runtime system underlying Glasgow parallel Haskell (GpH). Measurements of nontrivial programs on a high-latency cluster architecture show that thread migration can improve the performance of data-parallel and divide-and-conquer programs with low processor utilisation. Thread migration also reduces the variation in performance results obtained in separate executions of a program. Moreover, migration does not incur significant overheads if there are no migratable threads, or on a single processor. However, for programs that already exhibit good processor utilisation, migration may increase performance variability and very occasionally reduce performance.
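The distinction the abstract draws between potential work (sparks) and current work (threads) can be illustrated with a toy model. The sketch below is not the GUM implementation: the `PE` class, the `find_work` function, and the policy of preferring a spark and keeping at least one thread on the busy processor are all assumptions made purely for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class PE:
    """Toy processing element (hypothetical, not a GUM data structure)."""
    name: str
    sparks: deque = field(default_factory=deque)   # potential work, not yet started
    threads: deque = field(default_factory=deque)  # current work, already running


def find_work(idle, busy, allow_migration=True):
    """Move one unit of work from `busy` to `idle` and report what was moved.

    Assumed policy: take a spark if one exists, since it carries no
    execution state; otherwise migrate a thread, but only if the busy
    processor would still have a runnable thread left afterwards.
    """
    if busy.sparks:
        # A spark becomes a new thread on the idle processor.
        idle.threads.append(busy.sparks.popleft())
        return "spark"
    if allow_migration and len(busy.threads) > 1:
        # Migrating a thread moves work that has already started.
        idle.threads.append(busy.threads.pop())
        return "thread"
    return "none"


if __name__ == "__main__":
    busy = PE("A", sparks=deque(["s1"]), threads=deque(["t1", "t2"]))
    idle = PE("B")
    print([find_work(idle, busy) for _ in range(3)])
```

With migration disabled, the idle processor in this model stays idle once the spark pool is empty, even though the busy processor still holds several threads. That is the low-utilisation situation in which the abstract reports a benefit from migration.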

Author information

Authors and Affiliations

School of Mathematical and Computer Sciences, Heriot-Watt University, Riccarton, Edinburgh, EH14 4AS, UK
André Rauber Du Bois & Phil Trinder
Institut für Informatik, Ludwig-Maximilians-Universität München, D-80538, München, Germany
Hans-Wolfgang Loidl

Editor information

Editors and Affiliations

Facultad de Informática Departamento Sistemas Informáticos y Programación, Universidad Complutense de Madrid, 28040, Madrid, Spain
Ricardo Peña
Software Engineering and Management, IT-University in Gothenburg, Box 8718, 40275, Gothenburg, Sweden
Thomas Arts

Cite this paper

Du Bois, A.R., Loidl, HW., Trinder, P. (2003). Thread Migration in a Parallel Graph Reducer. In: Peña, R., Arts, T. (eds) Implementation of Functional Languages. IFL 2002. Lecture Notes in Computer Science, vol 2670. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44854-3_13

DOI: https://doi.org/10.1007/3-540-44854-3_13
Published: 18 June 2003
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40190-2
Online ISBN: 978-3-540-44854-9
eBook Packages: Springer Book Archive
