
Pipelined functional tree accesses and updates: scheduling, synchronization, caching and coherence

Published online by Cambridge University Press:  04 September 2001

ANDREW J. BENNETT
Affiliation:
Department of Computing, Imperial College, 180 Queen's Gate, London SW7 2BZ, UK (e-mail: p.kelly@doc.ic.ac.uk)
PAUL H. J. KELLY
Affiliation:
Department of Computing, Imperial College, 180 Queen's Gate, London SW7 2BZ, UK (e-mail: p.kelly@doc.ic.ac.uk)
ROSS A. PATERSON
Affiliation:
Department of Computer Science, City University, London, UK

Abstract


This paper is an exploration of the parallel graph reduction approach to parallel functional programming, illustrated by a particular example: pipelined, dynamically-scheduled implementation of search, updates and read-modify-write transactions on an in-store binary search tree. We use program transformation, execution-driven simulation and analytical modelling to expose the maximum potential parallelism, the minimum communication and synchronisation overheads, and to control the overall space requirement. We begin with a lazy functional program specifying a series of transactions on a binary tree, each involving several searches and updates, in a side-effect-free fashion. Transformation of the source code produces a formulation of the program with greater locality and larger grain size than can be achieved using naive parallelization methods, and we show that, with care, these tasks can be scheduled effectively. Even with a workload using random keys, significant spatial locality is found, and we evaluate a modified cache coherency protocol which avoids false sharing so that large cache lines can be used to minimise the number of messages required. As expected with a pipeline, the application should reach a steady state as soon as the first transaction is completed. However, if the network latency is too large, the rate of completion lags behind the rate at which work is admitted, and internal queues grow without bound. We determine the conditions under which this occurs, and show how it can be avoided while maximising speedup.
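To make the setting concrete, the following is a minimal Haskell sketch, written for this summary rather than taken from the paper, of the kind of side-effect-free tree-transaction program the abstract describes: a pure binary search tree with search and update, a read-modify-write transaction built from them, and a stream of transactions in which each transaction consumes the tree produced by its predecessor. All names are illustrative. Under lazy evaluation, a later transaction can begin descending the root of the tree before earlier transactions have finished rebuilding its lower levels, which is the source of the pipeline parallelism the paper exploits.

```haskell
module TreePipeline where

-- A purely functional binary search tree.
data Tree k v = Leaf | Node (Tree k v) k v (Tree k v)

-- Pure lookup.
search :: Ord k => k -> Tree k v -> Maybe v
search _ Leaf = Nothing
search k (Node l k' v r)
  | k < k'    = search k l
  | k > k'    = search k r
  | otherwise = Just v

-- Pure insert/update: rebuilds only the path from the root to the key.
update :: Ord k => k -> v -> Tree k v -> Tree k v
update k v Leaf = Node Leaf k v Leaf
update k v (Node l k' v' r)
  | k < k'    = Node (update k v l) k' v' r
  | k > k'    = Node l k' v' (update k v r)
  | otherwise = Node l k' v r

-- A read-modify-write transaction: look up a key, apply a function to the
-- old value, and write the result back, returning the old value as well.
readModifyWrite :: Ord k => k -> (Maybe v -> v) -> Tree k v -> (Maybe v, Tree k v)
readModifyWrite k f t = (old, update k (f old) t)
  where old = search k t

-- A series of transactions applied in sequence; the tree produced by one
-- transaction is the input to the next, forming the pipeline.
runTransactions :: Ord k => [Tree k v -> (a, Tree k v)] -> Tree k v -> ([a], Tree k v)
runTransactions []       t = ([], t)
runTransactions (tx:txs) t =
  let (r, t')   = tx t
      (rs, t'') = runTransactions txs t'
  in  (r : rs, t'')
```

In this sketch the pipelining arises for free from data dependences: each `update` rebuilds only the root-to-key path, so successive transactions that touch different subtrees can overlap. The transformations, scheduling, and coherence mechanisms studied in the paper address how to exploit that overlap on a parallel graph-reduction implementation while bounding queue growth and space usage.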

Type
Research Article
Copyright
© 2001 Cambridge University Press