
Pipelined functional tree accesses and updates: scheduling, synchronization, caching and coherence

Published online by Cambridge University Press:  04 September 2001

ANDREW J. BENNETT
Affiliation:
Department of Computing, Imperial College, 180 Queen's Gate, London SW7 2BZ, UK (e-mail: p.kelly@doc.ic.ac.uk)
PAUL H. J. KELLY
Affiliation:
Department of Computing, Imperial College, 180 Queen's Gate, London SW7 2BZ, UK (e-mail: p.kelly@doc.ic.ac.uk)
ROSS A. PATERSON
Affiliation:
Department of Computer Science, City University, London, UK

Abstract


This paper is an exploration of the parallel graph reduction approach to parallel functional programming, illustrated by a particular example: pipelined, dynamically-scheduled implementation of search, updates and read-modify-write transactions on an in-store binary search tree. We use program transformation, execution-driven simulation and analytical modelling to expose the maximum potential parallelism, the minimum communication and synchronisation overheads, and to control the overall space requirement. We begin with a lazy functional program specifying a series of transactions on a binary tree, each involving several searches and updates, in a side-effect-free fashion. Transformation of the source code produces a formulation of the program with greater locality and larger grain size than can be achieved using naive parallelization methods, and we show that, with care, these tasks can be scheduled effectively. Even with a workload using random keys, significant spatial locality is found, and we evaluate a modified cache coherency protocol which avoids false sharing so that large cache lines can be used to minimise the number of messages required. As expected with a pipeline, the application should reach a steady state as soon as the first transaction is completed. However, if the network latency is too large, the rate of completion lags behind the rate at which work is admitted, and internal queues grow without bound. We determine the conditions under which this occurs, and show how it can be avoided while maximising speedup.
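To make the setting concrete, the following is a minimal Haskell sketch, written for this summary rather than taken from the paper, of the kind of side-effect-free tree-transaction program the abstract describes: a pure binary search tree with search and update, a read-modify-write transaction built from them, and a stream of transactions in which each transaction consumes the tree produced by its predecessor. All names are illustrative. Under lazy evaluation, a later transaction can begin descending the root of the tree before earlier transactions have finished rebuilding its lower levels, which is the source of the pipeline parallelism the paper exploits.

```haskell
module TreePipeline where

-- A purely functional binary search tree.
data Tree k v = Leaf | Node (Tree k v) k v (Tree k v)

-- Pure lookup.
search :: Ord k => k -> Tree k v -> Maybe v
search _ Leaf = Nothing
search k (Node l k' v r)
  | k < k'    = search k l
  | k > k'    = search k r
  | otherwise = Just v

-- Pure insert/update: rebuilds only the path from the root to the key.
update :: Ord k => k -> v -> Tree k v -> Tree k v
update k v Leaf = Node Leaf k v Leaf
update k v (Node l k' v' r)
  | k < k'    = Node (update k v l) k' v' r
  | k > k'    = Node l k' v' (update k v r)
  | otherwise = Node l k' v r

-- A read-modify-write transaction: look up a key, apply a function to the
-- old value, and write the result back, returning the old value as well.
readModifyWrite :: Ord k => k -> (Maybe v -> v) -> Tree k v -> (Maybe v, Tree k v)
readModifyWrite k f t = (old, update k (f old) t)
  where old = search k t

-- A series of transactions applied in sequence; the tree produced by one
-- transaction is the input to the next, forming the pipeline.
runTransactions :: Ord k => [Tree k v -> (a, Tree k v)] -> Tree k v -> ([a], Tree k v)
runTransactions []       t = ([], t)
runTransactions (tx:txs) t =
  let (r, t')   = tx t
      (rs, t'') = runTransactions txs t'
  in  (r : rs, t'')
```

In this sketch the pipelining arises for free from data dependences: each `update` rebuilds only the root-to-key path, so successive transactions that touch different subtrees can overlap. The transformations, scheduling, and coherence mechanisms studied in the paper address how to exploit that overlap on a parallel graph-reduction implementation while bounding queue growth and space usage.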

Type
Research Article
Copyright
© 2001 Cambridge University Press