A scalable implementation of barrier synchronization using an adaptive combining tree

Gupta, Rajiv; Hill, Charles R.

doi:10.1007/BF01407897

Published: June 1989

Volume 18, pages 161–180, (1989)

International Journal of Parallel Programming

Abstract

Barrier synchronization is commonly used for synchronizing processors prior to a join operation and to enforce data dependencies during the execution of parallelized loops. Simple software implementations of barrier synchronization can result in memory hot-spots, especially in large-scale shared-memory multiprocessors containing hundreds of processors and memory modules communicating through an interconnection network. A software combining tree can be used to substantially reduce memory contention due to hot-spots. However, such an implementation results in O(log n) latency in recognition of barrier synchronization, where n is the number of processors. In this paper, an adaptive software combining tree is used to implement a scalable barrier with O(1) recognition latency. The processors that arrive early at the barrier adapt the combining tree so that it has a structure appropriate for reducing the latency for the processors that arrive later. We also show how adaptive combining trees can be used to implement the fuzzy barrier. The fuzzy barrier mechanism reduces the idling of processors at the barriers by allowing the processors to execute useful instructions while they are waiting at the barrier.
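
To make the O(log n) baseline concrete, the following is a minimal Python sketch of a static software combining tree barrier. It is not the adaptive scheme of the paper, whose details the abstract does not give. All names here (CombiningTreeBarrier, arrive, await_release, run_rounds, fan_in) are illustrative assumptions. Each thread increments a counter at its leaf, and only the last arriver at a node climbs to the parent, so contention is spread over many counters, but the final arriver touches one node per level, i.e. depth nodes, which grows as log n. Releasing all waiters through a single condition variable is a simplification chosen for brevity. Splitting the barrier into arrive and await_release loosely mimics, in software, the fuzzy barrier idea of doing useful work between arrival and release.

```python
import threading


class _Node:
    """One counter in the combining tree."""

    def __init__(self, expected):
        self.expected = expected  # arrivals needed to complete this node
        self.count = 0
        self.parent = None
        self.lock = threading.Lock()


class CombiningTreeBarrier:
    """Static software combining tree barrier (illustrative sketch)."""

    def __init__(self, n, fan_in=2):
        if n < 1 or fan_in < 2:
            raise ValueError("need n >= 1 and fan_in >= 2")
        self.fan_in = fan_in
        levels, width = [], n
        while True:  # build the tree level by level, leaves first
            num_nodes = (width + fan_in - 1) // fan_in
            levels.append([_Node(min(fan_in, width - i * fan_in))
                           for i in range(num_nodes)])
            if num_nodes == 1:
                break
            width = num_nodes
        for lower, upper in zip(levels, levels[1:]):
            for i, node in enumerate(lower):
                node.parent = upper[i // fan_in]
        self.leaves = levels[0]
        self.depth = len(levels)  # nodes visited by the final arriver
        self._released = threading.Condition()
        self._generation = 0

    def arrive(self, tid):
        """Signal arrival; returns a token for await_release()."""
        with self._released:
            token = self._generation
        node = self.leaves[tid // self.fan_in]
        while True:
            with node.lock:
                node.count += 1
                last = node.count == node.expected
                if last:
                    node.count = 0  # reset for the next barrier episode
            if not last:
                return token  # someone else will carry the news upward
            if node.parent is None:  # root completed: release everyone
                with self._released:
                    self._generation += 1
                    self._released.notify_all()
                return token
            node = node.parent

    def await_release(self, token):
        """Block until the episode identified by token has completed."""
        with self._released:
            while self._generation == token:
                self._released.wait()

    def wait(self, tid):
        """Conventional barrier: arrive, then block until released."""
        self.await_release(self.arrive(tid))


def run_rounds(n=8, rounds=5, fan_in=2):
    """Return True if no thread left a round before all n had arrived."""
    barrier = CombiningTreeBarrier(n, fan_in)
    arrived = [0] * rounds
    counter_lock = threading.Lock()
    ok = [True] * n

    def worker(tid):
        for r in range(rounds):
            with counter_lock:
                arrived[r] += 1
            token = barrier.arrive(tid)
            # Fuzzy-barrier style: independent work could run here.
            barrier.await_release(token)
            with counter_lock:
                if arrived[r] != n:
                    ok[tid] = False

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return all(ok)
```

With 8 threads and a fan-in of 2 the tree has three levels, so the last thread to arrive performs three counter updates before the barrier is recognized. That per-level cost is the latency the adaptive tree in the paper is designed to avoid.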

Author information

Authors and Affiliations

Philips Laboratories, North American Philips Corporation, 345 Scarborough Road, Briarcliff Manor, New York
Rajiv Gupta & Charles R. Hill

About this article

Cite this article

Gupta, R., Hill, C.R. A scalable implementation of barrier synchronization using an adaptive combining tree. Int J Parallel Prog 18, 161–180 (1989). https://doi.org/10.1007/BF01407897

Issue Date: June 1989
DOI: https://doi.org/10.1007/BF01407897
