Abstract
Reaching agreement among processes sharing read/write memory is possible only in the presence of an eventual unique leader. A leader that fails must be recoverable, but a live and well-performing leader should never be dethroned. This paper presents the first leader algorithm in shared memory environments that guarantees an eventual leader following global stabilization time. The construction is built from light-weight lease and renew primitives. The implementation is simple yet efficient. It is uniform, in the sense that the number of processes potentially contending for leadership is not known a priori.
Cite this article
Chockler, G., Malkhi, D. Light-Weight Leases for Storage-Centric Coordination. Int J Parallel Prog 34, 143–170 (2006). https://doi.org/10.1007/s10766-006-0008-z
DOI: https://doi.org/10.1007/s10766-006-0008-z