Recoverable mutual exclusion

Golab, Wojciech; Ramaraju, Aditya

doi:10.1007/s00446-019-00364-0

Published: 05 November 2019

Distributed Computing, Volume 32, pages 535–564 (2019)

Abstract

Mutex locks have traditionally been the most common mechanism for protecting shared data structures in concurrent programs. However, the robustness of such locks against process failures has not been studied thoroughly. The vast majority of mutex algorithms are designed around the assumption that processes are reliable, meaning that a process may not fail while executing the lock acquisition and release code, or while inside the critical section. If such a failure does occur, then the liveness properties of a conventional mutex lock may cease to hold until the application or operating system intervenes by cleaning up the internal structure of the lock. For example, a process that is attempting to acquire an otherwise starvation-free mutex may be blocked forever waiting for a failed process to release the critical section. Adding to the difficulty, if the failed process recovers and attempts to acquire the same mutex again without appropriate cleanup, then the mutex may become corrupted to the point where it loses safety, notably the mutual exclusion property. We address this challenge by formalizing the problem of recoverable mutual exclusion, and proposing several solutions that vary both in their assumptions regarding hardware support for synchronization, and in their efficiency. Compared to known solutions, our algorithms are more robust as they do not restrict where or when a process may crash, and provide stricter guarantees in terms of efficiency, which we define in terms of remote memory references.
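
The blocking scenario described above can be illustrated with a small sketch. The code below is not taken from the paper: it assumes a hypothetical conventional test-and-set spin lock (`TasLock`), models a crash as leaving the critical section without releasing the lock, and uses an assumed `max_spins` bound so that the example terminates instead of spinning forever.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical conventional (non-recoverable) test-and-set spin lock.
struct TasLock {
    std::atomic<bool> held{false};

    // Spin at most max_spins times; return true if the lock was acquired.
    bool try_acquire(int max_spins) {
        for (int i = 0; i < max_spins; ++i) {
            // exchange returns the previous value: false means we got the lock.
            if (!held.exchange(true, std::memory_order_acquire)) {
                return true;
            }
        }
        return false;
    }

    void release() { held.store(false, std::memory_order_release); }
};

// Models a process that enters the critical section and then fails:
// it never runs release(), so the lock word stays set.
void crash_in_critical_section(TasLock& lock) {
    bool acquired = lock.try_acquire(1);
    assert(acquired);  // the lock was free, so the first attempt succeeds
    (void)acquired;    // silence unused-variable warnings when NDEBUG is set
    // Simulated crash: return without calling lock.release().
}

// Returns true if a surviving process fails to acquire the lock
// after the other process crashed inside the critical section.
bool survivor_is_blocked(int max_spins) {
    TasLock lock;
    crash_in_critical_section(lock);
    return !lock.try_acquire(max_spins);
}

// Returns true if the survivor acquires the lock when the holder
// releases it normally instead of crashing.
bool survivor_succeeds_after_release(int max_spins) {
    TasLock lock;
    bool first = lock.try_acquire(1);
    lock.release();
    return first && lock.try_acquire(max_spins);
}
```

In this sketch, `survivor_is_blocked` returns true for any finite `max_spins`, which corresponds to waiting forever in an unbounded spin loop, whereas `survivor_succeeds_after_release` shows that the same lock behaves normally when no failure occurs.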

Notes

The term bounded in reference to a piece of code means that there exists a function f of the number of processes N such that the code performs at most f(N) shared memory operations in all executions of the algorithm instantiated for N processes.
As explained later on in the model near the discussion of First-Come-First-Served fairness, we assume that the doorway is well-defined and bounded only in a subset of execution histories that are relevant to our weaker notion of FCFS.
In a practical implementation, the code of and can be packaged in a single procedure for simplicity.
The term cleanup-concurrent defined in the conference version of this paper [20] is analogous to 1-failure-concurrent in this model.
The Bounded Recovery property defined in the conference version of this paper [20] is analogous to 1-BR in this model.
Despite the prevalence of cache-coherent architectures, the DSM model remains important in practice because of its inherent scalability. Intel’s Single-chip Cloud Computer, for example, sacrifices cache-coherence “to simplify the design, reduce power consumption and to encourage the exploration of datacenter distributed memory software models” [26].
The RMR complexity of is unbounded if F does not exist for a given history H.
The “\(\wedge\)” operator at line 94 should be interpreted like && in C++, meaning that the right operand is evaluated only if the left operand is true.
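
The short-circuit behaviour described in the last note can be checked with a small C++ sketch. The names `Probe`, `right_evaluations`, `Node`, and `is_locked` are hypothetical and only serve to show when the right operand of && is evaluated; nothing here is taken from the paper's pseudocode.

```cpp
// Hypothetical helper that counts how many times its right() method runs.
struct Probe {
    int calls = 0;
    bool right() {
        ++calls;
        return true;
    }
};

// Evaluates (left && p.right()) once and reports how many times
// the right operand was actually evaluated.
int right_evaluations(bool left) {
    Probe p;
    bool result = left && p.right();
    (void)result;  // only the evaluation count matters here
    return p.calls;
}

// Hypothetical node type used to show a common use of short-circuiting:
// the pointer is dereferenced only after the null check has passed.
struct Node {
    bool locked;
};

bool is_locked(const Node* n) {
    return n != nullptr && n->locked;
}
```

Because the right operand is skipped whenever the left operand is false, `is_locked(nullptr)` is safe to call: the dereference `n->locked` is never reached.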

References

1. Afek, Y., Greenberg, D.S., Merritt, M., Taubenfeld, G.: Computing with faulty shared objects. J. ACM 42(6), 1231–1274 (1995)
2. Anderson, J., Kim, Y.-J.: A new fast-path mechanism for mutual exclusion. Distrib. Comput. 14(1), 17–29 (2001)
3. Anderson, J., Kim, Y.-J.: An improved lower bound for the time complexity of mutual exclusion. Distrib. Comput. 15(4), 221–253 (2002)
4. Anderson, J., Kim, Y.-J., Herman, T.: Shared-memory mutual exclusion: major research trends since 1986. Distrib. Comput. 16(2–3), 75–110 (2003)
5. Anderson, T.: The performance of spin lock alternatives for shared-memory multiprocessors. IEEE Trans. Parallel Distrib. Syst. 1(1), 6–16 (1990)
6. Attiya, H., Hendler, D., Woelfel, P.: Tight RMR lower bounds for mutual exclusion and other problems. In: Proceedings of the 40th ACM symposium on theory of computing (STOC), pp. 217–226 (2008)
7. Bender, M.A., Gilbert, S.: Mutual exclusion with \(O(\log^{2}\log n)\) amortized work. In: Proceedings of the 52nd symposium on foundations of computer science (FOCS), pp. 728–737 (2011)
8. Bohannon, P., Lieuwen, D.F., Silberschatz, A.: Recovering scalable spin locks. In: Proceedings of the 8th IEEE symposium on parallel and distributed processing (SPDP), pp. 314–322 (1996)
9. Bohannon, P., Lieuwen, D.F., Silberschatz, A., Sudarshan, S., Gava, J.: Recoverable user-level mutual exclusion. In: Proceedings of the 7th IEEE symposium on parallel and distributed processing (SPDP), pp. 293–301 (1995)
10. Burns, J.E., Lynch, N.A.: Bounds on shared memory for mutual exclusion. Inf. Comput. 107(2), 171–184 (1993)
11. Cypher, R.: The communication requirements of mutual exclusion. In: Proceedings of the 7th ACM symposium on parallel algorithms and architectures (SPAA), pp. 147–156 (1995)
12. Dijkstra, E.W.: Solution of a problem in concurrent programming control. Commun. ACM 8(9), 569 (1965)
13. Dijkstra, E.W.: Self-stabilizing systems in spite of distributed control. Commun. ACM 17(11), 643–644 (1974)
14. Fan, R., Lynch, N.: An \(\Omega(n \log n)\) lower bound on the cost of mutual exclusion. In: Proceedings of the 25th ACM symposium on principles of distributed computing (PODC), pp. 275–284 (2006)
15. Giakkoupis, G., Woelfel, P.: Randomized mutual exclusion with constant amortized RMR complexity on the DSM. In: Proceedings of the 55th symposium on foundations of computer science (FOCS), pp. 504–513 (2014)
16. Gibbons, P.B.: How emerging memory technologies will have you rethinking algorithm design. In: Proceedings of the 35th ACM symposium on principles of distributed computing (PODC), p. 303 (2016)
17. Golab, W., Hadzilacos, V., Hendler, D., Woelfel, P.: RMR-efficient implementations of comparison primitives using read and write operations. Distrib. Comput. 25(2), 109–162 (2012)
18. Golab, W., Hendler, D.: Recoverable mutual exclusion in sub-logarithmic time. In: Proceedings of the 36th annual ACM symposium on principles of distributed computing (PODC), pp. 211–220 (2017)
19. Golab, W., Hendler, D.: Recoverable mutual exclusion under system-wide failures. In: Proceedings of the 37th annual ACM symposium on principles of distributed computing (PODC), pp. 17–26 (2018)
20. Golab, W., Ramaraju, A.: Recoverable mutual exclusion. In: Proceedings of the 35th ACM symposium on principles of distributed computing (PODC), pp. 65–74 (2016)
21. Graunke, G., Thakkar, S.: Synchronization algorithms for shared-memory multiprocessors. IEEE Comput. 23(6), 60–69 (1990)
22. Gray, J., Reuter, A.: Transaction processing: concepts and techniques. Morgan Kaufmann, Burlington (1993)
23. Hendler, D., Woelfel, P.: Randomized mutual exclusion with sub-logarithmic RMR-complexity. Distrib. Comput. 24(1), 3–19 (2011)
24. Herlihy, M.: Wait-free synchronization. ACM Trans. Program. Lang. Syst. 13(1), 124–149 (1991)
25. Hoepman, J.-H., Papatriantafilou, M., Tsigas, P.: Self-stabilization of wait-free shared memory objects. In: Proceedings of the 9th international workshop on distributed algorithms (WDAG), pp. 273–287 (1995)
26. Intel Corporation. Single-chip cloud computer. http://www.intel.com/content/dam/www/public/us/en/documents/technology-briefs/intel-labs-single-chip-cloud-overview-paper.pdf. Accessed 31 Oct 2019
27. Jayanti, P.: F-arrays: implementation and applications. In: Proceedings of the 21st annual ACM symposium on principles of distributed computing (PODC), pp. 270–279 (2002)
28. Jayanti, P., Chandra, T., Toueg, S.: Fault-tolerant wait-free shared objects. J. ACM 45(3), 451–500 (1998)
29. Jayanti, P., Joshi, A.: Recoverable FCFS mutual exclusion with wait-free recovery. In: Proceedings of the 31st international symposium on distributed computing (DISC), pp. 30:1–30:15 (2017)
30. Jayanti, P., Jayanti, S., Joshi, A.: A recoverable Mutex algorithm with sub-logarithmic RMR on both CC and DSM. In: Proceedings of the 38th annual ACM symposium on principles of distributed computing (PODC), pp. 177–186 (2019)
31. Johnen, C., Higham, L.: Fault-tolerant implementations of regular registers by safe registers with applications to networks. In: Proceedings of 10th international conference of distributed computing and networking (ICDCN), pp. 337–348 (2009)
32. Kim, Y.-J., Anderson, J.H.: A space- and time-efficient local-spin spin lock. Inf. Process. Lett. 84(1), 47–55 (2002)
33. Kessels, J.: Arbitration without common modifiable variables. Acta Informatica 17, 135–141 (1982)
34. Lamport, L.: A new solution of Dijkstra’s concurrent programming problem. Commun. ACM 17(8), 453–455 (1974)
35. Lamport, L.: The mutual exclusion problem: part I—a theory of interprocess communication. J. ACM 33(2), 313–326 (1986)
36. Lamport, L.: The mutual exclusion problem: part II—statement and solutions. J. ACM 33(2), 327–348 (1986)
37. Lamport, L.: A fast mutual exclusion algorithm. ACM Trans. Comput. Syst. 5(1), 1–11 (1987)
38. Magnusson, P., Landin, A., Hagersten, E.: Queue locks on cache coherent multiprocessors. In: Proceedings of the 8th international parallel processing symposium (IPPS), pp. 165–171 (1994)
39. Mellor-Crummey, J., Scott, M.: Algorithms for scalable synchronization on shared-memory multiprocessors. ACM Trans. Comput. Syst. 9(1), 21–65 (1991)
40. Michael, M., Kim, Y.: Fault tolerant mutual exclusion locks for shared memory systems. US Patent (2009)
41. Mittal, S., Vetter, J.S.: A survey of software techniques for using non-volatile memories for storage and main memory systems. IEEE Trans. Parallel Distrib. Syst. 27(5), 1537–1550 (2016)
42. Mogul, J.C., Argollo, E., Shah, M.A., Faraboschi, P.: Operating system support for NVM + DRAM hybrid main memory. In: Proceedings of the 12th workshop on hot topics in operating systems (HotOS) (2009)
43. Moscibroda, T., Oshman, R.: Resilience of mutual exclusion algorithms to transient memory faults. In: Proceedings of the 30th ACM symposium on principles of distributed computing (PODC), pp. 69–78 (2011)
44. Narayanan, D., Hodson, O.: Whole-system persistence. In: Proceedings of the 17th international conference on architectural support for programming languages and operating systems (ASPLOS), pp. 401–410 (2012)
45. Ramaraju, A.: RGLock: Recoverable mutual exclusion for non-volatile main memory systems. Master’s thesis, University of Waterloo (2015). https://uwspace.uwaterloo.ca/handle/10012/9473. Accessed 31 Oct 2019
46. Raynal, M.: Algorithms for Mutual Exclusion. MIT Press, Cambridge (1986)
47. Scott, M., Scherer, W.: Scalable queue-based spin locks with timeout. In: Proceedings of the 8th ACM SIGPLAN symposium on principles and practices of parallel programming (PPoPP), pp. 44–52 (2001)
48. Taubenfeld, G.: Synchronization Algorithms and Concurrent Programming. Prentice Hall, Upper Saddle (2006)
49. Yang, J.-H., Anderson, J.: A fast, scalable mutual exclusion algorithm. Distrib. Comput. 9(1), 51–60 (1995)

Acknowledgements

Sincere thanks to Peter Buhr, Patrick Lam, and the anonymous referees of PODC’16 and Distributed Computing for detailed feedback and helpful suggestions on earlier drafts of this work. We are grateful also to Vassos Hadzilacos, Danny Hendler, Prasad Jayanti, Gadi Taubenfeld, and Sam Toueg for stimulating technical discussions.

Author information

Authors and Affiliations

Department of Electrical and Computer Engineering, University of Waterloo, Waterloo, Canada
Wojciech Golab & Aditya Ramaraju

Corresponding author

Correspondence to Wojciech Golab.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This research is supported in part by the Natural Sciences and Engineering Research Council (NSERC) of Canada, Discovery Grants Program; the Ontario Early Researcher Awards Program; and the Google Faculty Research Awards Program.

About this article

Cite this article

Golab, W., Ramaraju, A. Recoverable mutual exclusion. Distrib. Comput. 32, 535–564 (2019). https://doi.org/10.1007/s00446-019-00364-0

Received: 30 September 2016
Accepted: 20 October 2019
Published: 05 November 2019
Issue Date: December 2019
DOI: https://doi.org/10.1007/s00446-019-00364-0
