DOI: 10.1145/2933057.2933087
research-article

Recoverable Mutual Exclusion: [Extended Abstract]

Published: 25 July 2016

ABSTRACT

Mutex locks have traditionally been the most common mechanism for protecting shared data structures in parallel programs. However, the robustness of such locks against process failures has not been studied thoroughly. Most (user-level) mutex algorithms are designed around the assumption that processes are reliable, meaning that a process may not fail while executing the lock acquisition and release code, or while inside the critical section.

If such a failure does occur, then the liveness properties of a conventional mutex lock may cease to hold until the application or operating system intervenes by cleaning up the internal structure of the lock. For example, a process that is attempting to acquire an otherwise starvation-free mutex may be blocked forever waiting for a failed process to release the critical section. Adding to the difficulty, if the failed process recovers and attempts to acquire the same mutex again without appropriate cleanup, then the mutex may become corrupted to the point where it loses safety, notably the mutual exclusion property.

We address this challenge by formalizing the problem of recoverable mutual exclusion, and proposing several solutions that vary both in their assumptions regarding hardware support for synchronization, and in their time complexity. Compared to known solutions, our algorithms are more robust, as they do not restrict where or when a process may crash, and provide stricter guarantees in terms of time complexity, which we define in terms of remote memory references.
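The liveness failure described above can be illustrated with a minimal, single-threaded sketch. It is not one of the paper's algorithms: `ToyTASLock`, `Crash`, `run_process`, and the `max_spins` bound are hypothetical names introduced only for this illustration, and the spin bound stands in for "blocked forever" so that the example terminates.

```python
class ToyTASLock:
    """Toy test-and-set lock; `held` stands in for a shared memory word."""

    def __init__(self):
        self.held = False

    def try_acquire(self):
        # Models an atomic test-and-set: succeed only if the word was clear.
        if self.held:
            return False
        self.held = True
        return True

    def release(self):
        self.held = False


class Crash(Exception):
    """Models a process failure (hypothetical, for illustration only)."""


def run_process(lock, crash_in_cs=False, max_spins=1000):
    """Return True if the critical section completed, False if starved."""
    spins = 0
    while not lock.try_acquire():
        spins += 1
        if spins >= max_spins:
            return False  # bounded stand-in for waiting forever
    if crash_in_cs:
        raise Crash()  # failure inside the critical section: no release
    lock.release()
    return True


lock = ToyTASLock()
try:
    run_process(lock, crash_in_cs=True)  # first process fails while holding the lock
except Crash:
    pass

survivor_ok = run_process(lock)   # a correct process can no longer enter
recovered_ok = run_process(lock)  # the recovered process retries without cleanup
```

In this sketch the lock word stays set after the simulated crash, so both the surviving process and the recovered process exhaust their spin budget. The toy lock records no owner, which is why nothing short of outside cleanup can tell a crashed holder from a slow one.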

Published in

PODC '16: Proceedings of the 2016 ACM Symposium on Principles of Distributed Computing
July 2016
508 pages
ISBN: 9781450339643
DOI: 10.1145/2933057

Copyright © 2016 ACM


Publisher

Association for Computing Machinery, New York, NY, United States


Acceptance Rates

PODC '16 paper acceptance rate: 40 of 149 submissions, 27%. Overall acceptance rate: 740 of 2,477 submissions, 30%.
