DOI: 10.1145/3278681.3278713
research-article

A preliminary study of minimal-contention locks

Published: 26 September 2018

ABSTRACT

As multicore CPUs become more common, scalable synchronization primitives are more widely useful, and ideas previously applied in large-scale computation are worth revisiting. In this paper I explore one approach to scalable synchronization, a minimal-contention lock (M-lock). The key idea is to avoid spinning on a global variable; instead, each blocked task (process or thread) spins on a local lock representing the task that immediately preceded it in attempting to acquire the lock. This orders tasks by when they attempted to acquire the lock, which prevents starvation. The only globally shared variable is a pointer to the next local lock to be contended for. Each contending task swaps the value of this pointer for a pointer to its own lock variable and then spins on the variable that the global pointer previously referenced. Each waiting task therefore spins on a lock variable seen only by itself and the owner of that variable. While a task is spinning, the lock variable can be held in its local cache until it is invalidated by the lock owner unsetting the lock. Consequently, the amount of bus traffic is considerably less than with a spinlock, which has the pernicious feature that the task releasing the lock is delayed by all the other bus traffic arising from contention for the lock. An MCS lock has similar properties but is more complicated and requires more operations that cause memory contention. This paper outlines the design of the M-lock and provides a preliminary performance analysis.
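The swap-and-spin scheme described in the abstract can be sketched in C11 roughly as follows. This is an illustrative sketch, not the paper's implementation: the names `mslot`, `mlock`, `mtask` and all function names are hypothetical, and the way a task reuses its predecessor's slot after releasing is an assumption borrowed from the usual queue-lock construction, since the abstract does not say how lock variables are recycled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; a sketch under the assumptions stated above. */

/* One local lock variable per task; true means its owner holds or wants the lock. */
typedef struct { atomic_bool locked; } mslot;

/* The only globally shared variable: the slot the next arrival must wait on. */
typedef struct { _Atomic(mslot *) tail; } mlock;

/* Per-task state: the slot this task publishes and the predecessor slot it spins on. */
typedef struct { mslot *mine; mslot *pred; } mtask;

/* The lock starts out pointing at an unlocked dummy slot. */
void mlock_init(mlock *l, mslot *initial) {
    atomic_init(&initial->locked, false);
    atomic_init(&l->tail, initial);
}

void mtask_init(mtask *t, mslot *slot) {
    atomic_init(&slot->locked, false);
    t->mine = slot;
    t->pred = NULL;
}

/* Mark our slot as locked, then swap it into the global pointer.
 * The value swapped out is the slot of the task that arrived just before us. */
void mlock_enqueue(mlock *l, mtask *t) {
    atomic_store_explicit(&t->mine->locked, true, memory_order_relaxed);
    t->pred = atomic_exchange_explicit(&l->tail, t->mine, memory_order_acq_rel);
}

/* True while the predecessor still holds (or is waiting for) the lock. */
bool mlock_must_wait(const mtask *t) {
    return atomic_load_explicit(&t->pred->locked, memory_order_acquire);
}

/* Spin only on the predecessor's slot, which can stay in the local cache
 * until the predecessor writes to it on release. */
void mlock_acquire(mlock *l, mtask *t) {
    mlock_enqueue(l, t);
    while (mlock_must_wait(t)) {
        /* busy-wait */
    }
}

/* Unset our slot to hand the lock to our successor. Our old slot may still be
 * read by that successor, so we adopt the predecessor's slot for next time
 * (assumed recycling scheme). */
void mlock_release(mtask *t) {
    mslot *old = t->mine;
    assert(atomic_load_explicit(&old->locked, memory_order_relaxed));
    t->mine = t->pred;
    t->pred = NULL;
    atomic_store_explicit(&old->locked, false, memory_order_release);
}
```

Under these assumptions, N tasks need N + 1 slots (one per task plus the initial dummy), and each slot is best placed on its own cache line so that one task's spinning does not share a line with another task's slot.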

Published in

SAICSIT '18: Proceedings of the Annual Conference of the South African Institute of Computer Scientists and Information Technologists
September 2018
362 pages
ISBN: 9781450366472
DOI: 10.1145/3278681

Copyright © 2018 ACM

Publisher

Association for Computing Machinery
New York, NY, United States

Acceptance Rates

Overall Acceptance Rate: 187 of 439 submissions, 43%