skip to main content
10.1145/2926697.2926699acmconferencesArticle/Chapter ViewAbstractPublication PagesismmConference Proceedingsconference-collections
research-article

Fast non-intrusive memory reclamation for highly-concurrent data structures

Published: 14 June 2016 Publication History

Abstract

Current memory reclamation mechanisms for highly-concurrent data structures present an awkward trade-off. Techniques such as epoch-based reclamation perform well when all threads are running on dedicated processors, but the delay or failure of a single thread will prevent any other thread from reclaiming memory. Alternatives such as hazard pointers are highly robust, but they are expensive because they require a large number of memory barriers. This paper proposes three novel ways to alleviate the costs of the memory barriers associated with hazard pointers and related techniques. These new proposals are backward-compatible with existing code that uses hazard pointers. They move the cost of memory management from the principal code path to the infrequent memory reclamation procedure, significantly reducing or eliminating memory barriers executed on the principal code path. These proposals include (1) exploiting the operating system's memory protection ability, (2) exploiting certain x86 hardware features to trigger memory barriers only when needed, and (3) a novel hardware-assisted mechanism, called a hazard lookaside buffer (HLB) that allows a reclaiming thread to query whether there are hazardous pointers that need to be flushed to memory. We evaluate our proposals using a few fundamental data structures (linked lists and skiplists) and libcuckoo, a recent high-throughput hash-table library, and show significant improvements over the hazard pointer technique.

References

[1]
D. Alistarh, P. Eugster, M. Herlihy, A. Matveev, and N. Shavit. Stacktrack: An automated transactional approach to concurrent memory reclamation. In Proceedings of the Ninth European Conference on Computer Systems (EuroSys), 2014.
[2]
D. Alistarh, W. M. Leiserson, A. Matveev, and N. Shavit. Threadscan: Automatic and scalable memory reclamation. In Proceedings of the 27th ACM on Symposium on Parallelism in Algorithms and Architectures (SPAA), pages 123–132, 2015.
[3]
M. M. Bach, M. Charney, R. Cohn, E. Demikhovsky, T. Devor, K. Hazelwood, A. Jaleel, C.-K. Luk, G. Lyons, H. Patil, and A. Tal. Analyzing parallel programs with pin. Computer, 43(3):34–41, Mar. 2010.
[4]
R. Bayer and M. Schkolnick. Concurrency of operations on b-trees. Acta Informatica, 9:1–21, 1977.
[5]
A. Braginsky, A. Kogan, and E. Petrank. Drop the anchor: Lightweight memory management for non-blocking data structures. In Proceedings of the Twenty-fifth Annual ACM Symposium on Parallelism in Algorithms and Architectures (SPAA), pages 33–42, 2013.
[6]
T. A. Brown. Reclaiming memory for lock-free data structures: There has to be a better way. In Proceedings of the 2015 ACM Symposium on Principles of Distributed Computing (PODC), pages 261–270, 2015.
[7]
N. Cohen and E. Petrank. Efficient memory management for lock-free data structures with optimistic access. In Proceedings of the 27th ACM on Symposium on Parallelism in Algorithms and Architectures (SPAA), pages 254–263, 2015.
[8]
J. Corbet. sys_membarrier(). https://lwn.net/Articles/ 369567, Jan. 2010. Date Accessed: February 10, 2016.
[9]
D. Dice. Qpi quiescence. http://blogs.oracle.com/ dave/entry/qpi_quiescence, Feb. 2010. Date Accessed: November 11, 2015.
[10]
D. Dice, H. Huang, and M. Yang. Asymmetric Dekker synchronization. Technical report, Sun Microsystems, 2001.
[11]
D. Dice, H. Huang, and M. Yang. Techniques for accessing a shared resource using an improved synchronization mechanism, 2004. US Patent 7644409 B2.
[12]
D. Dice, M. S. Moir, and W. N. Scherer, III. Quickly reacquirable locks, 2002. US Patent 7814488 B1.
[13]
A. Dragojevic, M. Herlihy, Y. Lev, and M. Moir. On the power of hardware transactional memory to simplify memory management. In Proceedings of the 30th Annual ACM Symposium on Principles of Distributed Computing (PODC), pages 99–108, 2011.
[14]
M. Fomitchev and E. Ruppert. Lock-free linked lists and skip lists. In Proceedings of the Twenty-third Annual ACM Symposium on Principles of Distributed Computing, PODC ’04, pages 50–59, New York, NY, USA, 2004. ACM.
[15]
K. Fraser. Practical lock-freedom. Technical Report UCAMCL-TR-579, University of Cambridge, Computer Laboratory, Feb. 2004.
[16]
E. Gidron, I. Keidar, D. Perelman, and Y. Perez. SALSA: Scalable and Low Synchronization NUMA-aware Algorithm for Producer-consumer Pools. In Proceedings of the Twentyfourth Annual ACM Symposium on Parallelism in Algorithms and Architectures (SPAA), pages 151–160, 2012.
[17]
T. Harris. A pragmatic implementation of non-blocking linkedlists. In Proceedings of 15th International Symposium on Distributed Computing (DISC 2001), Lisbon, Portugal, volume 2180 of Lecture Notes in Computer Science, pages 300—314. Springer Verlag, Oct. 2001.
[18]
T. E. Hart, P. E. McKenney, A. D. Brown, and J. Walpole. Performance of memory reclamation for lockless synchronization. J. Parallel Distrib. Comput., 67(12):1270–1285, Dec. 2007.
[19]
S. Heller, M. Herlihy, V. Luchangco, M. Moir, W. N. S. III, and N. Shavit. A lazy concurrent list-based set algorithm. In J. H. Anderson, G. Prencipe, and R. Wattenhofer, editors, Proceedings of the 9th International Conference on Principles of Distributed Systems (OPODIS 2005), Revised Selected Papers, volume 3974 of Lecture Notes in Computer Science, pages 3–16. Springer, 2006.
[20]
M. Herlihy, Y. Lev, V. Luchangco, and N. Shavit. A simple optimistic skiplist algorithm. In SIROCCO, pages 124–138, 2007.
[21]
M. Herlihy, V. Luchangco, and M. Moir. The repeat offender problem: A mechanism for supporting dynamic-sized, lockfree data structures. In Proceedings of the 16th International Conference on Distributed Computing (DISC), pages 339–353, 2002.
[22]
M. Herlihy and N. Shavit. The Art of Multiprocessor Programming. Morgan Kaufmann, Mar. 2008.
[23]
G. C. Hunt, M. M. Michael, S. Parthasarathy, and M. L. Scott. An efficient algorithm for concurrent priority queue heaps. Inf. Process. Lett., 60(3):151–157, 1996.
[24]
K. Kawachiya, A. Koseki, and T. Onodera. Lock reservation: Java locks can mostly do without atomic operations. SIGPLAN Not., 37(11):130–141, 2002.
[25]
X. Li, D. G. Andersen, M. Kaminsky, and M. J. Freedman. Algorithmic improvements for fast concurrent cuckoo hashing. In Proceedings of the European Conference on Computer Systems (EuroSys), pages 1–14, 2014.
[26]
R. Maddox, G. Singh, and R. Safranek. Weaving High Performance Multiprocessor Fabric. Intel Press, 2009.
[27]
P. McKenney and J. Slingwine. Read-Copy Update: Using execution history to solve concurrency problems. Parallel and Distributed Computing and Systems, pages 509–518, 1998.
[28]
M. M. Michael. High performance dynamic lock-free hash tables and list-based sets. In Proceedings of the fourteenth annual ACM symposium on Parallel algorithms and architectures, pages 73–82. ACM Press, 2002.
[29]
M. M. Michael. Hazard pointers: Safe memory reclamation for lock-free objects. IEEE Trans. Parallel Distrib. Syst., 15:491– 504, June 2004.
[30]
A. Morrison and Y. Afek. Fence-free work stealing on bounded tso processors. In Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), pages 413–426, 2014.
[31]
A. Morrison and Y. Afek. Temporally bounding TSO for fencefree asymmetric synchronization. In Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), pages 45–58, 2015.
[32]
O. Shalev and N. Shavit. Split-ordered lists: Lock-free extensible hash tables. In The 22nd Annual ACM Symposium on Principles of Distributed Computing, pages 102–111. ACM Press, 2003.

Cited By

View all
  • (2024)Are Your Epochs Too Epic? Batch Free Can Be HarmfulProceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming10.1145/3627535.3638491(30-41)Online publication date: 2-Mar-2024
  • (2024)Expediting Hazard Pointers with Bounded RCU Critical SectionsProceedings of the 36th ACM Symposium on Parallelism in Algorithms and Architectures10.1145/3626183.3659941(1-13)Online publication date: 17-Jun-2024
  • (2024)Simple, Fast and Widely Applicable Concurrent Memory Reclamation via NeutralizationIEEE Transactions on Parallel and Distributed Systems10.1109/TPDS.2023.333567135:2(203-220)Online publication date: Feb-2024
  • Show More Cited By

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Conferences
ISMM 2016: Proceedings of the 2016 ACM SIGPLAN International Symposium on Memory Management
June 2016
133 pages
ISBN:9781450343176
DOI:10.1145/2926697
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 14 June 2016

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. concurrent data structures
  2. hazard pointers
  3. memory barriers
  4. memory reclamation

Qualifiers

  • Research-article

Conference

ISMM '16
Sponsor:

Acceptance Rates

Overall Acceptance Rate 72 of 156 submissions, 46%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)17
  • Downloads (Last 6 weeks)1
Reflects downloads up to 17 Jan 2025

Other Metrics

Citations

Cited By

View all
  • (2024)Are Your Epochs Too Epic? Batch Free Can Be HarmfulProceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming10.1145/3627535.3638491(30-41)Online publication date: 2-Mar-2024
  • (2024)Expediting Hazard Pointers with Bounded RCU Critical SectionsProceedings of the 36th ACM Symposium on Parallelism in Algorithms and Architectures10.1145/3626183.3659941(1-13)Online publication date: 17-Jun-2024
  • (2024)Simple, Fast and Widely Applicable Concurrent Memory Reclamation via NeutralizationIEEE Transactions on Parallel and Distributed Systems10.1109/TPDS.2023.333567135:2(203-220)Online publication date: Feb-2024
  • (2023)The ERA Theorem for Safe Memory ReclamationProceedings of the 2023 ACM Symposium on Principles of Distributed Computing10.1145/3583668.3594564(102-112)Online publication date: 19-Jun-2023
  • (2023)The ERA Theorem for Safe Memory ReclamationProceedings of the 28th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming10.1145/3572848.3577491(435-437)Online publication date: 25-Feb-2023
  • (2023)Applying Hazard Pointers to More Concurrent Data StructuresProceedings of the 35th ACM Symposium on Parallelism in Algorithms and Architectures10.1145/3558481.3591102(213-226)Online publication date: 17-Jun-2023
  • (2023)Releasing Memory with Optimistic Access: A Hybrid Approach to Memory Reclamation and Allocation in Lock-Free ProgramsProceedings of the 35th ACM Symposium on Parallelism in Algorithms and Architectures10.1145/3558481.3591089(177-186)Online publication date: 17-Jun-2023
  • (2023)Efficient Hardware Primitives for Immediate Memory Reclamation in Optimistic Data Structures2023 IEEE International Parallel and Distributed Processing Symposium (IPDPS)10.1109/IPDPS54959.2023.00021(112-122)Online publication date: May-2023
  • (2021)Snapshot-free, transparent, and robust memory reclamation for lock-free data structuresProceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation10.1145/3453483.3454090(987-1002)Online publication date: 19-Jun-2021
  • (2021)Concurrent deferred reference counting with constant-time overheadProceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation10.1145/3453483.3454060(526-541)Online publication date: 19-Jun-2021
  • Show More Cited By

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media