A highly-efficient wait-free universal construction

Authors:

Panagiota Fatourou,

Nikolaos D. Kallimanis

SPAA '11: Proceedings of the twenty-third annual ACM symposium on Parallelism in algorithms and architectures

Pages 325 - 334

https://doi.org/10.1145/1989493.1989549

Published: 04 June 2011

Abstract

We present a new, simple wait-free universal construction, called Sim, that uses just a Fetch&Add and an LL/SC object and performs a constant number of shared memory accesses. We have implemented Sim on a real shared-memory machine. In theoretical terms, our practical version of Sim, called P-Sim, has worse complexity than its theoretical analog; in practice, though, we show experimentally that P-Sim outperforms several state-of-the-art lock-based and lock-free techniques, even though it is wait-free, i.e., it satisfies a stronger progress condition than all the algorithms it outperforms.

We have used P-Sim to obtain highly efficient wait-free implementations of stacks and queues. Our experiments show that these implementations outperform the current state-of-the-art shared stack and queue implementations, which ensure only weaker progress properties than wait-freedom.
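
The abstract names two primitives, Fetch&Add and LL/SC. The sketch below only illustrates what those primitives do; it is not the Sim or P-Sim algorithm, which the abstract does not spell out. Standard C++ has no portable LL/SC, so the `LLSCWord` class is a hypothetical emulation that pairs a 32-bit value with a 32-bit version number and uses compare-and-swap. The names `fetch_and_add`, `LLSCWord`, and `llsc_increment` are assumptions made for illustration.

```cpp
#include <atomic>
#include <cstdint>

// Fetch&Add: atomically add `delta` to `word` and return the value it held before.
inline std::uint64_t fetch_and_add(std::atomic<std::uint64_t>& word,
                                   std::uint64_t delta) {
    return word.fetch_add(delta);
}

// Hypothetical LL/SC emulation (an assumption, not the paper's object):
// the low 32 bits hold the value, the high 32 bits hold a version that is
// bumped on every successful store, so a stale link fails its SC.
// Caveat: the version wraps around after 2^32 successful stores.
class LLSCWord {
public:
    struct Link {
        std::uint64_t raw;  // snapshot of value and version taken by load_linked
        std::uint32_t value() const { return static_cast<std::uint32_t>(raw); }
    };

    explicit LLSCWord(std::uint32_t initial) : cell_(initial) {}

    Link load_linked() const { return Link{cell_.load()}; }

    // Succeeds only if no successful store happened since `link` was taken.
    bool store_conditional(Link link, std::uint32_t new_value) {
        std::uint64_t next_version = (link.raw >> 32) + 1;
        std::uint64_t desired = (next_version << 32) | new_value;
        return cell_.compare_exchange_strong(link.raw, desired);
    }

private:
    std::atomic<std::uint64_t> cell_;
};

// Retry loop built on the emulation. This loop is lock-free but NOT wait-free:
// a thread may retry indefinitely while other threads keep succeeding.
// Returns the value observed before the increment.
inline std::uint32_t llsc_increment(LLSCWord& word) {
    for (;;) {
        LLSCWord::Link link = word.load_linked();
        if (word.store_conditional(link, link.value() + 1)) {
            return link.value();
        }
    }
}
```

The plain retry loop in `llsc_increment` shows the weaker guarantee that wait-freedom improves on: a wait-free construction must bound the number of steps every thread takes, regardless of what other threads do.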

Index Terms

A highly-efficient wait-free universal construction
1. Information systems
  1. Information storage systems
    1. Record storage systems
      1. Record storage alternatives
        Linked lists
    2. Storage architectures
      1. Distributed storage

Published In

SPAA '11: Proceedings of the twenty-third annual ACM symposium on Parallelism in algorithms and architectures

June 2011, 404 pages

ISBN: 9781450307437

DOI: 10.1145/1989493

Co-chairs: Friedhelm Meyer auf der Heide (University of Paderborn, Germany), Rajmohan Rajaraman (Northeastern University, USA)

Copyright © 2011 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

In-Cooperation

EATCS: European Association for Theoretical Computer Science

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

SPAA '11: 23rd ACM Symposium on Parallelism in Algorithms and Architectures

June 4 - 6, 2011

San Jose, California, USA

Acceptance Rates

Overall Acceptance Rate 447 of 1,461 submissions, 31%
